Re: [RS6000] rs6000_rtx_costs reduce cost for SETs

2020-09-21 Thread Alan Modra via Gcc-patches
On Fri, Sep 18, 2020 at 01:13:18PM -0500, Segher Boessenkool wrote:
> Thanks (to both of you).  Interesting!  Which of these unrelated changes
> does this come from?

Most of the changes I saw in code generation (not in spec, I didn't
look there, but in gcc) came down to this change to the cost for SETs,
and "rs6000_rtx_costs multi-insn constants".  I expect they were the
changes that made most difference to spec results, with this patch
likely resulting in more if-conversion.

So here is the patch again, this time without any distracting other
changes.  With a further revised comment.

* config/rs6000/rs6000.c (rs6000_rtx_costs): Reduce cost of SET
operands.

diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index 8969baa4dcf..2d770afd8fe 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -21599,6 +21599,35 @@ rs6000_rtx_costs (rtx x, machine_mode mode, int 
outer_code,
}
   break;
 
+case SET:
+  /* On entry the value in *TOTAL is the number of general purpose
+regs being set, multiplied by COSTS_N_INSNS (1).  Handle
+costing of set operands specially since in most cases we have
+an instruction rather than just a piece of RTL and should
+return a cost comparable to insn_cost.  That's a little
+complicated because in some cases the cost of SET operands is
+non-zero, see point 5 above and cost of PLUS for example, and
+in others it is zero, for example for (set (reg) (reg)).
+But (set (reg) (reg)) has the same insn_cost as
+(set (reg) (plus (reg) (reg))).  Hack around this by
+subtracting COSTS_N_INSNS (1) from the operand cost in cases
+where we add at least COSTS_N_INSNS (1) for some operation.
+However, don't do so for constants.  Constants might cost
+more than zero when they require more than one instruction,
+and we do want the cost of extra instructions.  */
+  {
+   rtx_code src_code = GET_CODE (SET_SRC (x));
+   if (src_code == CONST_INT
+   || src_code == CONST_DOUBLE
+   || src_code == CONST_WIDE_INT)
+ return false;
+   int set_cost = (rtx_cost (SET_SRC (x), mode, SET, 1, speed)
+   + rtx_cost (SET_DEST (x), mode, SET, 0, speed));
+   if (set_cost >= COSTS_N_INSNS (1))
+ *total += set_cost - COSTS_N_INSNS (1);
+   return true;
+  }
+
 default:
   break;
 }

-- 
Alan Modra
Australia Development Lab, IBM
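
The arithmetic described in the comment of the new SET case can be modelled outside GCC. The following is only a rough sketch: model_set_cost and its parameters are hypothetical names, COSTS_N_INSNS is redefined locally to mirror GCC's scaling, and the constant path assumes that returning false lets the generic code add the operand costs in full.

```c
/* Local copy of GCC's cost scaling macro, so the sketch stands alone.  */
#define COSTS_N_INSNS(N) ((N) * 4)

/* Hypothetical model of the new SET case in rs6000_rtx_costs.
   entry_total:  value of *total on entry.
   operand_cost: summed rtx_cost of SET_SRC and SET_DEST.
   src_is_const: nonzero for CONST_INT, CONST_DOUBLE or CONST_WIDE_INT.  */
int
model_set_cost (int entry_total, int operand_cost, int src_is_const)
{
  /* Constants: the hook returns false, so (by assumption) the generic
     code adds the operand costs without any discount.  */
  if (src_is_const)
    return entry_total + operand_cost;

  /* Otherwise discount one insn when the operands already account for
     at least one insn, so the SET plus its operation is counted once.  */
  if (operand_cost >= COSTS_N_INSNS (1))
    return entry_total + operand_cost - COSTS_N_INSNS (1);

  return entry_total;
}
```

Under this model (set (reg) (reg)) and (set (reg) (plus (reg) (reg))) both come out at COSTS_N_INSNS (1), while a constant that needs an extra instruction keeps its extra cost.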


Re: New modref/ipa_modref optimization passes

2020-09-21 Thread Richard Biener
On Sun, 20 Sep 2020, Jan Hubicka wrote:

> Hi,
> this is patch I am using to fix the assumed_alias_type.f90 failure by
> simply arranging alias set 0 for the problematic array descriptor.

There's no such named testcase on trunk.  Can you be more specific
as to the problem at hand?  It looks like gfortran.dg/assumed_type_9.f90
execute FAILs at the moment.

In particular how's this not an issue w/o IPA modref?

For TYPE(*) I think the object itself cannot be accessed but for
arrays the meta-info in the array descriptor can.  Now my question
would be why the Fortran FE at the call site does not build an
appropriately typed array descriptor?

CCing the fortran list.

> I am not sure this is the best option, but I suppose it is better than
> setting all array descriptors to have same canonical type (as done by
> LTO)?
> 
> Honza
> 
>   * trans-types.c (gfc_get_array_type_bounds): Set alias set to 0 for
>   arrays of unknown element type.
> diff --git a/gcc/fortran/trans-types.c b/gcc/fortran/trans-types.c
> index 26fdb2803a7..bef3d270c06 100644
> --- a/gcc/fortran/trans-types.c
> +++ b/gcc/fortran/trans-types.c
> @@ -1903,6 +1903,12 @@ gfc_get_array_type_bounds (tree etype, int dimen, int 
> codimen, tree * lbound,
>base_type = gfc_get_array_descriptor_base (dimen, codimen, false);
>TYPE_CANONICAL (fat_type) = base_type;
>TYPE_STUB_DECL (fat_type) = TYPE_STUB_DECL (base_type);
> +  /* Arrays of unknown type must alias with all array descriptors.  */
> +  if (etype == ptr_type_node)
> +{
> +  TYPE_ALIAS_SET (base_type) = 0;
> +  TYPE_ALIAS_SET (fat_type) = 0;
> +}
>  
>tmp = TYPE_NAME (etype);
>if (tmp && TREE_CODE (tmp) == TYPE_DECL)
> 

-- 
Richard Biener 
SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
Germany; GF: Felix Imend


Re: [PATCH] CSE negated multiplications and divisions

2020-09-21 Thread Richard Biener
On Fri, 18 Sep 2020, Segher Boessenkool wrote:

> Hi!
> 
> On Thu, Sep 17, 2020 at 01:20:35PM +0200, Richard Biener wrote:
> > This adds the capability to look for available negated multiplications
> > and divisions, replacing them with cheaper negates.
> 
> It is longer latency than the original insns.

If there's sufficient compute resources yes, it might be.  But only
if the un-CSEd instructions are close together.  If the first multiply
is already done the negate will be faster than another multiply.

> Combine will try to undo
> this, because of that (it depends on the insn costs if that can
> succeed).

That's fine I guess.

> On gimple it is always cheaper, of course.

Yep, and we'll also hope for a followup transform that will eat the
negate and combine it with a followup transform.

Richard.
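
For illustration only, the source-level shape of the transform under discussion might look like the sketch below. The function names are made up, and the equality relies on round-to-nearest IEEE arithmetic, where negating an operand only flips the sign of the product.

```c
/* Un-CSEd form: a second multiplication computes the negated product.  */
double
neg_mul_direct (double a, double b)
{
  return (-a) * b;
}

/* CSEd form: PROD is assumed to be an already available a * b, so only
   a negate is needed.  */
double
neg_mul_reuse (double prod)
{
  return -prod;
}
```

When a * b has already been computed, neg_mul_reuse replaces a multiply by a negate, which is the case where the negate is expected to be cheaper.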


Re: PING [Patch][Middle-end]Add -fzero-call-used-regs=[skip|used-gpr|all-gpr|used|all]

2020-09-21 Thread Richard Sandiford
Qing Zhao  writes:
> Hi, Richard,
>
> During my implementation of the new version of the patch. I still feel that 
> it’s not practical to add a default definition in the middle end to just use 
> move patterns to zero each selected register. 
>
> The major issues are:
>
> There are some target specific information on how to define “general 
> register” set and “all register” set,  we have to add a new specific target 
> hook to get such target specific information and pass to middle-end. 

GENERAL_REGS and ALL_REGS are already concepts that target-independent
code knows about though.  I think the non-fixed subsets of those would
make good starting sets, which the target could whittle down if it wanted
or needed to.

> For example, on X86, for CALL_USED_REGISTERS, we have:
>
> #define CALL_USED_REGISTERS \
> /*ax,dx,cx,bx,si,di,bp,sp,st,st1,st2,st3,st4,st5,st6,st7*/  \
> {  1, 1, 1, 0, 4, 4, 0, 1, 1,  1,  1,  1,  1,  1,  1,  1,   \
> /*arg,flags,fpsr,frame*/\
> 1,   1,1,1, \
> /*xmm0,xmm1,xmm2,xmm3,xmm4,xmm5,xmm6,xmm7*/ \
>  1,   1,   1,   1,   1,   1,   6,   6,  \
> /* mm0, mm1, mm2, mm3, mm4, mm5, mm6, mm7*/   
>
>
> From the above, we can see “st0 to st7” are call_used_registers for x86, 
> however, we should not zero these registers on x86. 
>
> Such details is only known by x86 backend. 
>
> I guess that other platforms might have similar issue. 

They might, but that doesn't disprove that there's a sensible default
choice that works for most targets.

FWIW, stack registers themselves are already exposed outside targets
(see reg-stack.c, although since x86 is the only port that uses it,
the main part of it is effectively target-dependent at the moment).
Similarly for register windows.

> If we still want  a default definition in middle end to generate the zeroing 
> insn for selected registers, I have to add another target hook, say, 
> “ZERO_CALL_USED_REGNO_P(REGNO, GPR_ONLY)” to check whether a register should 
> be zeroed based on gpr_only (general register only)  and target specific 
> decision.   I will provide a x86 implementation for this target hook in this 
> patch. 
>
> Other targets have to implement this new target hook to utilize the default 
> handler. 
>
> Let me know your opinion:
>
> A.  Will not provide default definition in middle end to generate the zeroing 
> insn for selected registers.  Move the generation work all to target; X86 
> implementation will be provided;
>
> OR:
>
> B.  Will provide a default definition in middle end to generate the zeroing 
> insn for selected registers. Then need to add a new target hook 
> “ZERO_CALL_USED_REGNO_P(REGNO, GPR_ONLY)”, same as A, X86 implementation will 
> be provided in my patch. 

The kind of target hook interface I was thinking of was:

  HARD_REG_SET TARGET_EMIT_MOVE_ZEROS (const HARD_REG_SET &regs)

which:

- emits zeroing instructions for some target-specific subset of REGS

- returns the set of registers that were actually cleared

The default implementation would clear all registers in REGS,
using reg_raw_mode[R] as the mode for register R.  Targets could
then override the hook and:

- drop registers that shouldn't be cleared

- handle some or all of the remaining registers in a more optimal,
  target-specific way

The targets could then use the default implementation of the hook
to handle any residue.  E.g. the default implementation would be
able to handle general registers on x86.
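
As a rough sketch of that division of labour (not GCC code): reg_set below is a hypothetical 32-bit stand-in for HARD_REG_SET, the register numbers and printed mnemonics are invented, and both functions only print what they would emit.

```c
#include <stdint.h>
#include <stdio.h>

typedef uint32_t reg_set;	/* hypothetical stand-in for HARD_REG_SET */

enum { REG_AX, REG_DX, REG_CX, REG_ST0, REG_ST1, REG_XMM0, REG_XMM1 };

#define BIT(R) ((reg_set) 1 << (R))
#define X87_REGS (BIT (REG_ST0) | BIT (REG_ST1))
#define SSE_REGS (BIT (REG_XMM0) | BIT (REG_XMM1))

/* Default hook: clear every register in REGS with a plain move and
   report all of them as cleared.  */
reg_set
default_emit_move_zeros (reg_set regs)
{
  for (int r = 0; r < 32; r++)
    if (regs & BIT (r))
      printf ("mov r%d, 0\n", r);
  return regs;
}

/* Target override: drop registers that must not be cleared, handle some
   in a target-specific way, and leave the residue to the default.  */
reg_set
x86like_emit_move_zeros (reg_set regs)
{
  regs &= ~X87_REGS;		/* never zero the x87 stack registers */

  reg_set sse = regs & SSE_REGS;
  for (int r = 0; r < 32; r++)
    if (sse & BIT (r))
      printf ("pxor r%d, r%d\n", r, r);

  return sse | default_emit_move_zeros (regs & ~sse);
}
```

The return value is what lets the caller see that fewer registers were cleared than requested, for example when the x87 registers are dropped.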

Thanks,
Richard


Re: [PATCH] vect/test: Don't check for epilogue loop [PR97075]

2020-09-21 Thread Andrea Corallo
Richard Sandiford  writes:
[...]
> Andrea, how should we handle this?  Is it something you'd have time to
> look at?

Hi Richard,

I've not but FWIW your observations here and on today's mail make a lot
of sense to me.  We maybe want to install Kewen's fix anyway while we
rework this logic?

  Andrea


Re: New modref/ipa_modref optimization passes

2020-09-21 Thread Jan Hubicka
> On Sun, 20 Sep 2020, Jan Hubicka wrote:
> 
> > Hi,
> > this is patch I am using to fix the assumed_alias_type.f90 failure by
> > simply arranging alias set 0 for the problematic array descriptor.
> 
> There's no such named testcase on trunk.  Can you be more specific
> as to the problem at hand?  It looks like gfortran.dg/assumed_type_9.f90
> execute FAILs at the moment.
> 
> In particular how's this not an issue w/o IPA modref?

> 
> For TYPE(*) I think the object itself cannot be accessed but for
> arrays the meta-info in the array descriptor can.  Now my question
> would be why the Fortran FE at the call site does not build an
> appropriately typed array descriptor?
> 
> CCing the fortran list.

The problem is:

alsize (struct array15_unknown & restrict a)
{
...
  _2 = *a_13(D).dtype.rank;
  _3 = (integer(kind=8)) _2;
...
}
}
and in main:

  struct array02_integer(kind=4) am;
   :
  MEM  [(struct dtype_type *)&am + 24B] = {};
  am.dtype.elem_len = 4;
  am.dtype.rank = 2;
  am.dtype.type = 1;
...
  _52 = alsize (&am);

Here array15_unknown and array02_integer are different structures with
different canonical types and thus we end up disambiguating the accesses
via base alias sets.

My understanding is that this _unknown array descriptor is supposed to
be universal and work with all kinds of arrays.

Without modref this works because alsize is not inlined (we think code
size would grow). Forcing the inliner to inline still leads to working code
because we first constant propagate the pointer and then we see accesses
from the same base DECL and thus bypass the TBAA checks.  Disabling the
constant propagation leads to wrong code as well.

Honza


Re: New modref/ipa_modref optimization passes

2020-09-21 Thread Richard Biener
On Mon, 21 Sep 2020, Jan Hubicka wrote:

> > On Sun, 20 Sep 2020, Jan Hubicka wrote:
> > 
> > > Hi,
> > > this is patch I am using to fix the assumed_alias_type.f90 failure by
> > > simply arranging alias set 0 for the problematic array descriptor.
> > 
> > There's no such named testcase on trunk.  Can you be more specific
> > as to the problem at hand?  It looks like gfortran.dg/assumed_type_9.f90
> > execute FAILs at the moment.
> > 
> > In particular how's this not an issue w/o IPA modref?
> 
> > 
> > For TYPE(*) I think the object itself cannot be accessed but for
> > arrays the meta-info in the array descriptor can.  Now my question
> > would be why the Fortran FE at the call site does not build an
> > appropriately typed array descriptor?
> > 
> > CCing the fortran list.
> 
> The problem is:
> 
> alsize (struct array15_unknown & restrict a)
> {
> ...
>   _2 = *a_13(D).dtype.rank;
>   _3 = (integer(kind=8)) _2;
> ...
> }
> }
> and in main:
> 
>   struct array02_integer(kind=4) am;
>:
>   MEM  [(struct dtype_type *)&am + 24B] = {};
>   am.dtype.elem_len = 4;
>   am.dtype.rank = 2;
>   am.dtype.type = 1;
> ...
>   _52 = alsize (&am);
> 
> Here array15_unknown and array02_integer are different structures with
> different canonical types and thus we end up disambiguating the accesses
> via base alias sets.
> 
> My understanding is that this _unknown array descriptor is supposed to
> be universal and work with all kinds of arrays.

But the FE builds a new descriptor for each individual call and thus
should build a universal descriptor for a call to an universal
descriptor argument.

Richard.

> Without modref this works because alsize is not inlined (we think code
> size would grow). Forcing the inliner to inline still leads to working code
> because we first constant propagate the pointer and then we see accesses
> from the same base DECL and thus bypass the TBAA checks.  Disabling the
> constant propagation leads to wrong code as well.
> 
> Honza
> 

-- 
Richard Biener 
SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
Germany; GF: Felix Imend


Re: [COMMITTED] config: Sync largefile.m4 from binutils-gdb

2020-09-21 Thread Martin Liška

On 9/8/20 11:22 AM, Rainer Orth wrote:

How am I supposed to install a ChangeLog entry like the one below?  The
format is analogous to the one used for backports.  Martin?


Hello.

Currently we do not allow a changelog entry that prepends file changes.
You can put the:
'''
Sync from binutils-gdb.
2020-07-30  Rainer Orth  
'''

just to the commit message.

Martin


Re: New modref/ipa_modref optimization passes

2020-09-21 Thread Jan Hubicka
> > 
> > The problem is:
> > 
> > alsize (struct array15_unknown & restrict a)
> > {
> > ...
> >   _2 = *a_13(D).dtype.rank;
> >   _3 = (integer(kind=8)) _2;
> > ...
> > }
> > }
> > and in main:
> > 
> >   struct array02_integer(kind=4) am;
> >:
> >   MEM  [(struct dtype_type *)&am + 24B] = {};
> >   am.dtype.elem_len = 4;
> >   am.dtype.rank = 2;
> >   am.dtype.type = 1;
> > ...
> >   _52 = alsize (&am);
> > 
> > Here array15_unknown and array02_integer are different structures with
> > different canonical types and thus we end up disambiguating the accesses
> > via base alias sets.
> > 
> > My understanding is that this _unknown array descriptor is supposed to
> > be universal and work with all kinds of arrays.
> 
> But the FE builds a new descriptor for each individual call and thus
> should build a universal descriptor for a call to an universal
> descriptor argument.

I see, so you would expect call to alsize to initialize things in
array15_unknown type?  That would work too.

Honza


Re: [RFC Patch] mklog.py: Parse first 10 lines for PR/DR number

2020-09-21 Thread Martin Liška

On 9/8/20 6:20 PM, Tobias Burnus wrote:

However, the new version of the patch stops after the first
'dg-error/dg-warning'.


LGTM.

Martin


Fix sse2-andnpd-1.c, avx-vandnps-1.c and sse-andnps-1.c testcases

2020-09-21 Thread Jan Hubicka
Hi,
these testcases now fail because they contain invalid type punning
that happens via the const VALUE_TYPE *v pointer. Since the check function
is noinline, modref is needed to trigger the wrong code.
I think it is easiest to fix it by no-strict-aliasing.

Regtested x86_64-linux, OK?

* gcc.target/i386/m128-check.h: Add no-strict aliasing to
CHECK_EXP macro.

diff --git a/gcc/testsuite/gcc.target/i386/m128-check.h 
b/gcc/testsuite/gcc.target/i386/m128-check.h
index 48b23328539..6f414b07be7 100644
--- a/gcc/testsuite/gcc.target/i386/m128-check.h
+++ b/gcc/testsuite/gcc.target/i386/m128-check.h
@@ -78,6 +78,7 @@ typedef union
 
 #define CHECK_EXP(UINON_TYPE, VALUE_TYPE, FMT) \
 static int \
+__attribute__((optimize ("no-strict-aliasing")))   \
 __attribute__((noinline, unused))  \
 check_##UINON_TYPE (UINON_TYPE u, const VALUE_TYPE *v) \
 {  \
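
For comparison only, and not what this patch does: a check helper can avoid the aliasing question altogether by copying the bytes it is handed before comparing. In the sketch below, union128_sketch and check_union128_sketch are hypothetical names loosely modelled on the header; the pointer is taken as const void * so the caller's object type does not matter.

```c
#include <string.h>

/* Hypothetical union, loosely modelled on the ones in m128-check.h.  */
typedef union
{
  float a[4];
  unsigned int u[4];
} union128_sketch;

/* Return 0 if the four floats stored at V match U, 1 otherwise.  The
   bytes at V are copied with memcpy, which is valid whatever the
   declared type of the pointed-to object is.  */
int
check_union128_sketch (union128_sketch u, const void *v)
{
  float tmp[4];
  int err = 0;

  memcpy (tmp, v, sizeof tmp);
  for (int i = 0; i < 4; i++)
    if (u.a[i] != tmp[i])
      err = 1;
  return err;
}
```

The attribute used in the patch is the smaller change for an existing testsuite header; the memcpy form is just the usual way to write such a comparison without relying on -fno-strict-aliasing.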


Re: New modref/ipa_modref optimization passes

2020-09-21 Thread Richard Biener
On Mon, 21 Sep 2020, Jan Hubicka wrote:

> > > 
> > > The problem is:
> > > 
> > > alsize (struct array15_unknown & restrict a)
> > > {
> > > ...
> > >   _2 = *a_13(D).dtype.rank;
> > >   _3 = (integer(kind=8)) _2;
> > > ...
> > > }
> > > }
> > > and in main:
> > > 
> > >   struct array02_integer(kind=4) am;
> > >:
> > >   MEM  [(struct dtype_type *)&am + 24B] = {};
> > >   am.dtype.elem_len = 4;
> > >   am.dtype.rank = 2;
> > >   am.dtype.type = 1;
> > > ...
> > >   _52 = alsize (&am);
> > > 
> > > Here array15_unknown and array02_integer are different structures with
> > > different canonical types and thus we end up disambiguating the accesses
> > > via base alias sets.
> > > 
> > > My understanding is that this _unknown array descriptor is supposed to
> > > be universal and work with all kinds of arrays.
> > 
> > But the FE builds a new descriptor for each individual call and thus
> > should build a universal descriptor for a call to an universal
> > descriptor argument.
> 
> I see, so you would expect call to alsize to initialize things in
> array15_unknown type?  That would work too.

Yes, that's my expectation.  But let's see what fortran folks say.

Richard.


Re: Fix sse2-andnpd-1.c, avx-vandnps-1.c and sse-andnps-1.c testcases

2020-09-21 Thread Richard Biener
On Mon, 21 Sep 2020, Jan Hubicka wrote:

> Hi,
> these testcases now fail because they contain invalid type punning
> that happens via the const VALUE_TYPE *v pointer. Since the check function
> is noinline, modref is needed to trigger the wrong code.
> I think it is easiest to fix it by no-strict-aliasing.
> 
> Regtested x86_64-linux, OK?

OK.

>   * gcc.target/i386/m128-check.h: Add no-strict aliasing to
>   CHECK_EXP macro.
> 
> diff --git a/gcc/testsuite/gcc.target/i386/m128-check.h 
> b/gcc/testsuite/gcc.target/i386/m128-check.h
> index 48b23328539..6f414b07be7 100644
> --- a/gcc/testsuite/gcc.target/i386/m128-check.h
> +++ b/gcc/testsuite/gcc.target/i386/m128-check.h
> @@ -78,6 +78,7 @@ typedef union
>  
>  #define CHECK_EXP(UINON_TYPE, VALUE_TYPE, FMT)   \
>  static int   \
> +__attribute__((optimize ("no-strict-aliasing"))) \
>  __attribute__((noinline, unused))\
>  check_##UINON_TYPE (UINON_TYPE u, const VALUE_TYPE *v)   \
>  {\
> 

-- 
Richard Biener 
SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
Germany; GF: Felix Imend


Re: [PATCH v2 1/2] IFN: Implement IFN_VEC_SET for ARRAY_REF with VIEW_CONVERT_EXPR

2020-09-21 Thread Richard Biener via Gcc-patches
On Fri, Sep 18, 2020 at 8:18 AM Xiong Hu Luo  wrote:
>
> This patch enables transformation from ARRAY_REF(VIEW_CONVERT_EXPR) to
> VEC_SET internal function in gimple-isel pass if target supports
> vec_set with variable index by checking can_vec_set_var_idx_p.
>
> gcc/ChangeLog:
>
> 2020-09-18  Xionghu Luo  
>
> * gimple-isel.cc (gimple_expand_vec_set_expr): New function.
> (gimple_expand_vec_cond_exprs): Call gimple_expand_vec_set_expr.
> * internal-fn.c (vec_set_direct): New define.
> (expand_vec_set_optab_fn): New function.
> (direct_vec_set_optab_supported_p): New define.
> * internal-fn.def (VEC_SET): New DEF_INTERNAL_OPTAB_FN.
> * optabs.c (can_vec_set_var_idx_p): New function.
> * optabs.h (can_vec_set_var_idx_p): New declare.
> ---
>  gcc/gimple-isel.cc  | 116 +++-
>  gcc/internal-fn.c   |  36 ++
>  gcc/internal-fn.def |   2 +
>  gcc/optabs.c|  17 +++
>  gcc/optabs.h|   3 ++
>  5 files changed, 172 insertions(+), 2 deletions(-)
>
> diff --git a/gcc/gimple-isel.cc b/gcc/gimple-isel.cc
> index b330cf4c20e..bc61e2895be 100644
> --- a/gcc/gimple-isel.cc
> +++ b/gcc/gimple-isel.cc
> @@ -35,6 +35,80 @@ along with GCC; see the file COPYING3.  If not see
>  #include "tree-cfg.h"
>  #include "bitmap.h"
>  #include "tree-ssa-dce.h"
> +#include "fold-const.h"
> +#include "gimple-fold.h"
> +#include "memmodel.h"
> +#include "optabs.h"
> +
> +/* Expand all ARRAY_REF(VIEW_CONVERT_EXPR) gimple assignments into calls to
> +   internal function based on vector type of selected expansion.
> +   i.e.:
> + VIEW_CONVERT_EXPR(u)[_1] =  = i_4(D);
> +   =>
> + _7 = u;
> + _8 = .VEC_SET (_7, i_4(D), _1);
> + u = _8;  */
> +
> +static gimple *
> +gimple_expand_vec_set_expr (gimple_stmt_iterator *gsi)
> +{
> +  enum tree_code code;
> +  gcall *new_stmt = NULL;
> +  gassign *ass_stmt = NULL;
> +
> +  /* Only consider code == GIMPLE_ASSIGN.  */
> +  gassign *stmt = dyn_cast <gassign *> (gsi_stmt (*gsi));
> +  if (!stmt)
> +return NULL;
> +
> +  code = TREE_CODE (gimple_assign_lhs (stmt));

do the lhs = gimple_assign_lhs (stmt) before and elide cond,
putting the TREE_CODE into the if below.

> +  if (code != ARRAY_REF)
> +return NULL;
> +
> +  tree lhs = gimple_assign_lhs (stmt);
> +  tree val = gimple_assign_rhs1 (stmt);
> +
> +  tree type = TREE_TYPE (lhs);
> +  tree op0 = TREE_OPERAND (lhs, 0);
> +  if (TREE_CODE (op0) == VIEW_CONVERT_EXPR

So I think we want to have an exact structural match first here, so

  if (TREE_CODE (op0) == VIEW_CONVERT_EXPR
  && DECL_P (TREE_OPERAND (op0, 0))
  && VECTOR_TYPE_P (TREE_TYPE (TREE_OPERAND (op0, 0)))
  && TYPE_MODE (TREE_TYPE (lhs))
     == TYPE_MODE (TREE_TYPE (TREE_TYPE (TREE_OPERAND (op0, 0)))))

which means we're sure to do an element extract from a vector type
(and we know all vector types have sane element types).


> +  && tree_fits_uhwi_p (TYPE_SIZE (type)))
> +{
> +  tree pos = TREE_OPERAND (lhs, 1);
> +  tree view_op0 = TREE_OPERAND (op0, 0);
> +  machine_mode outermode = TYPE_MODE (TREE_TYPE (view_op0));
> +  scalar_mode innermode = GET_MODE_INNER (outermode);
> +  tree_code code = TREE_CODE (TREE_TYPE(view_op0));
> +  if (!is_global_var (view_op0) && code == VECTOR_TYPE
> + && tree_fits_uhwi_p (TYPE_SIZE (TREE_TYPE (view_op0)))

why did you need those TYPE_SIZE checks?  As said earlier
you want !TREE_ADDRESSABLE (view_op0) and eventually
the stronger auto_var_in_fn_p (view_op0, cfun) rather than !is_global_var.

> + && can_vec_set_var_idx_p (code, outermode, innermode,
> +   TYPE_MODE (TREE_TYPE (pos
> +   {
> + location_t loc = gimple_location (stmt);
> + tree var_src = make_ssa_name (TREE_TYPE (view_op0));
> + tree var_dst = make_ssa_name (TREE_TYPE (view_op0));
> +
> + ass_stmt = gimple_build_assign (var_src, view_op0);
> + gimple_set_vuse (ass_stmt, gimple_vuse (stmt));
> + gimple_set_location (ass_stmt, loc);
> + gsi_insert_before (gsi, ass_stmt, GSI_SAME_STMT);
> +
> + new_stmt
> +   = gimple_build_call_internal (IFN_VEC_SET, 3, var_src, val, pos);
> + gimple_call_set_lhs (new_stmt, var_dst);
> + gimple_set_location (new_stmt, loc);
> + gsi_insert_before (gsi, new_stmt, GSI_SAME_STMT);
> +
> + ass_stmt = gimple_build_assign (view_op0, var_dst);
> + gimple_set_location (ass_stmt, loc);
> + gsi_insert_before (gsi, ass_stmt, GSI_SAME_STMT);
> +
> + gimple_move_vops (ass_stmt, stmt);
> + gsi_remove (gsi, true);
> +   }
> +}
> +
> +  return ass_stmt;
> +}
>
>  /* Expand all VEC_COND_EXPR gimple assignments into calls to internal
> function based on type of selected expansion.  */
> @@ -187,8 +261,25 @@ gimple_expand_vec_cond_exprs (void)
>  {
>for (gsi = gsi_start_bb (bb); !gsi_e
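
For context, a source-level construct of the kind this transform is aimed at could look like the sketch below. It uses GCC's generic vector extension; set_element is a made-up name, and the assumption here is that a variable-index store into a local vector is what shows up in GIMPLE as an ARRAY_REF of a VIEW_CONVERT_EXPR.

```c
/* Four-element int vector via GCC's vector_size extension.  */
typedef int v4si __attribute__ ((vector_size (16)));

/* Store VALUE into element N of U, where N is only known at run time.
   The index is masked so that it always stays within the vector.  */
v4si
set_element (v4si u, int value, unsigned int n)
{
  u[n & 3] = value;
  return u;
}
```

On a target with a variable-index vec_set pattern, the intent of the patch is that the store becomes a .VEC_SET call instead of going through memory.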

Re: [PATCH] vect/test: Don't check for epilogue loop [PR97075]

2020-09-21 Thread Richard Sandiford
Andrea Corallo  writes:
> Richard Sandiford  writes:
> [...]
>> Andrea, how should we handle this?  Is it something you'd have time to
>> look at?
>
> Hi Richard,
>
> I've not

OK, NP.  In that case I'll give it a go.

> but FWIW your observations here and on today's mail make a lot
> of sense to me.  We maybe want to install Kewen's fix anyway while we
> rework this logic?

I think that would just be trading one problem for another though.

I'll try to have a patch ready tomorrow morning European time.

Thanks,
Richard


Re: [PATCH V2] aarch64: Fix ICE on fpsr fpcr getters [PR96968]

2020-09-21 Thread Andrea Corallo
Richard Sandiford  writes:

> Richard Sandiford  writes:
>>> @@ -2034,6 +2034,18 @@ aarch64_expand_fpsr_fpcr_setter (int unspec, 
>>> machine_mode mode, tree exp)
>>>emit_insn (gen_aarch64_set (unspec, mode, op));
>>>  }
>>>  
>>> +/* Expand a fpsr or fpcr getter (depending on UNSPEC) using MODE.
>>> +   Return the target.  */
>>> +static rtx
>>> +aarch64_expand_fpsr_fpcr_getter (enum insn_code icode, machine_mode mode,
>>> +rtx target)
>>> +{
>>> +  expand_operand op;
>>> +  create_output_operand (&op, target, mode);
>>> +  expand_insn (icode, 1, &op);
>>> +  return target;
>>
>> This needs to be:
>>
>>   return op[0].value;
>
> Er, of course I mean op.value.  Muscle memory, sorry. :-)
>
>>
>> so that we use whatever target the expand machinery chose.
>>
>> OK with that change, thanks.
>>
>> Richard

Installed in trunk as f5e73de00e9.

Thanks!

  Andrea


Re: [PATCH] Add if-chain to switch conversion pass.

2020-09-21 Thread Martin Liška

PING^1

On 9/2/20 1:53 PM, Martin Liška wrote:

On 9/1/20 4:50 PM, David Malcolm wrote:

Hope this is constructive
Dave


Thank you David. All of them very very useful!

There's updated version of the patch.
Martin




[r11-3308 Regression] FAIL: gcc.target/i386/avx-vandnps-1.c execution test on Linux/x86_64 (-m64)

2020-09-21 Thread sunil.k.pandey via Gcc-patches
On Linux/x86_64,

d119f34c952f8718fdbabc63e2f369a16e92fa07 is the first bad commit
commit d119f34c952f8718fdbabc63e2f369a16e92fa07
Author: Jan Hubicka 
Date:   Sun Sep 20 07:25:16 2020 +0200

New modref/ipa_modref optimization passes

caused

FAIL: gcc.target/i386/avx-vandnps-1.c execution test

with GCC configured with

../../gcc/configure 
--prefix=/local/skpandey/gccwork/toolwork/gcc-bisect-master/master/r11-3308/usr 
--enable-clocale=gnu --with-system-zlib --with-demangler-in-ld 
--with-fpmath=sse --enable-languages=c,c++,fortran --enable-cet --without-isl 
--enable-libmpx x86_64-linux --disable-bootstrap

To reproduce:

$ cd {build_dir}/gcc && make check 
RUNTESTFLAGS="i386.exp=gcc.target/i386/avx-vandnps-1.c 
--target_board='unix{-m64}'"

(Please do not reply to this email, for question about this report, contact me 
at skpgkp2 at gmail dot com)


[r11-3308 Regression] FAIL: gcc.target/i386/avx-vandnpd-1.c execution test on Linux/x86_64 (-m64)

2020-09-21 Thread sunil.k.pandey via Gcc-patches
On Linux/x86_64,

d119f34c952f8718fdbabc63e2f369a16e92fa07 is the first bad commit
commit d119f34c952f8718fdbabc63e2f369a16e92fa07
Author: Jan Hubicka 
Date:   Sun Sep 20 07:25:16 2020 +0200

New modref/ipa_modref optimization passes

caused

FAIL: gcc.target/i386/avx-vandnpd-1.c execution test

with GCC configured with

../../gcc/configure 
--prefix=/local/skpandey/gccwork/toolwork/gcc-bisect-master/master/r11-3308/usr 
--enable-clocale=gnu --with-system-zlib --with-demangler-in-ld 
--with-fpmath=sse --enable-languages=c,c++,fortran --enable-cet --without-isl 
--enable-libmpx x86_64-linux --disable-bootstrap

To reproduce:

$ cd {build_dir}/gcc && make check 
RUNTESTFLAGS="i386.exp=gcc.target/i386/avx-vandnpd-1.c 
--target_board='unix{-m64}'"

(Please do not reply to this email, for question about this report, contact me 
at skpgkp2 at gmail dot com)


[PATCH] aarch64: Do not alter value on a force_reg returned rtx expanding __jcvt

2020-09-21 Thread Andrea Corallo
Hi all,

From the `force_reg` description comment I see the returned register
should not be modified, thus IIUC should not be used as a GEN_FCN
target.

Assuming my interpretation is correct, this fixes the case inside
`aarch64_general_expand_builtin` while expanding the
`__jcvt` intrinsic.  If that is not the case, please discard.

Regtested and bootstrapped on aarch64-linux-gnu.

  Andrea

From 403ad66b8f9c108d7f38b406ed1afcb603b7e25f Mon Sep 17 00:00:00 2001
From: Andrea Corallo 
Date: Thu, 17 Sep 2020 17:17:52 +0100
Subject: [PATCH] aarch64: Do not alter value on a force_reg returned rtx
 expanding __jcvt

2020-09-17  Andrea Corallo  

* config/aarch64/aarch64-builtins.c
(aarch64_general_expand_builtin): Use expand machinery not to
alter the value of an rtx returned by force_reg.
---
 gcc/config/aarch64/aarch64-builtins.c | 16 
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/gcc/config/aarch64/aarch64-builtins.c 
b/gcc/config/aarch64/aarch64-builtins.c
index 4f33dd936c7..b787719cf5e 100644
--- a/gcc/config/aarch64/aarch64-builtins.c
+++ b/gcc/config/aarch64/aarch64-builtins.c
@@ -2128,14 +2128,14 @@ aarch64_general_expand_builtin (unsigned int fcode, 
tree exp, rtx target,
   return target;
 
 case AARCH64_JSCVT:
-  arg0 = CALL_EXPR_ARG (exp, 0);
-  op0 = force_reg (DFmode, expand_normal (arg0));
-  if (!target)
-   target = gen_reg_rtx (SImode);
-  else
-   target = force_reg (SImode, target);
-  emit_insn (GEN_FCN (CODE_FOR_aarch64_fjcvtzs) (target, op0));
-  return target;
+  {
+   expand_operand ops[2];
+   create_output_operand (&ops[0], target, SImode);
+   op0 = expand_normal (CALL_EXPR_ARG (exp, 0));
+   create_input_operand (&ops[1], op0, DFmode);
+   expand_insn (CODE_FOR_aarch64_fjcvtzs, 2, ops);
+   return ops[0].value;
+  }
 
 case AARCH64_SIMD_BUILTIN_FCMLA_LANEQ0_V2SF:
 case AARCH64_SIMD_BUILTIN_FCMLA_LANEQ90_V2SF:
-- 
2.17.1



[PATCH] POLY_INT_CST: remove extra space in dump

2020-09-21 Thread Martin Liška

Installing as obvious.

Before:

(gdb) p debug_tree(m_index_expr)
 
unit-size 
align:64 warn_if_not_align:0 symtab:0 alias-set -1 canonical-type 0x775ec7e0 
precision:64 min  max 
pointer_to_this >
constant
elt0:   
constant 2> elt1:  >

After:

(gdb) p debug_tree(m_index_expr )
 
unit-size 
align:64 warn_if_not_align:0 symtab:0 alias-set -1 canonical-type 0x775ec7e0 
precision:64 min  max 
pointer_to_this >
constant
elt0:  
constant 2> elt1: >

Thanks,
Martin

gcc/ChangeLog:

* print-tree.c (print_node): Remove extra space.
---
 gcc/print-tree.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/print-tree.c b/gcc/print-tree.c
index 2a9c98ea7a0..d1150e472d5 100644
--- a/gcc/print-tree.c
+++ b/gcc/print-tree.c
@@ -851,7 +851,7 @@ print_node (FILE *file, const char *prefix, tree node, int 
indent,
char buf[10];
for (unsigned int i = 0; i < NUM_POLY_INT_COEFFS; ++i)
  {
-   snprintf (buf, sizeof (buf), "elt%u: ", i);
+   snprintf (buf, sizeof (buf), "elt%u:", i);
print_node (file, buf, POLY_INT_CST_COEFF (node, i),
indent + 4);
  }
--
2.28.0
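
A tiny standalone sketch of what the one-character change does to the prefix buffer. format_elt_prefix is a hypothetical helper, not part of print-tree.c; the presumption is that print_node already puts a separator after the prefix it is given, which is why the trailing space showed up doubled.

```c
#include <stdio.h>

/* Fill BUF with the element prefix for coefficient I, with or without
   the trailing space that the patch removes.  */
void
format_elt_prefix (char *buf, size_t size, unsigned int i, int trailing_space)
{
  if (trailing_space)
    snprintf (buf, size, "elt%u: ", i);	/* old format string */
  else
    snprintf (buf, size, "elt%u:", i);	/* new format string */
}
```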



[PATCH] Fix ICE in tree-switch-conversion.

2020-09-21 Thread Martin Liška

With SVE we can end up with:
switch (POLY_INT_CST [2, 2])  [INV], case 2:  [INV], case 4: 
 [INV]>
which is fine to expand and we can remove the assert.

Ready to be installed?
Thanks,
Martin

gcc/ChangeLog:

PR tree-optimization/96915
* tree-switch-conversion.c (switch_conversion::expand): Accept
also integer constants.

gcc/testsuite/ChangeLog:

PR tree-optimization/96915
* gcc.target/aarch64/sve/pr96915.c: New test.
---
 gcc/testsuite/gcc.target/aarch64/sve/pr96915.c | 11 +++
 gcc/tree-switch-conversion.c   |  3 ---
 2 files changed, 11 insertions(+), 3 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/pr96915.c

diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pr96915.c 
b/gcc/testsuite/gcc.target/aarch64/sve/pr96915.c
new file mode 100644
index 000..fae4cd42117
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sve/pr96915.c
@@ -0,0 +1,11 @@
+/* PR tree-optimization/96915 */
+/* { dg-do compile } */
+/* { dg-options "-O3 -march=armv8.2-a+sve" } */
+
+#pragma GCC aarch64 "arm_sve.h"
+void b() {
+  switch (svcntd())
+  case 2:
+  case 4:
+b();
+}
diff --git a/gcc/tree-switch-conversion.c b/gcc/tree-switch-conversion.c
index 4b435941d12..186411ff3c4 100644
--- a/gcc/tree-switch-conversion.c
+++ b/gcc/tree-switch-conversion.c
@@ -984,9 +984,6 @@ switch_conversion::expand (gswitch *swtch)
  during gimplification).  */
   gcc_checking_assert (TREE_TYPE (m_index_expr) != error_mark_node);
 
-  /* A switch on a constant should have been optimized in tree-cfg-cleanup.  */
-  gcc_checking_assert (!TREE_CONSTANT (m_index_expr));
-
   /* Prefer bit test if possible.  */
   if (tree_fits_uhwi_p (m_range_size)
   && bit_test_cluster::can_be_handled (tree_to_uhwi (m_range_size), m_uniq)
--
2.28.0



Re: [PATCH] aarch64: Do not alter value on a force_reg returned rtx expanding __jcvt

2020-09-21 Thread Richard Sandiford
Andrea Corallo  writes:
> Hi all,
>
> From the `force_reg` description comment I see the returned register
> should not be modified, thus IIUC should not be used as a GEN_FCN
> target.
>
> Assuming my interpretation is correct, this fixes this case inside
> `aarch64_general_expand_builtin` while expanding the
> `__jcvt` intrinsic.  If that is not the case, please discard.

Good catch.

> Regtested and bootstrapped on aarch64-linux-gnu.
>
>   Andrea
>
> From 403ad66b8f9c108d7f38b406ed1afcb603b7e25f Mon Sep 17 00:00:00 2001
> From: Andrea Corallo 
> Date: Thu, 17 Sep 2020 17:17:52 +0100
> Subject: [PATCH] aarch64: Do not alter value on a force_reg returned rtx
>  expanding __jcvt
>
> 2020-09-17  Andrea Corallo  
>
>   * config/aarch64/aarch64-builtins.c
>   (aarch64_general_expand_builtin): Use expand machinery not to
>   alter the value of an rtx returned by force_reg.

OK, thanks.

Richard

> ---
>  gcc/config/aarch64/aarch64-builtins.c | 16 
>  1 file changed, 8 insertions(+), 8 deletions(-)
>
> diff --git a/gcc/config/aarch64/aarch64-builtins.c 
> b/gcc/config/aarch64/aarch64-builtins.c
> index 4f33dd936c7..b787719cf5e 100644
> --- a/gcc/config/aarch64/aarch64-builtins.c
> +++ b/gcc/config/aarch64/aarch64-builtins.c
> @@ -2128,14 +2128,14 @@ aarch64_general_expand_builtin (unsigned int fcode, 
> tree exp, rtx target,
>return target;
>  
>  case AARCH64_JSCVT:
> -  arg0 = CALL_EXPR_ARG (exp, 0);
> -  op0 = force_reg (DFmode, expand_normal (arg0));
> -  if (!target)
> - target = gen_reg_rtx (SImode);
> -  else
> - target = force_reg (SImode, target);
> -  emit_insn (GEN_FCN (CODE_FOR_aarch64_fjcvtzs) (target, op0));
> -  return target;
> +  {
> + expand_operand ops[2];
> + create_output_operand (&ops[0], target, SImode);
> + op0 = expand_normal (CALL_EXPR_ARG (exp, 0));
> + create_input_operand (&ops[1], op0, DFmode);
> + expand_insn (CODE_FOR_aarch64_fjcvtzs, 2, ops);
> + return ops[0].value;
> +  }
>  
>  case AARCH64_SIMD_BUILTIN_FCMLA_LANEQ0_V2SF:
>  case AARCH64_SIMD_BUILTIN_FCMLA_LANEQ90_V2SF:


RE: [PATCH] aarch64: Do not alter value on a force_reg returned rtx expanding __jcvt

2020-09-21 Thread Kyrylo Tkachov
Hi Andrea,

> -Original Message-
> From: Gcc-patches  On Behalf Of
> Richard Sandiford
> Sent: 21 September 2020 11:58
> To: Andrea Corallo 
> Cc: Richard Earnshaw ; nd ;
> gcc-patches@gcc.gnu.org
> Subject: Re: [PATCH] aarch64: Do not alter value on a force_reg returned rtx
> expanding __jcvt
> 
> Andrea Corallo  writes:
> > Hi all,
> >
> > From the `force_reg` description comment I see the returned register
> > should not be modified, thus IIUC should not be used as a GEN_FCN
> > target.
> >
> > Assuming my interpretation is correct, this fixes this case inside
> > `aarch64_general_expand_builtin` while expanding the
> > `__jcvt` intrinsic.  If that is not the case, please discard.
> 
> Good catch.

Can you please also backport it to the appropriate branches after some
time on trunk?
Thanks,
Kyrill

> 
> > Regtested and bootstrapped on aarch64-linux-gnu.
> >
> >   Andrea
> >
> > From 403ad66b8f9c108d7f38b406ed1afcb603b7e25f Mon Sep 17 00:00:00
> 2001
> > From: Andrea Corallo 
> > Date: Thu, 17 Sep 2020 17:17:52 +0100
> > Subject: [PATCH] aarch64: Do not alter value on a force_reg returned rtx
> >  expanding __jcvt
> >
> > 2020-09-17  Andrea Corallo  
> >
> > * config/aarch64/aarch64-builtins.c
> > (aarch64_general_expand_builtin): Use expand machinery not to
> > alter the value of an rtx returned by force_reg.
> 
> OK, thanks.
> 
> Richard
> 
> > ---
> >  gcc/config/aarch64/aarch64-builtins.c | 16 
> >  1 file changed, 8 insertions(+), 8 deletions(-)
> >
> > diff --git a/gcc/config/aarch64/aarch64-builtins.c
> b/gcc/config/aarch64/aarch64-builtins.c
> > index 4f33dd936c7..b787719cf5e 100644
> > --- a/gcc/config/aarch64/aarch64-builtins.c
> > +++ b/gcc/config/aarch64/aarch64-builtins.c
> > @@ -2128,14 +2128,14 @@ aarch64_general_expand_builtin (unsigned int
> fcode, tree exp, rtx target,
> >return target;
> >
> >  case AARCH64_JSCVT:
> > -  arg0 = CALL_EXPR_ARG (exp, 0);
> > -  op0 = force_reg (DFmode, expand_normal (arg0));
> > -  if (!target)
> > -   target = gen_reg_rtx (SImode);
> > -  else
> > -   target = force_reg (SImode, target);
> > -  emit_insn (GEN_FCN (CODE_FOR_aarch64_fjcvtzs) (target, op0));
> > -  return target;
> > +  {
> > +   expand_operand ops[2];
> > +   create_output_operand (&ops[0], target, SImode);
> > +   op0 = expand_normal (CALL_EXPR_ARG (exp, 0));
> > +   create_input_operand (&ops[1], op0, DFmode);
> > +   expand_insn (CODE_FOR_aarch64_fjcvtzs, 2, ops);
> > +   return ops[0].value;
> > +  }
> >
> >  case AARCH64_SIMD_BUILTIN_FCMLA_LANEQ0_V2SF:
> >  case AARCH64_SIMD_BUILTIN_FCMLA_LANEQ90_V2SF:


Re: [PATCH] gcov: fix TOPN streaming from shared libraries

2020-09-21 Thread Martin Liška

On 9/6/20 1:24 PM, Sergei Trofimovich wrote:

From: Sergei Trofimovich 

Before the change gcc did not stream TOPN counters correctly
if the counters belonged to a non-local shared object.

As a result zero-section optimization generated TOPN sections
in a form not recognizable by '__gcov_merge_topn'.

The problem happens because, with multiple shared objects, the
'__gcov_merge_topn' function is present in the address space multiple
times (once per object).

The fix is to never rely on function address and predicate on TOPN
counter types.


Hello.

Thank you for the analysis! I think it's the correct fix and it's probably
similar to what we used to see for indirect_call_tuple.

@Alexander: Am I right?

Thanks,
Martin
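
A minimal sketch of why the address comparison is fragile, for readers
following along.  Everything here is hypothetical: the enum values stand
in for the GCOV_COUNTER_* indices and the two empty functions stand in
for the per-object copies of the merge routine; none of it is the real
libgcov code.

```cpp
#include <cassert>

// Hypothetical counter kinds standing in for the GCOV_COUNTER_* indices.
enum counter_kind { KIND_ARCS, KIND_V_TOPN, KIND_V_INDIR, KIND_COUNT };

using merge_fn = void (*)(long *, unsigned);

// Two distinct functions standing in for the copies of one merge routine
// that end up in two different shared objects: same role, different
// addresses.
void merge_topn_in_object_a (long *, unsigned) {}
void merge_topn_in_object_b (long *, unsigned) {}

// Fragile: only recognises the copy that lives in "our" object.
bool
is_top_by_address (merge_fn fn)
{
  return fn == merge_topn_in_object_a;
}

// Robust: depends only on the counter index, as the fix does.
bool
is_top_by_kind (unsigned t_ix)
{
  return t_ix == KIND_V_TOPN || t_ix == KIND_V_INDIR;
}
```

The address test misses the copy from the other object, while the index
test gives the same answer no matter which object registered the counters.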



libgcc/ChangeLog:

PR gcov-profile/96913
* libgcov-driver.c (write_one_data): Avoid function pointer
comparison in TOP streaming decision.
---
  libgcc/libgcov-driver.c | 7 ++-
  1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/libgcc/libgcov-driver.c b/libgcc/libgcov-driver.c
index 58914268d4e..86a6b5ad68a 100644
--- a/libgcc/libgcov-driver.c
+++ b/libgcc/libgcov-driver.c
@@ -424,10 +424,15 @@ write_one_data (const struct gcov_info *gi_ptr,
  
  	  n_counts = ci_ptr->num;
  
-	  if (gi_ptr->merge[t_ix] == __gcov_merge_topn)
+ /* Do not zero-compress top counters because:
+  * - __gcov_merge_topn does not handle such sections
+  * - GCOV_COUNTER_V_INDIR contains non-zero keys
+  */
+ if (t_ix == GCOV_COUNTER_V_TOPN || t_ix == GCOV_COUNTER_V_INDIR)
write_top_counters (ci_ptr, t_ix, n_counts);
  else
{
+
  /* Do not stream when all counters are zero.  */
  int all_zeros = 1;
  for (unsigned i = 0; i < n_counts; i++)





Re: [PATCH] Cygwin/MinGW: Do not version lto plugins

2020-09-21 Thread Martin Liška

On 9/10/20 1:57 PM, JonY via Gcc-patches wrote:

On 9/10/20 9:44 AM, Richard Biener wrote:


I can confirm liblto is still loaded correctly from the logs, likewise
renaming it away will cause an error.

Seems to be fine on Linux.


OK then.

Thanks,
Richard.



Thanks for reviewing, pushed to master branch
ae6cf62861b5e9acb518b016ddbe7f783206f65f.



Hello.

I see the patch broke auto-loading support in binutils, which
automatically tries to load plugins from the bfd-plugins folder:

One example:
[  108s] ar cr libbuiltins.a builtins.o alias.o bind.o break.o builtin.o 
caller.o cd.o colon.o command.o common.o declare.o echo.o enable.o eval.o 
evalfile.o evalstring.o exec.o exit.o fc.o fg_bg.o hash.o help.o history.o 
jobs.o kill.o let.o mapfile.o pushd.o read.o return.o set.o setattr.o shift.o 
source.o suspend.o test.o times.o trap.o type.o ulimit.o umask.o wait.o 
getopts.o shopt.o printf.o getopt.o bashgetopt.o complete.o
[  108s] ar: builtins.o: plugin needed to handle lto object

Thanks,
Martin


Re: [PATCH] Fix ICE in tree-switch-conversion.

2020-09-21 Thread Richard Biener via Gcc-patches
On Mon, Sep 21, 2020 at 12:53 PM Martin Liška  wrote:
>
> With SVE we can end up with:
> switch (POLY_INT_CST [2, 2])  [INV], case 2:  [INV], case 
> 4:  [INV]>
> which is fine to expand and we can remove the assert.
>
> Ready to be installed?

OK.

Richard.

> Thanks,
> Martin
>
> gcc/ChangeLog:
>
> PR tree-optimization/96915
> * tree-switch-conversion.c (switch_conversion::expand): Accept
> also integer constants.
>
> gcc/testsuite/ChangeLog:
>
> PR tree-optimization/96915
> * gcc.target/aarch64/sve/pr96915.c: New test.
> ---
>   gcc/testsuite/gcc.target/aarch64/sve/pr96915.c | 11 +++
>   gcc/tree-switch-conversion.c   |  3 ---
>   2 files changed, 11 insertions(+), 3 deletions(-)
>   create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/pr96915.c
>
> diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pr96915.c 
> b/gcc/testsuite/gcc.target/aarch64/sve/pr96915.c
> new file mode 100644
> index 000..fae4cd42117
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/aarch64/sve/pr96915.c
> @@ -0,0 +1,11 @@
> +/* PR tree-optimization/96915 */
> +/* { dg-do compile } */
> +/* { dg-options "-O3 -march=armv8.2-a+sve" } */
> +
> +#pragma GCC aarch64 "arm_sve.h"
> +void b() {
> +  switch (svcntd())
> +  case 2:
> +  case 4:
> +b();
> +}
> diff --git a/gcc/tree-switch-conversion.c b/gcc/tree-switch-conversion.c
> index 4b435941d12..186411ff3c4 100644
> --- a/gcc/tree-switch-conversion.c
> +++ b/gcc/tree-switch-conversion.c
> @@ -984,9 +984,6 @@ switch_conversion::expand (gswitch *swtch)
>during gimplification).  */
> gcc_checking_assert (TREE_TYPE (m_index_expr) != error_mark_node);
>
> -  /* A switch on a constant should have been optimized in tree-cfg-cleanup.  
> */
> -  gcc_checking_assert (!TREE_CONSTANT (m_index_expr));
> -
> /* Prefer bit test if possible.  */
> if (tree_fits_uhwi_p (m_range_size)
> && bit_test_cluster::can_be_handled (tree_to_uhwi (m_range_size), 
> m_uniq)
> --
> 2.28.0
>


Re: [PATCH] Cygwin/MinGW: Do not version lto plugins

2020-09-21 Thread Richard Biener via Gcc-patches
On Mon, Sep 21, 2020 at 1:33 PM Martin Liška  wrote:
>
> On 9/10/20 1:57 PM, JonY via Gcc-patches wrote:
> > On 9/10/20 9:44 AM, Richard Biener wrote:
> >>>
> >>> I can confirm liblto is still loaded correctly from the logs, likewise
> >>> renaming it away will cause an error.
> >>>
> >>> Seems to be fine on Linux.
> >>
> >> OK then.
> >>
> >> Thanks,
> >> Richard.
> >>
> >
> > Thanks for reviewing, pushed to master branch
> > ae6cf62861b5e9acb518b016ddbe7f783206f65f.
> >
>
> Hello.
>
> I see the patch broke auto-loading support in binutils, which
> automatically tries to load plugins from the bfd-plugins folder:
>
> One example:
> [  108s] ar cr libbuiltins.a builtins.o alias.o bind.o break.o builtin.o 
> caller.o cd.o colon.o command.o common.o declare.o echo.o enable.o eval.o 
> evalfile.o evalstring.o exec.o exit.o fc.o fg_bg.o hash.o help.o history.o 
> jobs.o kill.o let.o mapfile.o pushd.o read.o return.o set.o setattr.o shift.o 
> source.o suspend.o test.o times.o trap.o type.o ulimit.o umask.o wait.o 
> getopts.o shopt.o printf.o getopt.o bashgetopt.o complete.o
> [  108s] ar: builtins.o: plugin needed to handle lto object

Isn't that eventually just because the 'gcc' package looks for
liblto_plugin.so.0.0.0 instead of liblto_plugin.so?

Richard.

> Thanks,
> Martin


Re: [PATCH] Cygwin/MinGW: Do not version lto plugins

2020-09-21 Thread Martin Liška

On 9/21/20 1:33 PM, Martin Liška wrote:

On 9/10/20 1:57 PM, JonY via Gcc-patches wrote:

On 9/10/20 9:44 AM, Richard Biener wrote:


I can confirm liblto is still loaded correctly from the logs, likewise
renaming it away will cause an error.

Seems to be fine on Linux.


OK then.

Thanks,
Richard.



Thanks for reviewing, pushed to master branch
ae6cf62861b5e9acb518b016ddbe7f783206f65f.



Hello.

I see the patch broke auto-loading support in binutils, which
automatically tries to load plugins from the bfd-plugins folder:

One example:
[  108s] ar cr libbuiltins.a builtins.o alias.o bind.o break.o builtin.o 
caller.o cd.o colon.o command.o common.o declare.o echo.o enable.o eval.o 
evalfile.o evalstring.o exec.o exit.o fc.o fg_bg.o hash.o help.o history.o 
jobs.o kill.o let.o mapfile.o pushd.o read.o return.o set.o setattr.o shift.o 
source.o suspend.o test.o times.o trap.o type.o ulimit.o umask.o wait.o 
getopts.o shopt.o printf.o getopt.o bashgetopt.o complete.o
[  108s] ar: builtins.o: plugin needed to handle lto object


Sorry, it's not caused by your patch. It's our SUSE-specific package setup.

Thanks,
Martin



Thanks,
Martin




Re: [PATCH] Cygwin/MinGW: Do not version lto plugins

2020-09-21 Thread Martin Liška

On 9/21/20 1:37 PM, Richard Biener wrote:

Isn't that eventually just because the 'gcc' package looks for
liblto_plugin.so.0.0.0 instead of liblto_plugin.so?


Yes.

Martin


[PATCH] tree-optimization/97135 - fix dependence check in store-motion

2020-09-21 Thread Richard Biener
The following fixes a dependence check in sm_seq_push_down, a place
where we cannot ignore self-dependences.

Bootstrapped / tested on x86_64-unknown-linux-gnu, pushed.

2020-09-21  Richard Biener  

PR tree-optimization/97135
* tree-ssa-loop-im.c (sm_seq_push_down): Do not ignore
self-dependences.

* gcc.dg/torture/pr97135.c: New testcase.
---
 gcc/testsuite/gcc.dg/torture/pr97135.c | 21 +
 gcc/tree-ssa-loop-im.c |  8 +---
 2 files changed, 26 insertions(+), 3 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/torture/pr97135.c

diff --git a/gcc/testsuite/gcc.dg/torture/pr97135.c 
b/gcc/testsuite/gcc.dg/torture/pr97135.c
new file mode 100644
index 000..223f4d05b85
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/torture/pr97135.c
@@ -0,0 +1,21 @@
+/* { dg-do run } */
+
+long long e, *d = &e;
+int a, b, c;
+
+int
+main ()
+{
+  for (; c <= 5; c++)
+for (b = 0; b <= 5; b++)
+  {
+   for (a = 1; a <= 5; a++)
+ ;
+   *d = 0;
+   if (c)
+ break;
+  }
+  if (a != 6)
+__builtin_abort ();
+  return 0;
+}
diff --git a/gcc/tree-ssa-loop-im.c b/gcc/tree-ssa-loop-im.c
index f87c287d742..139c7e76e66 100644
--- a/gcc/tree-ssa-loop-im.c
+++ b/gcc/tree-ssa-loop-im.c
@@ -2232,9 +2232,11 @@ sm_seq_push_down (vec &seq, unsigned ptr, 
unsigned *at)
  || (against.second == sm_other && against.from != NULL_TREE))
/* Found the tail of the sequence.  */
break;
-  if (!refs_independent_p (memory_accesses.refs_list[new_cand.first],
-  memory_accesses.refs_list[against.first],
-  false))
+  /* We may not ignore self-dependences here.  */
+  if (new_cand.first == against.first
+ || !refs_independent_p (memory_accesses.refs_list[new_cand.first],
+ memory_accesses.refs_list[against.first],
+ false))
/* ???  Prune new_cand from the list of refs to apply SM to.  */
return false;
   std::swap (new_cand, against);
-- 
2.26.2


Re: [PATCH] aarch64: Do not alter value on a force_reg returned rtx expanding __jcvt

2020-09-21 Thread Andrea Corallo
Kyrylo Tkachov  writes:

> Hi Andrea,
>
>> -Original Message-
>> From: Gcc-patches  On Behalf Of
>> Richard Sandiford
>> Sent: 21 September 2020 11:58
>> To: Andrea Corallo 
>> Cc: Richard Earnshaw ; nd ;
>> gcc-patches@gcc.gnu.org
>> Subject: Re: [PATCH] aarch64: Do not alter value on a force_reg returned rtx
>> expanding __jcvt
>>
>> Andrea Corallo  writes:
>> > Hi all,
>> >
>> > From the `force_reg` description comment I see the returned register
>> > should not be modified, thus IIUC should not be used as a GEN_FCN
>> > target.
>> >
>> > Assuming my interpretation is correct, this fixes this case inside
>> > `aarch64_general_expand_builtin` while expanding the
>> > `__jcvt` intrinsic.  If that is not the case, please discard.
>>
>> Good catch.
>
> Can you please also backport it to the appropriate branches after
> some time on trunk?
> Thanks,
> Kyrill

Ciao Kyrill,

Sure, happy to do that.  For now it is in trunk as 2c62952f816.

Thanks

  Andrea


Re: [PATCH] libstdc++: Fix division by zero in std::sample

2020-09-21 Thread Patrick Palka via Gcc-patches
On Fri, 18 Sep 2020, Patrick Palka wrote:

> This fixes a division by zero in the selection-sampling std::__search

Whoops, this line should say std::__sample, not std::__search.

> overload when the input range is empty (and hence __unsampled_sz is 0).
> 
> Tested on x86_64-pc-linux-gnu.
> 
> libstdc++-v3/ChangeLog:
> 
>   * include/bits/stl_algo.h (__sample): Exit early when the
>   input range is empty.
>   * testsuite/25_algorithms/sample/3.cc: New test.
> ---
>  libstdc++-v3/include/bits/stl_algo.h  |  3 ++
>  .../testsuite/25_algorithms/sample/3.cc   | 50 +++
>  2 files changed, 53 insertions(+)
>  create mode 100644 libstdc++-v3/testsuite/25_algorithms/sample/3.cc
> 
> diff --git a/libstdc++-v3/include/bits/stl_algo.h 
> b/libstdc++-v3/include/bits/stl_algo.h
> index a0b96c61798..2478b5857c1 100644
> --- a/libstdc++-v3/include/bits/stl_algo.h
> +++ b/libstdc++-v3/include/bits/stl_algo.h
> @@ -5775,6 +5775,9 @@ _GLIBCXX_BEGIN_NAMESPACE_ALGO
>using _Gen = remove_reference_t<_UniformRandomBitGenerator>;
>using __uc_type = common_type_t;
>  
> +  if (__first == __last)
> + return __out;
> +
>__distrib_type __d{};
>_Size __unsampled_sz = std::distance(__first, __last);
>__n = std::min(__n, __unsampled_sz);
> diff --git a/libstdc++-v3/testsuite/25_algorithms/sample/3.cc 
> b/libstdc++-v3/testsuite/25_algorithms/sample/3.cc
> new file mode 100644
> index 000..e89c40e27ee
> --- /dev/null
> +++ b/libstdc++-v3/testsuite/25_algorithms/sample/3.cc
> @@ -0,0 +1,50 @@
> +// Copyright (C) 2020 Free Software Foundation, Inc.
> +//
> +// This file is part of the GNU ISO C++ Library.  This library is free
> +// software; you can redistribute it and/or modify it under the
> +// terms of the GNU General Public License as published by the
> +// Free Software Foundation; either version 3, or (at your option)
> +// any later version.
> +
> +// This library is distributed in the hope that it will be useful,
> +// but WITHOUT ANY WARRANTY; without even the implied warranty of
> +// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> +// GNU General Public License for more details.
> +
> +// You should have received a copy of the GNU General Public License along
> +// with this library; see the file COPYING3.  If not see
> +// .
> +
> +// { dg-options "-std=gnu++17" }
> +// { dg-do run { target c++17 } }
> +// { dg-require-cstdint "" }
> +
> +#include 
> +#include 
> +#include 
> +#include 
> +
> +std::mt19937 rng;
> +
> +using std::sample;
> +using __gnu_test::test_container;
> +using __gnu_test::output_iterator_wrapper;
> +using __gnu_test::forward_iterator_wrapper;
> +
> +void
> +test01()
> +{
> +  const int in = 0;
> +  test_container pop(&in, &in);
> +  int out;
> +  test_container samp(&out, &out + 1);
> +
> +  auto it = sample(pop.begin(), pop.end(), samp.begin(), 1, rng);
> +  VERIFY( it.ptr == &out );
> +}
> +
> +int
> +main()
> +{
> +  test01();
> +}
> -- 
> 2.28.0.497.g54e85e7af1
> 
> 
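
For illustration only, here is a simplified selection-sampling sketch
showing where an empty input would hit a division by zero and how the
early exit avoids it.  This is not the libstdc++ code: the name
selection_sample_sketch is made up, and the real implementation uses the
"wide generator" check to draw two random numbers per call, which this
sketch computes but deliberately ignores.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <iterator>
#include <random>
#include <type_traits>
#include <vector>

// Hypothetical, simplified stand-in for the selection-sampling overload.
template <typename FwdIt, typename OutIt, typename Gen>
OutIt
selection_sample_sketch (FwdIt first, FwdIt last, OutIt out,
                         std::size_t n, Gen &gen)
{
  using uc_type = std::common_type_t<typename Gen::result_type, std::size_t>;

  // Early exit: with an empty input there is nothing to sample, and the
  // division further down would otherwise divide by zero.
  if (first == last)
    return out;

  uc_type unsampled = static_cast<uc_type> (std::distance (first, last));
  uc_type wanted = std::min<uc_type> (n, unsampled);

  // The kind of check that traps when unsampled is zero.  The sketch does
  // not use the result; it only shows where the division sits.
  const uc_type urng_range
    = static_cast<uc_type> (Gen::max ()) - static_cast<uc_type> (Gen::min ());
  const bool wide_generator = urng_range / unsampled >= unsampled;
  (void) wide_generator;

  // Keep each element with probability wanted / unsampled.
  for (; wanted != 0; ++first, --unsampled)
    {
      std::uniform_int_distribution<uc_type> pick (0, unsampled - 1);
      if (pick (gen) < wanted)
        {
          *out++ = *first;
          --wanted;
        }
    }
  return out;
}
```

With an empty population the function returns the output iterator
unchanged, which is the behaviour the new test checks for std::sample.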



[PATCH] x86: Require MMX for __builtin_ia32_maskmovq

2020-09-21 Thread H.J. Lu via Gcc-patches
Since "MASKMOVQ mm1, mm2" is an SSE instruction which requires MMX and
MMX/SSE ISAs are handled separately, make __builtin_ia32_maskmovq require
MMX instead of SSE.

gcc/

PR target/97140
* config/i386/i386-expand.c (ix86_expand_builtin): Require MMX
for __builtin_ia32_maskmovq.
* config/i386/mmx.md (mmx_maskmovq): Replace TARGET_SSE with
TARGET_MMX.
(*mmx_maskmovq): Likewise.

gcc/testsuite/

PR target/97140
* gcc.target/i386/pr97140.c: New test.
---
 gcc/config/i386/i386-expand.c   |  6 +-
 gcc/config/i386/mmx.md  |  4 ++--
 gcc/testsuite/gcc.target/i386/pr97140.c | 10 ++
 3 files changed, 17 insertions(+), 3 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/i386/pr97140.c

diff --git a/gcc/config/i386/i386-expand.c b/gcc/config/i386/i386-expand.c
index e6f8b314f18..e6285cf592e 100644
--- a/gcc/config/i386/i386-expand.c
+++ b/gcc/config/i386/i386-expand.c
@@ -10982,7 +10982,11 @@ ix86_expand_builtin (tree exp, rtx target, rtx 
subtarget,
== (OPTION_MASK_ISA_FMA | OPTION_MASK_ISA_FMA4))
   && (isa & (OPTION_MASK_ISA_FMA | OPTION_MASK_ISA_FMA4)) != 0)
 isa |= (OPTION_MASK_ISA_FMA | OPTION_MASK_ISA_FMA4);
-  if ((bisa & OPTION_MASK_ISA_MMX) && !TARGET_MMX && TARGET_MMX_WITH_SSE)
+  /* NB: __builtin_ia32_maskmovq requires MMX.  */
+  if (fcode != IX86_BUILTIN_MASKMOVQ
+  && (bisa & OPTION_MASK_ISA_MMX)
+  && !TARGET_MMX
+  && TARGET_MMX_WITH_SSE)
 {
   bisa &= ~OPTION_MASK_ISA_MMX;
   bisa |= OPTION_MASK_ISA_SSE2;
diff --git a/gcc/config/i386/mmx.md b/gcc/config/i386/mmx.md
index 7c9640d4f9f..610e4b591f7 100644
--- a/gcc/config/i386/mmx.md
+++ b/gcc/config/i386/mmx.md
@@ -2549,7 +2549,7 @@ (define_expand "mmx_maskmovq"
  (match_operand:V8QI 2 "register_operand")
  (match_dup 0)]
 UNSPEC_MASKMOV))]
-  "TARGET_SSE || TARGET_3DNOW_A")
+  "TARGET_MMX || TARGET_3DNOW_A")
 
 (define_insn "*mmx_maskmovq"
   [(set (mem:V8QI (match_operand:P 0 "register_operand" "D"))
@@ -2557,7 +2557,7 @@ (define_insn "*mmx_maskmovq"
  (match_operand:V8QI 2 "register_operand" "y")
  (mem:V8QI (match_dup 0))]
 UNSPEC_MASKMOV))]
-  "TARGET_SSE || TARGET_3DNOW_A"
+  "TARGET_MMX || TARGET_3DNOW_A"
   ;; @@@ check ordering of operands in intel/nonintel syntax
   "maskmovq\t{%2, %1|%1, %2}"
   [(set_attr "type" "mmxcvt")
diff --git a/gcc/testsuite/gcc.target/i386/pr97140.c 
b/gcc/testsuite/gcc.target/i386/pr97140.c
new file mode 100644
index 000..edb39d916ea
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr97140.c
@@ -0,0 +1,10 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -msse2 -mno-mmx -Wno-psabi" } */
+
+typedef int __m64 __attribute__ ((__vector_size__ (8), __may_alias__));
+typedef char __v8qi __attribute__ ((__vector_size__ (8)));
+void
+_mm_maskmove_si64 (__m64 __A, __m64 __N, char *__P)
+{
+__builtin_ia32_maskmovq ((__v8qi)__A, (__v8qi)__N, __P); /* { dg-error 
"needs isa option -msse -m3dnowa -mmmx" } */
+}
-- 
2.26.2
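
A rough model of the i386-expand.c change, in case it helps review.  The
masks, the builtin codes and the target_flags struct are hypothetical
stand-ins for OPTION_MASK_ISA_*, the IX86_BUILTIN_* codes and the
TARGET_* macros; only the shape of the condition follows the patch.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-ins for the OPTION_MASK_ISA_* bits.
constexpr std::uint64_t ISA_MMX = 1u << 0;
constexpr std::uint64_t ISA_SSE = 1u << 1;
constexpr std::uint64_t ISA_SSE2 = 1u << 2;

// Hypothetical builtin codes; BUILTIN_PADDB is just "some other MMX builtin".
enum builtin_code { BUILTIN_PADDB, BUILTIN_MASKMOVQ };

// Hypothetical stand-in for TARGET_MMX and TARGET_MMX_WITH_SSE.
struct target_flags
{
  bool mmx;
  bool mmx_with_sse;
};

// Return the ISA bits a builtin needs on the given target.  MMX builtins
// may be emulated with SSE2 when MMX is off, except maskmovq, which keeps
// its MMX requirement.
std::uint64_t
required_isa (builtin_code fcode, std::uint64_t bisa, target_flags t)
{
  if (fcode != BUILTIN_MASKMOVQ
      && (bisa & ISA_MMX)
      && !t.mmx
      && t.mmx_with_sse)
    {
      bisa &= ~ISA_MMX;
      bisa |= ISA_SSE2;
    }
  return bisa;
}
```

On a target without MMX but with the SSE emulation enabled, an ordinary
MMX builtin ends up requiring SSE2, while maskmovq still requires MMX and
so is diagnosed, which is what the new test expects.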



Re: [PATCH] Cygwin/MinGW: Do not version lto plugins

2020-09-21 Thread JonY via Gcc-patches
On 9/21/20 11:38 AM, Martin Liška wrote:
> Sorry, it's not caused by your patch. It's our SUSE-specific package setup.
> 

How does liblto_plugin.so.0.0.0 get loaded? I find only mentions of
liblto_plugin.so.

Is SUSE GCC patched to use the versioned library?





Re: [PATCH] x86: Require MMX for __builtin_ia32_maskmovq

2020-09-21 Thread H.J. Lu via Gcc-patches
On Mon, Sep 21, 2020 at 5:54 AM H.J. Lu  wrote:
>
> Since "MASKMOVQ mm1, mm2" is an SSE instruction which requires MMX and
> MMX/SSE ISAs are handled separately, make __builtin_ia32_maskmovq require
> MMX instead of SSE.
>
> gcc/
>
> PR target/97140
> * config/i386/i386-expand.c (ix86_expand_builtin): Require MMX
> for __builtin_ia32_maskmovq.
> * config/i386/mmx.md (mmx_maskmovq): Replace TARGET_SSE with
> TARGET_MMX.
> (*mmx_maskmovq): Likewise.
>
> gcc/testsuite/
>
> PR target/97140
> * gcc.target/i386/pr97140.c: New test.
> ---
>  gcc/config/i386/i386-expand.c   |  6 +-
>  gcc/config/i386/mmx.md  |  4 ++--
>  gcc/testsuite/gcc.target/i386/pr97140.c | 10 ++
>  3 files changed, 17 insertions(+), 3 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/i386/pr97140.c
>
> diff --git a/gcc/config/i386/i386-expand.c b/gcc/config/i386/i386-expand.c
> index e6f8b314f18..e6285cf592e 100644
> --- a/gcc/config/i386/i386-expand.c
> +++ b/gcc/config/i386/i386-expand.c
> @@ -10982,7 +10982,11 @@ ix86_expand_builtin (tree exp, rtx target, rtx 
> subtarget,
> == (OPTION_MASK_ISA_FMA | OPTION_MASK_ISA_FMA4))
>&& (isa & (OPTION_MASK_ISA_FMA | OPTION_MASK_ISA_FMA4)) != 0)
>  isa |= (OPTION_MASK_ISA_FMA | OPTION_MASK_ISA_FMA4);
> -  if ((bisa & OPTION_MASK_ISA_MMX) && !TARGET_MMX && TARGET_MMX_WITH_SSE)
> +  /* NB: __builtin_ia32_maskmovq requires MMX.  */
> +  if (fcode != IX86_BUILTIN_MASKMOVQ
> +  && (bisa & OPTION_MASK_ISA_MMX)
> +  && !TARGET_MMX
> +  && TARGET_MMX_WITH_SSE)
>  {
>bisa &= ~OPTION_MASK_ISA_MMX;
>bisa |= OPTION_MASK_ISA_SSE2;
> diff --git a/gcc/config/i386/mmx.md b/gcc/config/i386/mmx.md
> index 7c9640d4f9f..610e4b591f7 100644
> --- a/gcc/config/i386/mmx.md
> +++ b/gcc/config/i386/mmx.md
> @@ -2549,7 +2549,7 @@ (define_expand "mmx_maskmovq"
>   (match_operand:V8QI 2 "register_operand")
>   (match_dup 0)]
>  UNSPEC_MASKMOV))]
> -  "TARGET_SSE || TARGET_3DNOW_A")
> +  "TARGET_MMX || TARGET_3DNOW_A")
>
>  (define_insn "*mmx_maskmovq"
>[(set (mem:V8QI (match_operand:P 0 "register_operand" "D"))
> @@ -2557,7 +2557,7 @@ (define_insn "*mmx_maskmovq"
>   (match_operand:V8QI 2 "register_operand" "y")
>   (mem:V8QI (match_dup 0))]
>  UNSPEC_MASKMOV))]
> -  "TARGET_SSE || TARGET_3DNOW_A"
> +  "TARGET_MMX || TARGET_3DNOW_A"
>;; @@@ check ordering of operands in intel/nonintel syntax
>"maskmovq\t{%2, %1|%1, %2}"
>[(set_attr "type" "mmxcvt")

 Leave mmx.md alone since maskmovq isn't an MMX instruction.

-- 
H.J.
From 8c114567d93b55c83c56a04bb941ce6e3d635435 Mon Sep 17 00:00:00 2001
From: "H.J. Lu" 
Date: Mon, 21 Sep 2020 05:33:46 -0700
Subject: [PATCH] x86: Require MMX for __builtin_ia32_maskmovq

Since "MASKMOVQ mm1, mm2" is an SSE instruction which requires MMX and
MMX/SSE ISAs are handled separately, make __builtin_ia32_maskmovq require
MMX instead of SSE.

gcc/

	PR target/97140
	* config/i386/i386-expand.c (ix86_expand_builtin): Require MMX
	for __builtin_ia32_maskmovq.

gcc/testsuite/

	PR target/97140
	* gcc.target/i386/pr97140.c: New test.
---
 gcc/config/i386/i386-expand.c   |  6 +-
 gcc/testsuite/gcc.target/i386/pr97140.c | 10 ++
 2 files changed, 15 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.target/i386/pr97140.c

diff --git a/gcc/config/i386/i386-expand.c b/gcc/config/i386/i386-expand.c
index e6f8b314f18..e6285cf592e 100644
--- a/gcc/config/i386/i386-expand.c
+++ b/gcc/config/i386/i386-expand.c
@@ -10982,7 +10982,11 @@ ix86_expand_builtin (tree exp, rtx target, rtx subtarget,
== (OPTION_MASK_ISA_FMA | OPTION_MASK_ISA_FMA4))
   && (isa & (OPTION_MASK_ISA_FMA | OPTION_MASK_ISA_FMA4)) != 0)
 isa |= (OPTION_MASK_ISA_FMA | OPTION_MASK_ISA_FMA4);
-  if ((bisa & OPTION_MASK_ISA_MMX) && !TARGET_MMX && TARGET_MMX_WITH_SSE)
+  /* NB: __builtin_ia32_maskmovq requires MMX.  */
+  if (fcode != IX86_BUILTIN_MASKMOVQ
+  && (bisa & OPTION_MASK_ISA_MMX)
+  && !TARGET_MMX
+  && TARGET_MMX_WITH_SSE)
 {
   bisa &= ~OPTION_MASK_ISA_MMX;
   bisa |= OPTION_MASK_ISA_SSE2;
diff --git a/gcc/testsuite/gcc.target/i386/pr97140.c b/gcc/testsuite/gcc.target/i386/pr97140.c
new file mode 100644
index 000..edb39d916ea
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr97140.c
@@ -0,0 +1,10 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -msse2 -mno-mmx -Wno-psabi" } */
+
+typedef int __m64 __attribute__ ((__vector_size__ (8), __may_alias__));
+typedef char __v8qi __attribute__ ((__vector_size__ (8)));
+void
+_mm_maskmove_si64 (__m64 __A, __m64 __N, char *__P)
+{
+__builtin_ia32_maskmovq ((__v8qi)__A, (__v8qi)__N, __P); /* { dg-error "needs isa option -msse -m3dnowa -mmmx" } */
+}
-- 
2.26.2



Re: [PATCH] Cygwin/MinGW: Do not version lto plugins

2020-09-21 Thread Martin Liška

On 9/21/20 3:09 PM, JonY wrote:

On 9/21/20 11:38 AM, Martin Liška wrote:

Sorry, it's not caused by your patch. It's our SUSE-specific package setup.



How does liblto_plugin.so.0.0.0 get loaded? I find only mentions of
liblto_plugin.so.


We make a symlink to bfd-plugins folder.



Is SUSE GCC patched to use the versioned library?



No, but we create the aforementioned symlink.

Martin


[committed] libstdc++: Make std::assume_aligned a constexpr function [PR 97132]

2020-09-21 Thread Jonathan Wakely via Gcc-patches
The cast from void* to T* in std::assume_aligned is not valid in a
constexpr function. The optimization hint is redundant during constant
evaluation anyway (the compiler can see the object and knows its
alignment). Simply return the original pointer without applying the
__builtin_assume_aligned hint to it when doing constant evaluation.

This change also removes the preprocessor branch that works around
uintptr_t not being available. We already assume that type is present
elsewhere in the library.

libstdc++-v3/ChangeLog:

PR libstdc++/97132
* include/bits/align.h (align) [!_GLIBCXX_USE_C99_STDINT_TR1]:
Remove unused code.
(assume_aligned): Do not use __builtin_assume_aligned during
constant evaluation.
* testsuite/20_util/assume_aligned/1.cc: Improve test.
* testsuite/20_util/assume_aligned/97132.cc: New test.

Tested powerpc64le-linux. Committed to trunk.

This should be backported to gcc-9 and gcc-10 too.
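
For readers who want the idea in isolation, a small sketch of the same
pattern outside the library.  The name assume_aligned_sketch is made up,
and it uses the GCC/Clang builtin __builtin_is_constant_evaluated instead
of std::is_constant_evaluated so that it does not depend on C++20 library
support; it is not the code in bits/align.h.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for std::assume_aligned.
template <std::size_t Align, typename T>
constexpr T *
assume_aligned_sketch (T *ptr) noexcept
{
  static_assert (Align != 0 && (Align & (Align - 1)) == 0,
                 "Align must be a power of two");
  if (__builtin_is_constant_evaluated ())
    // During constant evaluation the hint is redundant and the cast from
    // void* below would not be allowed, so return the pointer untouched.
    return ptr;
  else
    return static_cast<T *> (__builtin_assume_aligned (ptr, Align));
}

// Usable in a constant expression because the cast is never evaluated there.
constexpr int g_value = 42;
static_assert (*assume_aligned_sketch<alignof (int)> (&g_value) == 42,
               "usable in a constant expression");
```

At run time the same call goes through the builtin and returns the same
pointer value, now carrying the alignment hint for the optimizer.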

commit f10ed928e2f8ecc2c859abff8f2f9296b11b8d95
Author: Jonathan Wakely 
Date:   Mon Sep 21 14:28:58 2020

libstdc++: Make std::assume_aligned a constexpr function [PR 97132]

The cast from void* to T* in std::assume_aligned is not valid in a
constexpr function. The optimization hint is redundant during constant
evaluation anyway (the compiler can see the object and knows its
alignment). Simply return the original pointer without applying the
__builtin_assume_aligned hint to it when doing constant evaluation.

This change also removes the preprocessor branch that works around
uintptr_t not being available. We already assume that type is present
elsewhere in the library.

libstdc++-v3/ChangeLog:

PR libstdc++/97132
* include/bits/align.h (align) [!_GLIBCXX_USE_C99_STDINT_TR1]:
Remove unused code.
(assume_aligned): Do not use __builtin_assume_aligned during
constant evaluation.
* testsuite/20_util/assume_aligned/1.cc: Improve test.
* testsuite/20_util/assume_aligned/97132.cc: New test.

diff --git a/libstdc++-v3/include/bits/align.h 
b/libstdc++-v3/include/bits/align.h
index c3267f22934..faa92bec2f8 100644
--- a/libstdc++-v3/include/bits/align.h
+++ b/libstdc++-v3/include/bits/align.h
@@ -41,7 +41,6 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 
 /**
  *  @brief Fit aligned storage in buffer.
- *  @ingroup memory
  *
  *  This function tries to fit @a __size bytes of storage with alignment
  *  @a __align into the buffer @a __ptr of size @a __space bytes.  If such
@@ -56,18 +55,12 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
  *  @param __space   Size of the buffer pointed to by @a __ptr.
  *  @return the updated pointer if the aligned storage fits, otherwise nullptr.
  *
+ *  @ingroup memory
  */
 inline void*
 align(size_t __align, size_t __size, void*& __ptr, size_t& __space) noexcept
 {
-#ifdef _GLIBCXX_USE_C99_STDINT_TR1
   const auto __intptr = reinterpret_cast(__ptr);
-#else
-  // Cannot use std::uintptr_t so assume that std::size_t can be used instead.
-  static_assert(sizeof(size_t) >= sizeof(void*),
-  "std::size_t must be a suitable substitute for std::uintptr_t");
-  const auto __intptr = reinterpret_cast(__ptr);
-#endif
   const auto __aligned = (__intptr - 1u + __align) & -__align;
   const auto __diff = __aligned - __intptr;
   if ((__size + __diff) > __space)
@@ -86,15 +79,26 @@ align(size_t __align, size_t __size, void*& __ptr, size_t& 
__space) noexcept
*  @tparam _Align An alignment value (i.e. a power of two)
*  @tparam _Tp    An object type
*  @param  __ptr  A pointer that is aligned to _Align
+   *
+   *  C++20 20.10.6 [ptr.align]
+   *
*  @ingroup memory
*/
   template
 [[nodiscard,__gnu__::__always_inline__]]
-constexpr _Tp* assume_aligned(_Tp* __ptr)
+constexpr _Tp*
+assume_aligned(_Tp* __ptr) noexcept
 {
   static_assert(std::has_single_bit(_Align));
-  _GLIBCXX_DEBUG_ASSERT((std::uintptr_t)__ptr % _Align == 0);
-  return static_cast<_Tp*>(__builtin_assume_aligned(__ptr, _Align));
+  if (std::is_constant_evaluated())
+   return __ptr;
+  else
+   {
+ // This function is expected to be used in hot code, where
+ // __glibcxx_assert would add unwanted overhead.
+ _GLIBCXX_DEBUG_ASSERT((uintptr_t)__ptr % _Align == 0);
+ return static_cast<_Tp*>(__builtin_assume_aligned(__ptr, _Align));
+   }
 }
 #endif // C++2a
 
diff --git a/libstdc++-v3/testsuite/20_util/assume_aligned/1.cc 
b/libstdc++-v3/testsuite/20_util/assume_aligned/1.cc
index 1a34cc4bc63..16bf22caefe 100644
--- a/libstdc++-v3/testsuite/20_util/assume_aligned/1.cc
+++ b/libstdc++-v3/testsuite/20_util/assume_aligned/1.cc
@@ -15,7 +15,7 @@
 // with this library; see the file COPYING3.  If not see
 // .
 
-// { dg-options "-std=gnu++2a" }
+// { dg-options "-std=gnu++2a -O2" }
 // { dg-do run { target c++2a } }
 

[committed] libstdc++: Relax constraints on transform_view and elements_view iterators

2020-09-21 Thread Jonathan Wakely via Gcc-patches
libstdc++-v3/ChangeLog:

* include/std/ranges (transform_view, elements_view): Relax
constraints on operator- for iterators, as per LWG 3483.
* testsuite/std/ranges/adaptors/elements.cc: Check that we
can take the difference of two iterators from a non-random
access range.
* testsuite/std/ranges/adaptors/transform.cc: Likewise.

Tested powerpc64le-linux. Committed to trunk.

I'll backport this to gcc-10 too.

commit 2ec58cfcea146a61755516ce4ed160827fe0b4ff
Author: Jonathan Wakely 
Date:   Mon Sep 21 14:30:38 2020

libstdc++: Relax constraints on transform_view and elements_view iterators

libstdc++-v3/ChangeLog:

* include/std/ranges (transform_view, elements_view): Relax
constraints on operator- for iterators, as per LWG 3483.
* testsuite/std/ranges/adaptors/elements.cc: Check that we
can take the difference of two iterators from a non-random
access range.
* testsuite/std/ranges/adaptors/transform.cc: Likewise.

diff --git a/libstdc++-v3/include/std/ranges b/libstdc++-v3/include/std/ranges
index 23a04d61174..005e89f94b2 100644
--- a/libstdc++-v3/include/std/ranges
+++ b/libstdc++-v3/include/std/ranges
@@ -1833,9 +1833,11 @@ namespace views
requires random_access_range<_Base>
  { return {*__i._M_parent, __i._M_current - __n}; }
 
+ // _GLIBCXX_RESOLVE_LIB_DEFECTS
+ // 3483. transform_view::iterator's difference is overconstrained
  friend constexpr difference_type
  operator-(const _Iterator& __x, const _Iterator& __y)
-   requires random_access_range<_Base>
+   requires sized_sentinel_for<iterator_t<_Base>, iterator_t<_Base>>
  { return __x._M_current - __y._M_current; }
 
  friend constexpr decltype(auto)
@@ -3538,9 +3540,11 @@ namespace views
requires random_access_range<_Base>
  { return _Iterator{__x} -= __y; }
 
+ // _GLIBCXX_RESOLVE_LIB_DEFECTS
+ // 3483. transform_view::iterator's difference is overconstrained
  friend constexpr difference_type
  operator-(const _Iterator& __x, const _Iterator& __y)
-   requires random_access_range<_Base>
+   requires sized_sentinel_for<iterator_t<_Base>, iterator_t<_Base>>
  { return __x._M_current - __y._M_current; }
 
  friend _Sentinel<_Const>;
diff --git a/libstdc++-v3/testsuite/std/ranges/adaptors/elements.cc 
b/libstdc++-v3/testsuite/std/ranges/adaptors/elements.cc
index 3026adf4f28..94dd7c94505 100644
--- a/libstdc++-v3/testsuite/std/ranges/adaptors/elements.cc
+++ b/libstdc++-v3/testsuite/std/ranges/adaptors/elements.cc
@@ -66,9 +66,33 @@ test02()
   VERIFY( ranges::equal(v2, (std::pair[]){{1,2}}) );
 }
 
+struct X
+{
+  using Iter = __gnu_test::forward_iterator_wrapper>;
+
+  friend auto operator-(Iter l, Iter r) { return l.ptr - r.ptr; }
+};
+
+void
+test03()
+{
+  // LWG 3483
+  std::pair x[3];
+  __gnu_test::test_forward_range> r(x);
+  auto v = views::elements<1>(r);
+  auto b = begin(v);
+  static_assert( !ranges::random_access_range );
+  static_assert( std::sized_sentinel_for );
+  VERIFY( (next(b, 1) - b) == 1 );
+  const auto v_const = v;
+  auto b_const = begin(v_const);
+  VERIFY( (next(b_const, 2) - b_const) == 2 );
+}
+
 int
 main()
 {
   test01();
   test02();
+  test03();
 }
diff --git a/libstdc++-v3/testsuite/std/ranges/adaptors/transform.cc 
b/libstdc++-v3/testsuite/std/ranges/adaptors/transform.cc
index c14e36e0cef..41a7d3b3321 100644
--- a/libstdc++-v3/testsuite/std/ranges/adaptors/transform.cc
+++ b/libstdc++-v3/testsuite/std/ranges/adaptors/transform.cc
@@ -122,6 +122,29 @@ test05()
   b = ranges::end(v);
 }
 
+struct Y
+{
+  using Iter = __gnu_test::forward_iterator_wrapper;
+
+  friend auto operator-(Iter l, Iter r) { return l.ptr - r.ptr; }
+};
+
+void
+test06()
+{
+  // LWG 3483
+  Y y[3];
+  __gnu_test::test_forward_range r(y);
+  auto v = views::transform(r, std::identity{});
+  auto b = begin(v);
+  static_assert( !ranges::random_access_range );
+  static_assert( std::sized_sentinel_for );
+  VERIFY( (next(b, 1) - b) == 1 );
+  const auto v_const = v;
+  auto b_const = begin(v_const);
+  VERIFY( (next(b_const, 2) - b_const) == 2 );
+}
+
 int
 main()
 {
@@ -130,4 +153,5 @@ main()
   test03();
   test04();
   test05();
+  test06();
 }


*PING* [PATCH] doc: gcc.c: Update documentation for spec files

2020-09-21 Thread Armin Brauns via Gcc-patches
Ping: https://gcc.gnu.org/pipermail/gcc-patches/2020-September/553321.html

On 06/09/2020 17.23, Armin Brauns wrote:
> There were some differences between the actual code in do_spec_1, its
> source comment, and the documentation in doc/invoke.texi. These should
> now be resolved.
>



Re: [PATCH] dwarf: Multi-register CFI address support

2020-09-21 Thread Andrew Stubbs

Ping.

On 03/09/2020 16:29, Andrew Stubbs wrote:

On 28/08/2020 13:04, Andrew Stubbs wrote:

Hi all,

This patch introduces DWARF CFI support for architectures that require 
multiple registers to hold pointers, such as the stack pointer, frame 
pointer, and return address. The motivating case is the AMD GCN 
architecture which has 64-bit address pointers, but 32-bit registers.


The current implementation permits program variables to span as many 
registers as they need, but assumes that CFI expressions will only 
need a single register for each frame value.


To be fair, the DWARF standard makes a similar assumption; the 
engineers working on LLVM and GDB, at AMD, have therefore invented 
some new DWARF operators that they plan to propose for a future 
standard. Only one is relevant here, however: DW_OP_LLVM_piece_end. 
(Unfortunately this clashes with an AArch64 extension, but I think we 
can cope using an alias -- only GCC dumps will be confusing.)


My approach is to change the type representing a DWARF register 
throughout the CFI code. This permits the register span information to 
propagate to where it is needed.


I've taken advantage of C++ struct copies and operator== to minimize 
the amount of refactoring required. I'm not sure this meets the GCC 
guidelines exactly, but if not I can change that once the basic form 
is agreed. (I also considered an operator= to make assigning single 
dwreg values transparent, but that hid too many invalid assumptions.)


OK to commit? (Although, I'll hold off until AMD release the 
compatible GDB.)


Minor patch update, following Tom's feedback.

Andrew




Re: [PATCH] irange_pool class

2020-09-21 Thread Andrew MacLeod via Gcc-patches

On 9/19/20 4:32 PM, Martin Sebor wrote:

On 9/18/20 3:09 PM, Andrew MacLeod wrote:

On 9/18/20 4:35 PM, Martin Sebor wrote:
Do you really need 6 or 10 subranges to find out the answer to the 
questions you are asking?  Most of the time, 2 or 3 pairs carry all 
the information anyone needs, and that is efficient.  Switches 
are the biggest beneficiary of multiple ranges, which allow us to be 
quite precise about what reaches the interior of a case or the default.


The next question is, how many of these do you need?  The ranger is 
doing it with its allocator because it could in theory have #BB * 
#SSA_NAMES ranges, which could be a lot.  If you have just one or two 
vectors over SSA names, and they are sparsely filled, just use 
int_range_max.


The use case I'm thinking of would have an irange of some size for
every decl or result of an allocation call accessed in a function,
plus another irange for every SSA_NAME that's an object pointer.
The size of the first irange would be that of the allocation
argument in the first case.  In the second case it would be
the combined range of the offsets of the pointer from whatever it
points to (e.g., in p0 = &x; p1 = p0 + i; p2 = p1 + j; p2's
offset range would be Ri + Rj where R is the value range of i
or j).

It probably doesn't make sense to keep track of 255 subranges
(or even many more than 5) but in compliance with the guidance
in the irange best practices document to write code for [as if]
infinite subranges, the data structures should be free of any
hardwired limit.  So I envision I might have something like
a pair of dynamic_range members in each of these objects (along
with the SSA_NAME and other stuff), and some runtime parameter
to cap the number of subranges to some reasonable limit, merging
those in excess of it.
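
The capping idea in the paragraph above can be sketched in isolation. This is only an illustration under stated assumptions: `Subrange` and `cap_subranges` are hypothetical names, the merge policy (join the two neighbours with the smallest gap) is one possible choice, and none of it is GCC's irange API.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical illustration only; this is not GCC's irange API.
// Subranges are closed intervals [first, second], sorted and disjoint.
using Subrange = std::pair<std::int64_t, std::int64_t>;

// Reduce R to at most CAP subranges by repeatedly merging the two
// neighbouring subranges that are separated by the smallest gap.
inline void cap_subranges(std::vector<Subrange>& r, std::size_t cap)
{
  assert(cap >= 1);
  while (r.size() > cap)
    {
      // Find the adjacent pair separated by the smallest gap.
      std::size_t best = 0;
      for (std::size_t i = 1; i + 1 < r.size(); ++i)
        if (r[i + 1].first - r[i].second
            < r[best + 1].first - r[best].second)
          best = i;
      // Merge the pair; the result is a conservative superset of the input.
      r[best].second = r[best + 1].second;
      r.erase(r.begin() + best + 1);
    }
}
```

Merging only ever widens the set of values, so the capped range stays a safe over-approximation; the cost is lost precision in the gaps that were swallowed.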



Furthermore, there are 2 other things at play.

1)  The nature of the ranger is that it stores everything, and you just 
need to ask for the range.  If it's an ssa_name, unless you are adjusting 
the range somehow, the ranger is already storing it, so all you need to 
do is ask for it when you want it, and it's readily available any time.
   Given this, *most* of the time passes shouldn't need to actually 
store a range; you just retrieve it when you want it.  I do not believe 
any of the passes Aldy converted required storing ranges.  However, I do 
recognize there are going to be times when a pass may need to store a 
range or associate something else with one, thus we exposed the 
functionality earlier than I was going to.


2) We have taken a significant performance hit by converting irange to 
be represented with trees rather than the original wide_int 
implementation.  At some point (maybe sooner rather than later), I'd like 
to go back to the wide_int internal representation.  When we do so, storage 
needs will go up considerably.  Up until the "merge" of value_range and 
irange to trees, we actually had another object called irange_storage 
which was a memory-efficient representation of ranges for longer-term 
storage.
  If/when we were to switch back to wide_int, the pool allocator would 
then return an irange_storage object rather than an irange *.  It 
would not be ideal for an irange * or any kind of int_range to be 
kept in memory by any pass; it should rather be stored to memory through 
the irange_storage class.


The basic principle we used was to use int_range_max to load the 
range from storage, manipulate it and get results, then store back through 
the irange_storage class.  We have for the moment dropped the irange_storage 
class since it would simply be a typedef of an "irange *" today, and 
so it just looked like noise with no way to enforce a behaviour.


So I would encourage use of the allocator for any kind of longer-term 
storage if it's really needed, as it will be a much simpler translation 
if/when we make the switch back.


Most passes that need storage should surely be able to create an 
allocator for the pass and make use of it.  The pass has to create a 
ranger, so the allocator would have the same scope as the ranger.  We could 
potentially expose allocation from the ranger's own allocator, but that 
shouldn't be necessary: if you can create a ranger, you can create an 
allocator if it is needed.


Andrew



Re: PING [Patch][Middle-end]Add -fzero-call-used-regs=[skip|used-gpr|all-gpr|used|all]

2020-09-21 Thread Qing Zhao via Gcc-patches



> On Sep 18, 2020, at 5:51 PM, Segher Boessenkool  
> wrote:
> 
> Hi!
> 
> On Fri, Sep 18, 2020 at 03:31:12PM -0500, Qing Zhao wrote:
>> Let me know your opinion:
>> 
>> A.  Will not provide default definition in middle end to generate the 
>> zeroing insn for selected registers.  Move the generation work all to 
>> target; X86 implementation will be provided;
>> 
>> OR:
>> 
>> B.  Will provide a default definition in middle end to generate the zeroing 
>> insn for selected registers. Then need to add a new target hook 
>> “ZERO_CALL_USED_REGNO_P(REGNO, GPR_ONLY)”, same as A, X86 implementation 
>> will be provided in my patch. 
> 
> Is this just to make the xor thing work?  i386 has a peephole to
> transform the mov to a xor for this (and the backend could just handle
> it in its mov patterns, maybe a peephole was easier for i386, no
> idea).

You mean what’s the purpose of the new target hook 
“ZERO_CALL_USED_REGNO_P(REGNO, GPR_ONLY)”?

The purpose of this new target hook is to let the target delete some of the 
call_used registers that should not be zeroed, for example, the stack registers 
(st0-st7) on X86. 
For other platforms, there might be other call_used registers that should not 
be zeroed. 

Qing

> 
> 
> Segher



Re: PING [Patch][Middle-end]Add -fzero-call-used-regs=[skip|used-gpr|all-gpr|used|all]

2020-09-21 Thread Qing Zhao via Gcc-patches



> On Sep 21, 2020, at 2:23 AM, Richard Sandiford  
> wrote:
> 
> Qing Zhao <qing.z...@oracle.com> writes:
>> Hi, Richard,
>> 
>> During my implementation of the new version of the patch. I still feel that 
>> it’s not practical to add a default definition in the middle end to just use 
>> move patterns to zero each selected register. 
>> 
>> The major issues are:
>> 
>> There are some target specific information on how to define “general 
>> register” set and “all register” set,  we have to add a new specific target 
>> hook to get such target specific information and pass to middle-end. 
> 
> GENERAL_REGS and ALL_REGS are already concepts that target-independent
> code knows about though.  I think the non-fixed subsets of those would
> make good starting sets, which the target could whittle down if it wanted
> or needed to.

Yes, this is what I am currently doing:  

First, the middle end computes the initial need_zeroed_hardregs based on user 
request, data flow, and function abi. Then pass this “need_zeroed_hardregs” to 
target hook;
Then, the target hook will delete some of the registers that should not be 
zeroed in that specific target from “need_zeroed_hardregs”, for example, 
stack_regs on x86.

> 
>> For example, on X86, for CALL_USED_REGISTERS, we have:
>> 
>> #define CALL_USED_REGISTERS \
>> /*ax,dx,cx,bx,si,di,bp,sp,st,st1,st2,st3,st4,st5,st6,st7*/  \
>> {  1, 1, 1, 0, 4, 4, 0, 1, 1,  1,  1,  1,  1,  1,  1,  1,   \
>> /*arg,flags,fpsr,frame*/\
>>1,   1,1,1, \
>> /*xmm0,xmm1,xmm2,xmm3,xmm4,xmm5,xmm6,xmm7*/ \
>> 1,   1,   1,   1,   1,   1,   6,   6,  \
>> /* mm0, mm1, mm2, mm3, mm4, mm5, mm6, mm7*/   
>> 
>> 
>> From the above, we can see “st0 to st7” are call_used_registers for x86, 
>> however, we should not zero these registers on x86. 
>> 
>> Such details are known only to the x86 backend. 
>> 
>> I guess that other platforms might have similar issues. 
> 
> They might, but that doesn't disprove that there's a sensible default
> choice that works for most targets.
> 
> FWIW, stack registers themselves are already exposed outside targets
> (see reg-stack.c, although since x86 is the only port that uses it,
> the main part of it is effectively target-dependent at the moment).
> Similarly for register windows.

Yes, the stack_regs currently can be referenced as STACK_REG_P in middle end. 
So for X86, we might be able to identify this in middle end.

However, my major concern is other platforms that we are not very familiar 
with: there might be some special registers on such a platform that should not be 
zeroed, and currently there is no way to identify them in the middle end.

For such platforms, the default handler will not be correct. 
> 
>> If we still want  a default definition in middle end to generate the zeroing 
>> insn for selected registers, I have to add another target hook, say, 
>> “ZERO_CALL_USED_REGNO_P(REGNO, GPR_ONLY)” to check whether a register should 
>> be zeroed based on gpr_only (general register only)  and target specific 
>> decision.   I will provide a x86 implementation for this target hook in this 
>> patch. 
>> 
>> Other targets have to implement this new target hook to utilize the default 
>> handler. 
>> 
>> Let me know your opinion:
>> 
>> A.  Will not provide default definition in middle end to generate the 
>> zeroing insn for selected registers.  Move the generation work all to 
>> target; X86 implementation will be provided;
>> 
>> OR:
>> 
>> B.  Will provide a default definition in middle end to generate the zeroing 
>> insn for selected registers. Then need to add a new target hook 
>> “ZERO_CALL_USED_REGNO_P(REGNO, GPR_ONLY)”, same as A, X86 implementation 
>> will be provided in my patch. 
> 
> The kind of target hook interface I was thinking of was:
> 
>  HARD_REG_SET TARGET_EMIT_MOVE_ZEROS (const HARD_REG_SET &regs)
> 
> which:
> 
> - emits zeroing instructions for some target-specific subset of REGS
> 
> - returns the set of registers that were actually cleared
> 
> The default implementation would clear all registers in REGS,
> using reg_raw_mode[R] as the mode for register R.  Targets could
> then override the hook and:
> 
> - drop registers that shouldn't be cleared
> 
> - handle some or all of the remaining registers in a more optimal,
>  target-specific way
> 
> The targets could then use the default implementation of the hook
> to handle any residue.  E.g. the default implementation would be
> able to handle general registers on x86.
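
The shape of the hook described in the quoted text can be modelled outside of GCC. The sketch below is only an illustration under stated assumptions: `RegSet`, `Emitter`, and both functions are hypothetical stand-ins rather than GCC types or hooks, and the 16-register machine with registers 8..15 acting as "stack registers" is invented for the example.

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-ins, not GCC types: a 16-register machine where
// registers 8..15 play the role of registers that must not be zeroed.
constexpr std::size_t kNumRegs = 16;
using RegSet = std::bitset<kNumRegs>;

// Records which registers had a "zeroing instruction" emitted.
struct Emitter { std::vector<std::size_t> zeroed; };

// Default hook: clear every register in REGS and report all of them.
inline RegSet default_emit_move_zeros(Emitter& out, const RegSet& regs)
{
  for (std::size_t r = 0; r < kNumRegs; ++r)
    if (regs.test(r))
      out.zeroed.push_back(r);
  return regs;
}

// Target override: drop registers that must not be cleared, then let the
// default implementation handle the residue.
inline RegSet target_emit_move_zeros(Emitter& out, const RegSet& regs)
{
  RegSet stack_regs;
  for (std::size_t r = 8; r < kNumRegs; ++r)
    stack_regs.set(r);
  return default_emit_move_zeros(out, regs & ~stack_regs);
}
```

Because the hook returns the set that was actually cleared, a caller can compare it with the requested set and see which registers the target declined to zero.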

Even for the general registers on X86, we need some special handling for 
optimal code generation; for example, we might want to turn 
a “mov” into an “xor” on X86.

My major concern with the default implementation of the hook is:

If a target has some special registers that should not be zeroed, and we do not 
provide an overridden implementa

Re: [PATCH] libstdc++: Rebase include/pstl to current upstream

2020-09-21 Thread Jonathan Wakely via Gcc-patches

On 15/09/20 20:35 -0700, Thomas Rodgers wrote:

From: Thomas Rodgers 

From llvm-project/pstl @ 0b2e0e80d96

libstdc++-v3/ChangeLog:

* include/pstl/algorithm_impl.h: Update file.
* include/pstl/execution_impl.h: Likewise.
* include/pstl/glue_algorithm_impl.h: Likewise.
* include/pstl/glue_memory_impl.h: Likewise.
* include/pstl/glue_numeric_impl.h: Likewise.
* include/pstl/memory_impl.h: Likewise.
* include/pstl/numeric_impl.h: Likewise.
* include/pstl/parallel_backend.h: Likewise.
* include/pstl/parallel_backend_serial.h: Likewise.
* include/pstl/parallel_backend_tbb.h: Likewise.
* include/pstl/parallel_backend_utils.h: Likewise.
* include/pstl/pstl_config.h: Likewise.
* include/pstl/unseq_backend_simd.h: Likewise.
---
libstdc++-v3/include/pstl/algorithm_impl.h| 181 ++--
libstdc++-v3/include/pstl/execution_impl.h|   4 +-
.../include/pstl/glue_algorithm_impl.h| 543 +--
libstdc++-v3/include/pstl/glue_memory_impl.h  | 264 ++---
libstdc++-v3/include/pstl/glue_numeric_impl.h |  68 +-
libstdc++-v3/include/pstl/memory_impl.h   |  67 +-
libstdc++-v3/include/pstl/numeric_impl.h  |   8 +-
libstdc++-v3/include/pstl/parallel_backend.h  |   8 +
.../include/pstl/parallel_backend_serial.h|   8 +-
.../include/pstl/parallel_backend_tbb.h   | 903 +++---
.../include/pstl/parallel_backend_utils.h | 248 +++--
libstdc++-v3/include/pstl/pstl_config.h   |  24 +-
.../include/pstl/unseq_backend_simd.h |  39 +-
13 files changed, 1586 insertions(+), 779 deletions(-)

diff --git a/libstdc++-v3/include/pstl/glue_algorithm_impl.h 
b/libstdc++-v3/include/pstl/glue_algorithm_impl.h
index 379de4033ec..d2e30529f78 100644
--- a/libstdc++-v3/include/pstl/glue_algorithm_impl.h
+++ b/libstdc++-v3/include/pstl/glue_algorithm_impl.h
@@ -757,8 +743,7 @@ 
__pstl::__internal::__enable_if_execution_policy<_ExecutionPolicy, bool>
equal(_ExecutionPolicy&& __exec, _ForwardIterator1 __first1, _ForwardIterator1 
__last1, _ForwardIterator2 __first2,
  _ForwardIterator2 __last2)
{
-return std::equal(std::forward<_ExecutionPolicy>(__exec), __first1, 
__last1, __first2, __last2,
-  __pstl::__internal::__pstl_equal());
+return equal(std::forward<_ExecutionPolicy>(__exec), __first1, __last1, 
__first2, __last2, std::equal_to<>());


Any idea why this is now called unqualified? I don't think we want ADL
here.



diff --git a/libstdc++-v3/include/pstl/parallel_backend_tbb.h 
b/libstdc++-v3/include/pstl/parallel_backend_tbb.h
index 9c05ade0532..4476486d548 100644
--- a/libstdc++-v3/include/pstl/parallel_backend_tbb.h
+++ b/libstdc++-v3/include/pstl/parallel_backend_tbb.h


This file is full of non-reserved names, like _root and _x_orig and
move_y_range.

Fixing those upstream might take a while though.




Re: [PATCH] Fix overflow handling in std::align

2020-09-21 Thread Glen Fernandes via Gcc-patches
On Mon, Sep 14, 2020 at 5:44 PM Thomas Rodgers  wrote:
> > On Sep 14, 2020, at 7:30 AM, Ville Voutilainen  wrote:
> >
> > On Mon, 14 Sep 2020 at 15:49, Glen Fernandes  wrote:
> >> Sounds like a good idea. Updated patch attached.
> >
> > Looks good to me.
>
> Agree.

Rebased patch on latest changes to bits/align.h.


Fix overflow handling in align

2020-09-20  Glen Joseph Fernandes  

* include/bits/align.h (align): Fix overflow handling.
* testsuite/20_util/align/3.cc: New tests.

Glen
commit f18840a2b03e927e296adef8b1a13fdf255e1828
Author: Glen Joseph Fernandes 
Date:   Mon Sep 14 01:21:27 2020 -0400

Fix overflow handling in align

2020-09-20  Glen Joseph Fernandes  

* include/bits/align.h (align): Fix overflow handling.
* testsuite/20_util/align/3.cc: New tests.

diff --git a/libstdc++-v3/ChangeLog b/libstdc++-v3/ChangeLog
index 28b66ccca7a..a26faef547e 100644
--- a/libstdc++-v3/ChangeLog
+++ b/libstdc++-v3/ChangeLog
@@ -1,3 +1,8 @@
+2020-09-20  Glen Joseph Fernandes  
+
+* include/bits/align.h (align): Fix overflow handling.
+* testsuite/20_util/align/3.cc: New tests.
+
 2020-09-20  Jonathan Wakely  
 
PR libstdc++/97101
diff --git a/libstdc++-v3/include/bits/align.h 
b/libstdc++-v3/include/bits/align.h
index faa92bec2f8..597b4103ed8 100644
--- a/libstdc++-v3/include/bits/align.h
+++ b/libstdc++-v3/include/bits/align.h
@@ -60,10 +60,12 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 inline void*
 align(size_t __align, size_t __size, void*& __ptr, size_t& __space) noexcept
 {
+  if (__space < __size)
+return nullptr;
   const auto __intptr = reinterpret_cast(__ptr);
   const auto __aligned = (__intptr - 1u + __align) & -__align;
   const auto __diff = __aligned - __intptr;
-  if ((__size + __diff) > __space)
+  if (__diff > (__space - __size))
 return nullptr;
   else
 {
diff --git a/libstdc++-v3/testsuite/20_util/align/3.cc 
b/libstdc++-v3/testsuite/20_util/align/3.cc
new file mode 100644
index 000..74116a59867
--- /dev/null
+++ b/libstdc++-v3/testsuite/20_util/align/3.cc
@@ -0,0 +1,53 @@
+// { dg-do run { target c++11 } }
+
+// 2020-09-20 Glen Joseph Fernandes 
+
+// Copyright (C) 2020 Free Software Foundation, Inc.
+//
+// This file is part of the GNU ISO C++ Library.  This library is free
+// software; you can redistribute it and/or modify it under the terms
+// of the GNU General Public License as published by the Free Software
+// Foundation; either version 3, or (at your option) any later
+// version.
+
+// This library is distributed in the hope that it will be useful, but
+// WITHOUT ANY WARRANTY; without even the implied warranty of
+// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+// General Public License for more details.
+
+// You should have received a copy of the GNU General Public License
+// along with this library; see the file COPYING3.  If not see
+// <http://www.gnu.org/licenses/>.
+
+// C++11 [ptr.align] (20.6.5): std::align
+
+#include <memory>
+#include <testsuite_hooks.h>
+
+void test01()
+{
+  void* p1 = reinterpret_cast<void*>(5);
+  void* p2 = p1;
+  std::size_t s1 = 3072;
+  std::size_t s2 = s1;
+  VERIFY(std::align(1024, static_cast<std::size_t>(-1), p1, s1) == nullptr);
+  VERIFY(p1 == p2);
+  VERIFY(s1 == s2);
+}
+
+void test02()
+{
+  void* p1 = reinterpret_cast<void*>(1);
+  void* p2 = p1;
+  std::size_t s1 = -1;
+  std::size_t s2 = s1;
+  VERIFY(std::align(2, static_cast<std::size_t>(-1), p1, s1) == nullptr);
+  VERIFY(p1 == p2);
+  VERIFY(s1 == s2);
+}
+
+int main()
+{
+  test01();
+  test02();
+}
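
The effect of the reordered comparison can be checked on plain integers. The following is a minimal sketch, assuming a hypothetical helper `check_fit` that models only the two conditions from the hunk above (it is not the libstdc++ implementation and does not touch real pointers).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-alone model of the two checks; it works on integers
// rather than real pointers so that the test values are easy to follow.
struct fit_result { bool old_check_fits; bool new_check_fits; };

inline fit_result
check_fit(std::uintptr_t addr, std::size_t align,
          std::size_t size, std::size_t space)
{
  assert(align != 0 && (align & (align - 1)) == 0); // must be a power of two

  // Round addr up to the next multiple of align.
  const std::uintptr_t aligned = (addr - 1u + align) & -align;
  const std::size_t diff = aligned - addr;

  // Old check: size + diff can wrap around and then look small.
  const bool old_fits = !((size + diff) > space);

  // New check: reject size > space first, so space - size cannot wrap.
  const bool new_fits = !(space < size) && !(diff > (space - size));

  return {old_fits, new_fits};
}
```

With the values from test01 (address 5, alignment 1024, size equal to the largest `size_t`, space 3072), `size + diff` wraps to 1018, so the old condition wrongly reports a fit while the new one rejects the request.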


Re: [PATCH] Fix overflow handling in std::align

2020-09-21 Thread Jonathan Wakely via Gcc-patches

On 21/09/20 10:42 -0400, Glen Fernandes via Libstdc++ wrote:

On Mon, Sep 14, 2020 at 5:44 PM Thomas Rodgers  wrote:

> On Sep 14, 2020, at 7:30 AM, Ville Voutilainen  wrote:
>
> On Mon, 14 Sep 2020 at 15:49, Glen Fernandes  wrote:
>> Sounds like a good idea. Updated patch attached.
>
> Looks good to me.

Agree.


Rebased patch on latest changes to bits/align.h.


Oh nice, I was about to do that myself.

I'll get the patch committed today, thanks!



Fix overflow handling in align

2020-09-20  Glen Joseph Fernandes  

   * include/bits/align.h (align): Fix overflow handling.
   * testsuite/20_util/align/3.cc: New tests.

Glen



commit f18840a2b03e927e296adef8b1a13fdf255e1828
Author: Glen Joseph Fernandes 
Date:   Mon Sep 14 01:21:27 2020 -0400

   Fix overflow handling in align

   2020-09-20  Glen Joseph Fernandes  

   * include/bits/align.h (align): Fix overflow handling.
   * testsuite/20_util/align/3.cc: New tests.

diff --git a/libstdc++-v3/ChangeLog b/libstdc++-v3/ChangeLog
index 28b66ccca7a..a26faef547e 100644
--- a/libstdc++-v3/ChangeLog
+++ b/libstdc++-v3/ChangeLog
@@ -1,3 +1,8 @@
+2020-09-20  Glen Joseph Fernandes  
+
+* include/bits/align.h (align): Fix overflow handling.
+* testsuite/20_util/align/3.cc: New tests.
+
2020-09-20  Jonathan Wakely  

PR libstdc++/97101
diff --git a/libstdc++-v3/include/bits/align.h 
b/libstdc++-v3/include/bits/align.h
index faa92bec2f8..597b4103ed8 100644
--- a/libstdc++-v3/include/bits/align.h
+++ b/libstdc++-v3/include/bits/align.h
@@ -60,10 +60,12 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
inline void*
align(size_t __align, size_t __size, void*& __ptr, size_t& __space) noexcept
{
+  if (__space < __size)
+return nullptr;
  const auto __intptr = reinterpret_cast(__ptr);
  const auto __aligned = (__intptr - 1u + __align) & -__align;
  const auto __diff = __aligned - __intptr;
-  if ((__size + __diff) > __space)
+  if (__diff > (__space - __size))
return nullptr;
  else
{
diff --git a/libstdc++-v3/testsuite/20_util/align/3.cc 
b/libstdc++-v3/testsuite/20_util/align/3.cc
new file mode 100644
index 000..74116a59867
--- /dev/null
+++ b/libstdc++-v3/testsuite/20_util/align/3.cc
@@ -0,0 +1,53 @@
+// { dg-do run { target c++11 } }
+
+// 2020-09-20 Glen Joseph Fernandes 
+
+// Copyright (C) 2020 Free Software Foundation, Inc.
+//
+// This file is part of the GNU ISO C++ Library.  This library is free
+// software; you can redistribute it and/or modify it under the terms
+// of the GNU General Public License as published by the Free Software
+// Foundation; either version 3, or (at your option) any later
+// version.
+
+// This library is distributed in the hope that it will be useful, but
+// WITHOUT ANY WARRANTY; without even the implied warranty of
+// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+// General Public License for more details.
+
+// You should have received a copy of the GNU General Public License
+// along with this library; see the file COPYING3.  If not see
+// <http://www.gnu.org/licenses/>.
+
+// C++11 [ptr.align] (20.6.5): std::align
+
+#include <memory>
+#include <testsuite_hooks.h>
+
+void test01()
+{
+  void* p1 = reinterpret_cast<void*>(5);
+  void* p2 = p1;
+  std::size_t s1 = 3072;
+  std::size_t s2 = s1;
+  VERIFY(std::align(1024, static_cast<std::size_t>(-1), p1, s1) == nullptr);
+  VERIFY(p1 == p2);
+  VERIFY(s1 == s2);
+}
+
+void test02()
+{
+  void* p1 = reinterpret_cast<void*>(1);
+  void* p2 = p1;
+  std::size_t s1 = -1;
+  std::size_t s2 = s1;
+  VERIFY(std::align(2, static_cast<std::size_t>(-1), p1, s1) == nullptr);
+  VERIFY(p1 == p2);
+  VERIFY(s1 == s2);
+}
+
+int main()
+{
+  test01();
+  test02();
+}




Re: [PATCH] libstdc++: Mark some more algorithms constexpr for C++20

2020-09-21 Thread Jonathan Wakely via Gcc-patches

On 18/09/20 21:08 -0400, Patrick Palka via Libstdc++ wrote:

As per P0202.

Tested on x86_64-pc-linux-gnu.

libstdc++-v3/ChangeLog:

* include/bits/stl_algo.h (for_each_n): Mark constexpr for C++20.
(search): Likewise for the overload that takes a searcher.
* testsuite/25_algorithms/for_each/constexpr.cc: Test constexpr
std::for_each_n.
* testsuite/25_algorithms/search/constexpr.cc: Test constexpr
std::search overload that takes a searcher.
---
libstdc++-v3/include/bits/stl_algo.h |  2 ++
.../testsuite/25_algorithms/for_each/constexpr.cc| 12 
.../testsuite/25_algorithms/search/constexpr.cc  |  4 
3 files changed, 18 insertions(+)


OK, thanks.




[PATCH] tree-optimization/97139 - fix BB SLP live lane extraction

2020-09-21 Thread Richard Biener
This fixes SLP live lane extraction with pattern stmts.

Bootstrapped / tested on x86_64-unknown-linux-gnu, pushed.

2020-09-21  Richard Biener  

PR tree-optimization/97139
* tree-vect-slp.c (vect_bb_slp_mark_live_stmts): Only mark the
pattern root, track visited vectorized stmts.

* gcc.dg/vect/pr97139.c: New testcase.
---
 gcc/testsuite/gcc.dg/vect/pr97139.c | 27 +++
 gcc/tree-vect-slp.c | 10 +++---
 2 files changed, 34 insertions(+), 3 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/vect/pr97139.c

diff --git a/gcc/testsuite/gcc.dg/vect/pr97139.c 
b/gcc/testsuite/gcc.dg/vect/pr97139.c
new file mode 100644
index 000..1b9f31c7db3
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/pr97139.c
@@ -0,0 +1,27 @@
+/* { dg-require-effective-target vect_int } */
+
+#include "tree-vect.h"
+
+int pix[4];
+
+int __attribute__((noipa)) foo (void)
+{
+  pix[0] = pix[0] / 4;
+  pix[1] = pix[1] / 4;
+  pix[2] = pix[2] / 4;
+  pix[3] = pix[3] / 4;
+  return pix[0] + pix[1] + pix[2] + pix[3];
+}
+
+int main ()
+{
+  check_vect ();
+
+  pix[0] = 8;
+  pix[1] = 16;
+  pix[2] = 32;
+  pix[3] = 64;
+  if (foo () != 30)
+__builtin_abort ();
+  return 0;
+}
diff --git a/gcc/tree-vect-slp.c b/gcc/tree-vect-slp.c
index ef62c2dff2e..c44fd396bf0 100644
--- a/gcc/tree-vect-slp.c
+++ b/gcc/tree-vect-slp.c
@@ -3021,10 +3021,14 @@ vect_bb_slp_mark_live_stmts (bb_vec_info bb_vinfo, 
slp_tree node,
   bool all_visited = true;
   FOR_EACH_VEC_ELT (SLP_TREE_SCALAR_STMTS (node), i, stmt_info)
 {
-  stmt_vec_info orig_stmt_info = vect_orig_stmt (stmt_info);
-  if (svisited.contains (orig_stmt_info))
+  if (svisited.contains (stmt_info))
continue;
   all_visited = false;
+  stmt_vec_info orig_stmt_info = vect_orig_stmt (stmt_info);
+  if (STMT_VINFO_IN_PATTERN_P (orig_stmt_info)
+ && STMT_VINFO_RELATED_STMT (orig_stmt_info) != stmt_info)
+   /* Only the pattern root stmt computes the original scalar value.  */
+   continue;
   bool mark_visited = true;
   gimple *orig_stmt = orig_stmt_info->stmt;
   ssa_op_iter op_iter;
@@ -3091,7 +3095,7 @@ vect_bb_slp_mark_live_stmts (bb_vec_info bb_vinfo, 
slp_tree node,
}
}
   if (mark_visited)
-   svisited.add (orig_stmt_info);
+   svisited.add (stmt_info);
 }
   if (all_visited)
 return;
-- 
2.26.2


Re: [PATCH] libstdc++: Fix division by zero in std::sample

2020-09-21 Thread Jonathan Wakely via Gcc-patches

On 18/09/20 21:08 -0400, Patrick Palka via Libstdc++ wrote:

This fixes a division by zero in the selection-sampling std::__sample
overload when the input range is empty (and hence __unsampled_sz is 0).

Tested on x86_64-pc-linux-gnu.

libstdc++-v3/ChangeLog:

* include/bits/stl_algo.h (__sample): Exit early when the
input range is empty.
* testsuite/25_algorithms/sample/3.cc: New test.
---
libstdc++-v3/include/bits/stl_algo.h  |  3 ++
.../testsuite/25_algorithms/sample/3.cc   | 50 +++
2 files changed, 53 insertions(+)
create mode 100644 libstdc++-v3/testsuite/25_algorithms/sample/3.cc


OK, thanks.




Re: [PATCH] warn for integer overflow in allocation calls (PR 96838)

2020-09-21 Thread Martin Sebor via Gcc-patches

On 9/20/20 12:39 AM, Aldy Hernandez wrote:



On 9/19/20 11:22 PM, Martin Sebor wrote:

On 9/18/20 12:29 AM, Aldy Hernandez wrote:



On 9/17/20 10:18 PM, Martin Sebor wrote:

On 9/17/20 12:39 PM, Andrew MacLeod wrote:

On 9/17/20 12:08 PM, Martin Sebor via Gcc-patches wrote:

On 9/16/20 9:23 PM, Jeff Law wrote:


On 9/15/20 1:47 PM, Martin Sebor wrote:

Overflowing the size of a dynamic allocation (e.g., malloc or VLA)
can lead to a subsequent buffer overflow corrupting the heap or
stack.  The attached patch diagnoses a subset of these cases where
the overflow/wraparound is still detectable.

Besides regtesting GCC on x86_64-linux I also verified the warning
doesn't introduce any false positives into Glibc or Binutils/GDB
builds on the same target.

Martin

gcc-96838.diff

PR middle-end/96838 - missing warning on integer overflow in 
calls to allocation functions


gcc/ChangeLog:

PR middle-end/96838
* calls.c (eval_size_vflow): New function.
(get_size_range): Call it.  Add argument.
(maybe_warn_alloc_args_overflow): Diagnose overflow/wraparound.
* calls.h (get_size_range): Add argument.

gcc/testsuite/ChangeLog:

PR middle-end/96838
* gcc.dg/Walloc-size-larger-than-19.c: New test.
* gcc.dg/Walloc-size-larger-than-20.c: New test.


If an attacker can control an integer overflow that feeds an 
allocation, then they can do all kinds of bad things.  In fact, 
when my son was asking me attack vectors, this is one I said I'd 
look at if I were a bad guy.



I'm a bit surprised you can't just query the range of the 
argument and get the overflow status out of that range, but I 
don't see that in the APIs.  How painful would it be to make that 
part of the API? The conceptual model would be to just ask for 
the range of the argument to malloc which would include the range 
and a status bit indicating the computation might have overflowed.


  Do we know if it did/would have wrapped? sure.  since we have to 
do the math.    so you are correct in that the information is 
there. but is it useful?


We are in the very annoying habit of subtracting one by adding 
0xFFF, which means you get an overflow for unsigned when you 
subtract one.  From what I have seen of unsigned math, we would be 
flagging very many operations as overflows, so you would still have 
the difficulty of figuring out whether it's a "real" overflow or a 
fake one because of the way we do unsigned math.


You and me both :)



At the very start, I did have an overflow flag in the range 
class... but it was turning out to be fairly useless so it was 
removed.

.


I agree that being able to evaluate an expression in an as-if
infinite precision (in addition to its type) would be helpful.


So again, we get back to adding 0x0f when we are trying to 
subtract one...  now, with infinite precision you are going to see


  [2,10]  - 1  we end up with [2,10]+0xFF, which will now 
give you  [0x10001, 0x10009]    so its going to look like 
it overflowed?





But just to make sure I understood correctly, let me ask again
using an example:

  void* f (size_t n)
  {
    if (n < PTRDIFF_MAX / 2)
      n = PTRDIFF_MAX / 2;

    return malloc (n * sizeof (int));
  }

Can the unsigned wraparound in the argument be readily detected?

On trunk, this ends up with the following:

  # RANGE [4611686018427387903, 18446744073709551615]
  _6 = MAX_EXPR ;
  # RANGE [0, 18446744073709551615] NONZERO 18446744073709551612
  _1 = _6 * 4;
  ...
  p_5 = mallocD.1206 (_1); [tail call]
  ...
  return p_5;

so _1's range reflects the wraparound in size_t, but _6's range
has enough information to uncover it.  So detecting it is possible
and is done in the patch so we get a warning:

warning: argument 1 range [18446744073709551612, 
0x3fffc] is too large to represent in ‘long unsigned 
int’ [-Walloc-size-larger-than=]

    6 |   return malloc (n * sizeof (int));
  |  ^

The code is very simplistic and only handles a small subset of cases.
It could be generalized and exposed by a more generic API but it does
seem like the ranger must already have all the logic built into it so
if it isn't exposed now it should be a matter of opening it up.


everything is exposed in range-ops.  well, mostly.
if we have _1 = _6 * 4

if one wanted to do that infinite precision, you query the range 
for _6, and the range for 4 (which would be [4,4] :-)

range_of_expr (op1r, _6, stmt)
range_of_expr (op2r, 4, stmt)

you could take their current types, and cast those ranges to 
whatever the next higher precision is,

range_cast  (op1r, highertype)
range_cast (op2r, highertype)
then invoke the operation on those parameters

gimple_range_fold (r, stmt,  op1r, op2r)

and that will do your operation in the higher precision.  you could 
compare that to the value in regular precision too i suppose.
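
The recipe described above (query the operand ranges, cast them to a higher precision, redo the operation there, then compare against the regular precision) can be sketched outside of GCC. The following is only an illustration under stated assumptions: Range32, Range64, mul_in_higher_precision, may_wrap and always_wraps are hypothetical names, not ranger or range-ops APIs, and 32-bit values widened to 64 bits stand in for size_t widened to a wider type.

```cpp
#include <cstdint>
#include <limits>

// Hypothetical closed range [lo, hi] of 32-bit unsigned values.
// Illustration only; this is not GCC's range class.
struct Range32
{
  std::uint32_t lo;
  std::uint32_t hi;
};

// The same kind of range after a cast to the next higher precision.
struct Range64
{
  std::uint64_t lo;
  std::uint64_t hi;
};

// Widen both bounds to 64 bits and multiply there.  A 32-bit value times
// a 32-bit constant always fits in 64 bits, so nothing wraps here.
constexpr Range64
mul_in_higher_precision (Range32 r, std::uint32_t k)
{
  return Range64 { std::uint64_t { r.lo } * k, std::uint64_t { r.hi } * k };
}

// True if at least one value of the wide result does not fit in 32 bits.
constexpr bool
may_wrap (Range64 w)
{
  return w.hi > std::numeric_limits<std::uint32_t>::max ();
}

// True if no value of the wide result fits in 32 bits.
constexpr bool
always_wraps (Range64 w)
{
  return w.lo > std::numeric_limits<std::uint32_t>::max ();
}

// 32-bit analogue of the malloc example: the operand range is
// [0x3fffffff, 0xffffffff] and the multiplier is 4 (assumed sizeof (int)).
constexpr Range64 product
  = mul_in_higher_precision (Range32 { 0x3fffffffu, 0xffffffffu }, 4);

static_assert (product.lo == 0xfffffffcull, "lower bound still fits");
static_assert (product.hi == 0x3fffffffcull, "upper bound needs more bits");
static_assert (may_wrap (product) && !always_wraps (product),
               "the multiplication wraps for some inputs only");
```

The wide result keeps the information that the narrow range [0, max] has lost, which is the same effect the quoted warning shows for the 64-bit case.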


The patch does pretty much exactly what you described, except in
offset_int, and only for a limited set of arit

Re: [PATCH] libstdc++: Rebase include/pstl to current upstream

2020-09-21 Thread Thomas Rodgers



> On Sep 21, 2020, at 7:40 AM, Jonathan Wakely  wrote:
> 
> On 15/09/20 20:35 -0700, Thomas Rodgers wrote:
>> From: Thomas Rodgers 
>> 
>> From llvm-project/pstl @ 0b2e0e80d96
>> 
>> libstdc++-v3/ChangeLog:
>> 
>>  * include/pstl/algorithm_impl.h: Update file.
>>  * include/pstl/execution_impl.h: Likewise.
>>  * include/pstl/glue_algorithm_impl.h: Likewise.
>>  * include/pstl/glue_memory_impl.h: Likewise.
>>  * include/pstl/glue_numeric_impl.h: Likewise.
>>  * include/pstl/memory_impl.h: Likewise.
>>  * include/pstl/numeric_impl.h: Likewise.
>>  * include/pstl/parallel_backend.h: Likewise.
>>  * include/pstl/parallel_backend_serial.h: Likewise.
>>  * include/pstl/parallel_backend_tbb.h: Likewise.
>>  * include/pstl/parallel_backend_utils.h: Likewise.
>>  * include/pstl/pstl_config.h: Likewise.
>>  * include/pstl/unseq_backend_simd.h: Likewise.
>> ---
>> libstdc++-v3/include/pstl/algorithm_impl.h| 181 ++--
>> libstdc++-v3/include/pstl/execution_impl.h|   4 +-
>> .../include/pstl/glue_algorithm_impl.h| 543 +--
>> libstdc++-v3/include/pstl/glue_memory_impl.h  | 264 ++---
>> libstdc++-v3/include/pstl/glue_numeric_impl.h |  68 +-
>> libstdc++-v3/include/pstl/memory_impl.h   |  67 +-
>> libstdc++-v3/include/pstl/numeric_impl.h  |   8 +-
>> libstdc++-v3/include/pstl/parallel_backend.h  |   8 +
>> .../include/pstl/parallel_backend_serial.h|   8 +-
>> .../include/pstl/parallel_backend_tbb.h   | 903 +++---
>> .../include/pstl/parallel_backend_utils.h | 248 +++--
>> libstdc++-v3/include/pstl/pstl_config.h   |  24 +-
>> .../include/pstl/unseq_backend_simd.h |  39 +-
>> 13 files changed, 1586 insertions(+), 779 deletions(-)
>> 
>> diff --git a/libstdc++-v3/include/pstl/glue_algorithm_impl.h 
>> b/libstdc++-v3/include/pstl/glue_algorithm_impl.h
>> index 379de4033ec..d2e30529f78 100644
>> --- a/libstdc++-v3/include/pstl/glue_algorithm_impl.h
>> +++ b/libstdc++-v3/include/pstl/glue_algorithm_impl.h
>> @@ -757,8 +743,7 @@ 
>> __pstl::__internal::__enable_if_execution_policy<_ExecutionPolicy, bool>
>> equal(_ExecutionPolicy&& __exec, _ForwardIterator1 __first1, 
>> _ForwardIterator1 __last1, _ForwardIterator2 __first2,
>>  _ForwardIterator2 __last2)
>> {
>> -return std::equal(std::forward<_ExecutionPolicy>(__exec), __first1, 
>> __last1, __first2, __last2,
>> -  __pstl::__internal::__pstl_equal());
>> +return equal(std::forward<_ExecutionPolicy>(__exec), __first1, __last1, 
>> __first2, __last2, std::equal_to<>());
> 
> Any idea why this is now called unqualified? I don't think we want ADL
> here.
> 
I’m sure it is related to ... 
> 
>> diff --git a/libstdc++-v3/include/pstl/parallel_backend_tbb.h 
>> b/libstdc++-v3/include/pstl/parallel_backend_tbb.h
>> index 9c05ade0532..4476486d548 100644
>> --- a/libstdc++-v3/include/pstl/parallel_backend_tbb.h
>> +++ b/libstdc++-v3/include/pstl/parallel_backend_tbb.h
> 
> This file is full of non-reserved names, like _root and _x_orig and
> move_y_range.
> 

The upstream authors not being sufficiently versed in thinking in terms of 
writing things up front to avoid the sort of issues that a stdlib requires of 
the code.
 
> Fixing those upstream might take a while though.

I have already started accumulating a set of patches for upstream which I’ll 
manage independently of getting this rebase into gcc.

Re: PING [Patch][Middle-end]Add -fzero-call-used-regs=[skip|used-gpr|all-gpr|used|all]

2020-09-21 Thread Richard Sandiford
Qing Zhao  writes:
> My major concern with the default implementation of the hook is:
>
> If a target has some special registers that should not be zeroed, and we do 
> not provide an overridden implementation for this target, then the default 
> implementation will generate incorrect code for this target. 

That's OK.  The default behaviour of hooks and macros often needs
to be corrected by target code.  For example, going back to some
of the macros and hooks we talked about earlier:

- EPILOGUE_USES by default returns false for all registers.
  This would be the wrong behaviour for any target that currently
  defines EPILOGUE_USES to something else.

- TARGET_HARD_REGNO_SCRATCH_OK by default returns true for all registers.
  This would be the wrong behaviour for any target that currently defines
  the hook to do something else.

And in general, if there's a target-specific reason that something
has to be different from normal, it's better where possible to expose
the underlying concept that makes that different behaviour necessary,
rather than expose the downstream effects of that concept.  For example,
IMO it's a historical mistake that targets that support interrupt
handlers need to change all of:

- TARGET_HARD_REGNO_SCRATCH_OK
- HARD_REGNO_RENAME_OK
- EPILOGUE_USES

to expose what is essentially one concept.  IMO we should instead
just expose the fact that certain functions have extra call-saved
registers.  (This is now possible with the function_abi stuff,
but most interrupt handler support predates that.)

So if there is some concept that prevents your new target hook being
correct for x86, I think we should try if possible to expose that
concept to target-independent code.  And in the case of stack registers,
that has already been done.

The same would apply to any other target for which the default turns out
not to be correct.

But in cases where there is no underlying concept that can sensibly
be extracted out, it's OK if targets need to override the default
to get correct behaviour.

Thanks,
Richard
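
The default-plus-override pattern discussed in this message can be sketched as follows. This is a minimal illustration, not the hook proposed in the patch: RegSet, ZeroRegsHook, TargetHooks and both implementations are hypothetical names, and a plain integer bit mask is assumed in place of GCC's HARD_REG_SET.

```cpp
#include <cstdint>

// Hypothetical register set: one bit per hard register.
using RegSet = std::uint32_t;

// Hypothetical hook signature: given the registers the pass would like to
// zero, return the registers that actually get zeroed.
using ZeroRegsHook = RegSet (*) (RegSet);

// Default implementation: zero everything that was requested.  This is
// right for most targets and wrong for targets with special registers.
constexpr RegSet
default_zero_call_used_regs (RegSet requested)
{
  return requested;
}

// Registers 6 and 7 of an imaginary target must never be zeroed.
constexpr RegSet special_regs = 0xc0u;

// Target override that corrects the default for the imaginary target.
constexpr RegSet
special_target_zero_call_used_regs (RegSet requested)
{
  return requested & ~special_regs;
}

// Table of hooks, one instance per target.
struct TargetHooks
{
  ZeroRegsHook zero_call_used_regs;
};

constexpr TargetHooks generic_target { default_zero_call_used_regs };
constexpr TargetHooks special_target { special_target_zero_call_used_regs };

static_assert (generic_target.zero_call_used_regs (0xffu) == 0xffu,
               "the default zeroes every requested register");
static_assert (special_target.zero_call_used_regs (0xffu) == 0x3fu,
               "the override leaves the special registers alone");
```

If the reason for the override is a concept that target-independent code can know about (as with stack registers), the message argues for handling it in the default instead, so that the override is not needed at all.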


Re: [RS6000] rs6000_rtx_costs cost IOR

2020-09-21 Thread Segher Boessenkool
Hi!

On Thu, Sep 17, 2020 at 01:12:19PM +0930, Alan Modra wrote:
> On Wed, Sep 16, 2020 at 07:02:06PM -0500, Segher Boessenkool wrote:
> > > +   /* Test both regs even though the one in the mask is
> > > +  constrained to be equal to the output.  Increasing
> > > +  cost may well result in rejecting an invalid insn
> > > +  earlier.  */
> > 
> > Is that ever actually useful?
> 
> Possibly not in this particular case, but I did see cases where
> invalid insns were rejected early by costing non-reg sub-expressions.

But does that ever change generated code?

This makes the compiler a lot harder to read and understand.  To the
point that such micro-optimisations make worthwhile optimisations hard
or impossible to do.


Segher


[PATCH] [arm] gcc.target/arm/cs*: Use dg-add-options arm_arch_v8_1m_main

2020-09-21 Thread Christophe Lyon via Gcc-patches
These testcases need thumb mode, which may not be the default.

Using dg-add-options arm_arch_v8_1m_main ensures that -mthumb is used
and makes the test pass in more configurations.

2020-09-21  Christophe Lyon  

gcc/testsuite/
* gcc.target/arm/csinc-1.c: Use dg-add-options
arm_arch_v8_1m_main.
* gcc.target/arm/csinv-1.c: Likewise.
* gcc.target/arm/csneg.c: Likewise.
---
 gcc/testsuite/gcc.target/arm/csinc-1.c | 3 ++-
 gcc/testsuite/gcc.target/arm/csinv-1.c | 3 ++-
 gcc/testsuite/gcc.target/arm/csneg.c   | 3 ++-
 3 files changed, 6 insertions(+), 3 deletions(-)

diff --git a/gcc/testsuite/gcc.target/arm/csinc-1.c 
b/gcc/testsuite/gcc.target/arm/csinc-1.c
index b992849..255e6e8 100644
--- a/gcc/testsuite/gcc.target/arm/csinc-1.c
+++ b/gcc/testsuite/gcc.target/arm/csinc-1.c
@@ -1,6 +1,7 @@
 /* { dg-do compile } */
 /* { dg-require-effective-target arm_arch_v8_1m_main_ok } */
-/* { dg-options "-O2 -march=armv8.1-m.main" } */
+/* { dg-options "-O2" } */
+/* { dg-add-options arm_arch_v8_1m_main } */
 
 int
 test_csinc32_condasn1(int w0, int w1, int w2, int w3)
diff --git a/gcc/testsuite/gcc.target/arm/csinv-1.c 
b/gcc/testsuite/gcc.target/arm/csinv-1.c
index 6b5383a..28450a4 100644
--- a/gcc/testsuite/gcc.target/arm/csinv-1.c
+++ b/gcc/testsuite/gcc.target/arm/csinv-1.c
@@ -1,6 +1,7 @@
 /* { dg-do compile } */
 /* { dg-require-effective-target arm_arch_v8_1m_main_ok } */
-/* { dg-options "-O2 -march=armv8.1-m.main" } */
+/* { dg-options "-O2" } */
+/* { dg-add-options arm_arch_v8_1m_main } */
 
 int
 test_csinv32_condasn1(int w0, int w1, int w2, int w3)
diff --git a/gcc/testsuite/gcc.target/arm/csneg.c 
b/gcc/testsuite/gcc.target/arm/csneg.c
index e486062..cf3df13 100644
--- a/gcc/testsuite/gcc.target/arm/csneg.c
+++ b/gcc/testsuite/gcc.target/arm/csneg.c
@@ -1,6 +1,7 @@
 /* { dg-do compile } */
 /* { dg-require-effective-target arm_arch_v8_1m_main_ok } */
-/* { dg-options "-O2 -march=armv8.1-m.main" } */
+/* { dg-options "-O2" } */
+/* { dg-add-options arm_arch_v8_1m_main } */
 
 int
 test_csneg32_condasn1(int w0, int w1, int w2, int w3)
-- 
2.7.4



Re: [RS6000] rotate and mask constants

2020-09-21 Thread Segher Boessenkool
On Tue, Sep 15, 2020 at 04:46:08PM +0930, Alan Modra wrote:
> On Tue, Sep 15, 2020 at 10:49:46AM +0930, Alan Modra wrote:
> > Implement more two insn constants.
> 
> And tests.  rot_cst1 checks the values generated, rot_cst2 checks
> instruction count.
> 
>   * gcc.target/powerpc/rot_cst.h,
>   * gcc.target/powerpc/rot_cst1.c,
>   * gcc.target/powerpc/rot_cst2.c: New tests.

> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/powerpc/rot_cst1.c
> @@ -0,0 +1,68 @@
> +/* { dg-do run { target lp64 } } */

This doesn't need lp64.

> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/powerpc/rot_cst2.c
> @@ -0,0 +1,6 @@
> +/* { dg-do compile { target lp64 } } */
> +/* { dg-options "-O2" } */
> +
> +#include "rot_cst.h"
> +
> +/* { dg-final { scan-assembler-times {(?n)^\s+[a-z]} 122 } } */

Please write (in comments) how many of each insn are expected, and
possibly for which function?  Also, bonus points if you make this work
for 32 bit as well (it is almost required even).


Segher


Re: [PATCH] libstdc++: Rebase include/pstl to current upstream

2020-09-21 Thread Jonathan Wakely via Gcc-patches

On 21/09/20 08:19 -0700, Thomas Rodgers wrote:




On Sep 21, 2020, at 7:40 AM, Jonathan Wakely  wrote:

On 15/09/20 20:35 -0700, Thomas Rodgers wrote:

From: Thomas Rodgers 

From llvm-project/pstl @ 0b2e0e80d96

libstdc++-v3/ChangeLog:

* include/pstl/algorithm_impl.h: Update file.
* include/pstl/execution_impl.h: Likewise.
* include/pstl/glue_algorithm_impl.h: Likewise.
* include/pstl/glue_memory_impl.h: Likewise.
* include/pstl/glue_numeric_impl.h: Likewise.
* include/pstl/memory_impl.h: Likewise.
* include/pstl/numeric_impl.h: Likewise.
* include/pstl/parallel_backend.h: Likewise.
* include/pstl/parallel_backend_serial.h: Likewise.
* include/pstl/parallel_backend_tbb.h: Likewise.
* include/pstl/parallel_backend_utils.h: Likewise.
* include/pstl/pstl_config.h: Likewise.
* include/pstl/unseq_backend_simd.h: Likewise.
---
libstdc++-v3/include/pstl/algorithm_impl.h| 181 ++--
libstdc++-v3/include/pstl/execution_impl.h|   4 +-
.../include/pstl/glue_algorithm_impl.h| 543 +--
libstdc++-v3/include/pstl/glue_memory_impl.h  | 264 ++---
libstdc++-v3/include/pstl/glue_numeric_impl.h |  68 +-
libstdc++-v3/include/pstl/memory_impl.h   |  67 +-
libstdc++-v3/include/pstl/numeric_impl.h  |   8 +-
libstdc++-v3/include/pstl/parallel_backend.h  |   8 +
.../include/pstl/parallel_backend_serial.h|   8 +-
.../include/pstl/parallel_backend_tbb.h   | 903 +++---
.../include/pstl/parallel_backend_utils.h | 248 +++--
libstdc++-v3/include/pstl/pstl_config.h   |  24 +-
.../include/pstl/unseq_backend_simd.h |  39 +-
13 files changed, 1586 insertions(+), 779 deletions(-)

diff --git a/libstdc++-v3/include/pstl/glue_algorithm_impl.h 
b/libstdc++-v3/include/pstl/glue_algorithm_impl.h
index 379de4033ec..d2e30529f78 100644
--- a/libstdc++-v3/include/pstl/glue_algorithm_impl.h
+++ b/libstdc++-v3/include/pstl/glue_algorithm_impl.h
@@ -757,8 +743,7 @@ 
__pstl::__internal::__enable_if_execution_policy<_ExecutionPolicy, bool>
equal(_ExecutionPolicy&& __exec, _ForwardIterator1 __first1, _ForwardIterator1 
__last1, _ForwardIterator2 __first2,
 _ForwardIterator2 __last2)
{
-return std::equal(std::forward<_ExecutionPolicy>(__exec), __first1, 
__last1, __first2, __last2,
-  __pstl::__internal::__pstl_equal());
+return equal(std::forward<_ExecutionPolicy>(__exec), __first1, __last1, 
__first2, __last2, std::equal_to<>());


Any idea why this is now called unqualified? I don't think we want ADL
here.


I’m sure it is related to ...



diff --git a/libstdc++-v3/include/pstl/parallel_backend_tbb.h 
b/libstdc++-v3/include/pstl/parallel_backend_tbb.h
index 9c05ade0532..4476486d548 100644
--- a/libstdc++-v3/include/pstl/parallel_backend_tbb.h
+++ b/libstdc++-v3/include/pstl/parallel_backend_tbb.h


This file is full of non-reserved names, like _root and _x_orig and
move_y_range.



The upstream authors not being sufficiently versed in thinking in terms of 
writing things up front to avoid the sort of issues that a stdlib requires of 
the code.


Fixing those upstream might take a while though.


I have already started accumulating a set of patches for upstream which I’ll 
manage independently of getting this rebase into gcc.


Here's a patch to fix
https://bugs.llvm.org/show_bug.cgi?id=47601
and
https://bugs.llvm.org/show_bug.cgi?id=47601
by essentially rewriting the entire file!


diff --git a/libstdc++-v3/include/pstl/execution_impl.h b/libstdc++-v3/include/pstl/execution_impl.h
index d53fd6ffd32..b3a8030fca3 100644
--- a/libstdc++-v3/include/pstl/execution_impl.h
+++ b/libstdc++-v3/include/pstl/execution_impl.h
@@ -22,45 +22,9 @@ namespace __internal
 
 using namespace __pstl::execution;
 
-/* predicate */
-
-template 
-std::false_type __lazy_and(_Tp, std::false_type)
-{
-return std::false_type{};
-};
-
-template 
-inline _Tp
-__lazy_and(_Tp __a, std::true_type)
-{
-return __a;
-}
-
-template 
-std::true_type __lazy_or(_Tp, std::true_type)
-{
-return std::true_type{};
-};
-
-template 
-inline _Tp
-__lazy_or(_Tp __a, std::false_type)
-{
-return __a;
-}
-
 /* iterator */
-template 
-struct __is_random_access_iterator
-{
-static constexpr bool value = __internal::__is_random_access_iterator<_IteratorType>::value &&
-  __internal::__is_random_access_iterator<_OtherIteratorTypes...>::value;
-typedef std::integral_constant type;
-};
-
 template 
-struct __is_random_access_iterator<_IteratorType>
+struct __is_random_access_iterator
 : std::is_same::iterator_category, std::random_access_iterator_tag>
 {
 };
@@ -74,86 +38,68 @@ struct __policy_traits
 template <>
 struct __policy_traits
 {
-typedef std::false_type allow_parallel;
-typedef std::false_type allow_unsequenced;
-typedef std::false_type allow_vector;
+typedef std::false_type __allow_parallel;
+typedef std::false_type

Re: PING [Patch][Middle-end]Add -fzero-call-used-regs=[skip|used-gpr|all-gpr|used|all]

2020-09-21 Thread Qing Zhao via Gcc-patches



> On Sep 21, 2020, at 10:35 AM, Richard Sandiford  
> wrote:
> 
> Qing Zhao  writes:
>> My major concern with the default implementation of the hook is:
>> 
>> If a target has some special registers that should not be zeroed, and we do 
>> not provide an overridden implementation for this target, then the default 
>> implementation will generate incorrect code for this target. 
> 
> That's OK.  The default behaviour of hooks and macros often needs
> to be corrected by target code.  For example, going back to some
> of the macros and hooks we talked about earlier:
> 
> - EPILOGUE_USES by default returns false for all registers.
>  This would be the wrong behaviour for any target that currently
>  defines EPILOGUE_USES to something else.
> 
> - TARGET_HARD_REGNO_SCRATCH_OK by default returns true for all registers.
>  This would be the wrong behaviour for any target that currently defines
>  the hook to do something else.
> 
> And in general, if there's a target-specific reason that something
> has to be different from normal, it's better where possible to expose
> the underlying concept that makes that different behaviour necessary,
> rather than expose the downstream effects of that concept.  For example,
> IMO it's a historical mistake that targets that support interrupt
> handlers need to change all of:
> 
> - TARGET_HARD_REGNO_SCRATCH_OK
> - HARD_REGNO_RENAME_OK
> - EPILOGUE_USES
> 
> to expose what is essentially one concept.  IMO we should instead
> just expose the fact that certain functions have extra call-saved
> registers.  (This is now possible with the function_abi stuff,
> but most interrupt handler support predates that.)
> 
> So if there is some concept that prevents your new target hook being
> correct for x86, I think we should try if possible to expose that
> concept to target-independent code.  And in the case of stack registers,
> that has already been done.

I will exclude “stack registers” in the middle end to see whether this can 
resolve the issue with X86. 
> 
> The same would apply to any other target for which the default turns out
> not to be correct.
> 
> But in cases where there is no underlying concept that can sensibly
> be extracted out, it's OK if targets need to override the default
> to get correct behaviour.

Then, on a target where the default code is not right and we haven’t provided 
an overridden implementation, what should we tell the end user?
The user might see the documentation for -fzero-call-used-regs in the GCC manual 
and try it on that specific target, where the default implementation is not 
correct.  How should we deal with this?

Qing
> 
> Thanks,
> Richard



Re: [PATCH] IBM Z: Try to make use of load-and-test instructions

2020-09-21 Thread Andreas Krebbel via Gcc-patches
On 18.09.20 13:10, Stefan Schulze Frielinghaus wrote:
> This patch enables a peephole2 optimization which transforms a load of
> constant zero into a temporary register which is then finally used to
> compare against a floating-point register of interest into a single load
> and test instruction.  However, the optimization is only applied if both
> registers are dead afterwards and if we test for (in)equality only.
> This is relaxed in case of fast math.
> 
> This is a follow up to PR88856.
> 
> Bootstrapped and regtested on IBM Z.
> 
> gcc/ChangeLog:
> 
>   * config/s390/s390.md ("*cmp_ccs_0", "*cmp_ccz_0",
>   "*cmp_ccs_0_fastmath"): Basically change "*cmp_ccs_0" into
>   "*cmp_ccz_0" and for fast math add "*cmp_ccs_0_fastmath".
> 
> gcc/testsuite/ChangeLog:
> 
>   * gcc.target/s390/load-and-test-fp-1.c: Change test to include all
>   possible combinations of dead/live registers and comparisons (equality,
>   relational).
>   * gcc.target/s390/load-and-test-fp-2.c: Same as load-and-test-fp-1.c
>   but for fast math.
>   * gcc.target/s390/load-and-test-fp.h: New test included by
>   load-and-test-fp-{1,2}.c.

Ok for mainline. Please see below for some comments.

Thanks!

Andreas

> ---
>  gcc/config/s390/s390.md   | 54 +++
>  .../gcc.target/s390/load-and-test-fp-1.c  | 19 +++
>  .../gcc.target/s390/load-and-test-fp-2.c  | 17 ++
>  .../gcc.target/s390/load-and-test-fp.h| 12 +
>  4 files changed, 67 insertions(+), 35 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/s390/load-and-test-fp.h
> 
> diff --git a/gcc/config/s390/s390.md b/gcc/config/s390/s390.md
> index 4c3e5400a2b..e591aa7c324 100644
> --- a/gcc/config/s390/s390.md
> +++ b/gcc/config/s390/s390.md
> @@ -1391,23 +1391,55 @@
>  ; (TF|DF|SF|TD|DD|SD) instructions
>  
>  
> -; FIXME: load and test instructions turn SNaN into QNaN what is not
> -; acceptable if the target will be used afterwards.  On the other hand
> -; they are quite convenient for implementing comparisons with 0.0. So
> -; try to enable them via splitter/peephole if the value isn't needed anymore.
> -; See testcases: load-and-test-fp-1.c and load-and-test-fp-2.c
> +; load and test instructions turn a signaling NaN into a quiet NaN.  Thus they
> +; may only be used if the target register is dead afterwards or if fast math
> +; is enabled.  The former is done via a peephole optimization.  Note, load and
> +; test instructions may only be used for (in)equality comparisons because
> +; relational comparisons must treat a quiet NaN like a signaling NaN which is
> +; not the case for load and test instructions.  For fast math insn
> +; "cmp_ccs_0_fastmath" applies.
> +; See testcases load-and-test-fp-{1,2}.c
> +
> +(define_peephole2
> +  [(set (match_operand:FP 0 "register_operand")
> + (match_operand:FP 1 "const0_operand"))
> +   (set (reg:CCZ CC_REGNUM)
> + (compare:CCZ (match_operand:FP 2 "register_operand")
> +  (match_operand:FP 3 "register_operand")))]
> +  "TARGET_HARD_FLOAT
> +   && FP_REG_P (operands[2])
> +   && REGNO (operands[0]) == REGNO (operands[3])
> +   && peep2_reg_dead_p (2, operands[0])
> +   && peep2_reg_dead_p (2, operands[2])"
> +  [(parallel
> +[(set (reg:CCZ CC_REGNUM)
> +   (match_op_dup 4 [(match_dup 2) (match_dup 1)]))
> + (clobber (match_dup 2))])]
> +  "operands[4] = gen_rtx_COMPARE (CCZmode, operands[2], operands[1]);")

Couldn't this be written as:

 [(parallel
[(set (reg:CCZ CC_REGNUM)
  (compare:CCZ (match_dup 2) (match_dup 1)))
 (clobber (match_dup 2))])])

>  
>  ; ltxbr, ltdbr, ltebr, ltxtr, ltdtr
> -(define_insn "*cmp_ccs_0"
> -  [(set (reg CC_REGNUM)
> - (compare (match_operand:FP 0 "register_operand"  "f")
> -  (match_operand:FP 1 "const0_operand""")))
> -   (clobber (match_operand:FP  2 "register_operand" "=0"))]
> -  "s390_match_ccmode(insn, CCSmode) && TARGET_HARD_FLOAT"
> +(define_insn "*cmp_ccz_0"
> +  [(set (reg:CCZ CC_REGNUM)
> + (compare:CCZ (match_operand:FP 0 "register_operand" "f")
> +  (match_operand:FP 1 "const0_operand")))
> +   (clobber (match_operand:FP 2 "register_operand" "=0"))]
> +  "TARGET_HARD_FLOAT"
>"ltr\t%0,%0"
> [(set_attr "op_type" "RRE")
>  (set_attr "type"  "fsimp")])
>  
> +(define_insn "*cmp_ccs_0_fastmath"
> +  [(set (reg CC_REGNUM)
> + (compare (match_operand:FP 0 "register_operand" "f")
> +  (match_operand:FP 1 "const0_operand")))]
> +  "s390_match_ccmode (insn, CCSmode)
> +   && TARGET_HARD_FLOAT
> +   && !flag_trapping_math
> +   && !flag_signaling_nans"
> +  "ltr\t%0,%0"
> +  [(set_attr "op_type" "RRE")
> +   (set_attr "type" "fsimp")])
> +
>  ; VX: TFmode in FPR pairs: use cxbr instead of wfcxb
>  ; cxtr, cdtr, cxbr, cdbr, cebr, cdb, ceb, wfcsb, wfcdb
>  (define_insn "*cmp_ccs"
> diff --git a/gcc/testsuite/gcc.target/s390/load-and-test-fp-1.c 
> b/gcc/testsuite/gcc.target/

Re: [PATCH] gcov: fix TOPN streaming from shared libraries

2020-09-21 Thread Alexander Monakov via Gcc-patches
On Mon, 21 Sep 2020, Martin Liška wrote:

> On 9/6/20 1:24 PM, Sergei Trofimovich wrote:
> > From: Sergei Trofimovich 
> > 
> > Before the change gcc did not stream correctly TOPN counters
> > if counters belonged to a non-local shared object.
> > 
> > As a result zero-section optimization generated TOPN sections
> > in a form not recognizable by '__gcov_merge_topn'.
> > 
> > The problem happens because in a case of multiple shared objects
> > '__gcov_merge_topn' function is present in address space multiple
> > times (once per each object).
> > 
> > The fix is to never rely on function address and predicate on TOPN
> > counter types.
> 
> Hello.
> 
> Thank you for the analysis! I think it's the correct fix and it's probably
> similar to what we used to see for indirect_call_tuple.
> 
> @Alexander: Am I right?

Yes, analysis presented by Sergei in Bugzilla looks correct. Pedantically I
wouldn't say the indirect call issue was similar: it's a different gotcha
arising from mixing static and dynamic linking. There we had some symbols
preempted by the main executable (but not all symbols); here we have a lack
of preemption/unification as the relevant libgcov symbol is hidden.

I cannot judge if the fix is correct (don't know the code that well) but it
looks reasonable. If you could come up with a clearer wording for the new
comment it would be nice, I struggled to understand it.

Thanks.
Alexander


Re: [PATCH] libstdc++: Rebase include/pstl to current upstream

2020-09-21 Thread Thomas Rodgers via Gcc-patches


Thanks, I'll apply locally (and sync those changes to the patch I'm
preparing for upstream).

Jonathan Wakely writes:

> On 21/09/20 08:19 -0700, Thomas Rodgers wrote:
>>
>>
>>> On Sep 21, 2020, at 7:40 AM, Jonathan Wakely  wrote:
>>>
>>> On 15/09/20 20:35 -0700, Thomas Rodgers wrote:
 From: Thomas Rodgers 

 From llvm-project/pstl @ 0b2e0e80d96

 libstdc++-v3/ChangeLog:

* include/pstl/algorithm_impl.h: Update file.
* include/pstl/execution_impl.h: Likewise.
* include/pstl/glue_algorithm_impl.h: Likewise.
* include/pstl/glue_memory_impl.h: Likewise.
* include/pstl/glue_numeric_impl.h: Likewise.
* include/pstl/memory_impl.h: Likewise.
* include/pstl/numeric_impl.h: Likewise.
* include/pstl/parallel_backend.h: Likewise.
* include/pstl/parallel_backend_serial.h: Likewise.
* include/pstl/parallel_backend_tbb.h: Likewise.
* include/pstl/parallel_backend_utils.h: Likewise.
* include/pstl/pstl_config.h: Likewise.
* include/pstl/unseq_backend_simd.h: Likewise.
 ---
 libstdc++-v3/include/pstl/algorithm_impl.h| 181 ++--
 libstdc++-v3/include/pstl/execution_impl.h|   4 +-
 .../include/pstl/glue_algorithm_impl.h| 543 +--
 libstdc++-v3/include/pstl/glue_memory_impl.h  | 264 ++---
 libstdc++-v3/include/pstl/glue_numeric_impl.h |  68 +-
 libstdc++-v3/include/pstl/memory_impl.h   |  67 +-
 libstdc++-v3/include/pstl/numeric_impl.h  |   8 +-
 libstdc++-v3/include/pstl/parallel_backend.h  |   8 +
 .../include/pstl/parallel_backend_serial.h|   8 +-
 .../include/pstl/parallel_backend_tbb.h   | 903 +++---
 .../include/pstl/parallel_backend_utils.h | 248 +++--
 libstdc++-v3/include/pstl/pstl_config.h   |  24 +-
 .../include/pstl/unseq_backend_simd.h |  39 +-
 13 files changed, 1586 insertions(+), 779 deletions(-)

 diff --git a/libstdc++-v3/include/pstl/glue_algorithm_impl.h 
 b/libstdc++-v3/include/pstl/glue_algorithm_impl.h
 index 379de4033ec..d2e30529f78 100644
 --- a/libstdc++-v3/include/pstl/glue_algorithm_impl.h
 +++ b/libstdc++-v3/include/pstl/glue_algorithm_impl.h
 @@ -757,8 +743,7 @@ 
 __pstl::__internal::__enable_if_execution_policy<_ExecutionPolicy, bool>
 equal(_ExecutionPolicy&& __exec, _ForwardIterator1 __first1, 
 _ForwardIterator1 __last1, _ForwardIterator2 __first2,
  _ForwardIterator2 __last2)
 {
 -return std::equal(std::forward<_ExecutionPolicy>(__exec), __first1, 
 __last1, __first2, __last2,
 -  __pstl::__internal::__pstl_equal());
 +return equal(std::forward<_ExecutionPolicy>(__exec), __first1, 
 __last1, __first2, __last2, std::equal_to<>());
>>>
>>> Any idea why this is now called unqualified? I don't think we want ADL
>>> here.
>>>
>>I’m sure it is related to ...
>>>
 diff --git a/libstdc++-v3/include/pstl/parallel_backend_tbb.h 
 b/libstdc++-v3/include/pstl/parallel_backend_tbb.h
 index 9c05ade0532..4476486d548 100644
 --- a/libstdc++-v3/include/pstl/parallel_backend_tbb.h
 +++ b/libstdc++-v3/include/pstl/parallel_backend_tbb.h
>>>
>>> This file is full of non-reserved names, like _root and _x_orig and
>>> move_y_range.
>>>
>>
>>The upstream authors not being sufficiently versed in thinking in terms of 
>>writing things up front to avoid the sort of issues that a stdlib requires of 
>>the code.
>>
>>> Fixing those upstream might take a while though.
>>
>>I have already started accumulating a set of patches for upstream which I’ll 
>>manage independently of getting this rebase into gcc.
>
> Here's a patch to fix
> https://bugs.llvm.org/show_bug.cgi?id=47601
> and
> https://bugs.llvm.org/show_bug.cgi?id=47601
> by essentially rewriting the entire file!



[r11-3315 Regression] FAIL: g++.dg/ext/timevar2.C -std=gnu++98 (test for excess errors) on Linux/x86_64 (-m64 -march=cascadelake)

2020-09-21 Thread sunil.k.pandey via Gcc-patches
On Linux/x86_64,

79f4e20dd1280e6a44736070b0d5213f9a8f85d4 is the first bad commit
commit 79f4e20dd1280e6a44736070b0d5213f9a8f85d4
Author: Martin Liska 
Date:   Wed Sep 2 14:30:16 2020 +0200

Use SIZE_AMOUNT macro for GGC memory allocation numbers.

caused

FAIL: g++.dg/ext/timevar2.C  -std=gnu++14 (test for excess errors)
FAIL: g++.dg/ext/timevar2.C  -std=gnu++98 (test for excess errors)

with GCC configured with

../../gcc/configure 
--prefix=/local/skpandey/gccwork/toolwork/gcc-bisect-master/master/r11-3315/usr 
--enable-clocale=gnu --with-system-zlib --with-demangler-in-ld 
--with-fpmath=sse --enable-languages=c,c++,fortran --enable-cet --without-isl 
--enable-libmpx x86_64-linux --disable-bootstrap

To reproduce:

$ cd {build_dir}/gcc && make check RUNTESTFLAGS="dg.exp=g++.dg/ext/timevar2.C 
--target_board='unix{-m64\ -march=cascadelake}'"

(Please do not reply to this email; for questions about this report, contact me 
at skpgkp2 at gmail dot com)


[r11-3315 Regression] FAIL: g++.dg/ext/timevar2.C -std=gnu++98 (test for excess errors) on Linux/x86_64 (-m64)

2020-09-21 Thread sunil.k.pandey via Gcc-patches
On Linux/x86_64,

79f4e20dd1280e6a44736070b0d5213f9a8f85d4 is the first bad commit
commit 79f4e20dd1280e6a44736070b0d5213f9a8f85d4
Author: Martin Liska 
Date:   Wed Sep 2 14:30:16 2020 +0200

Use SIZE_AMOUNT macro for GGC memory allocation numbers.

caused

FAIL: g++.dg/ext/timevar2.C  -std=gnu++17 (test for excess errors)
FAIL: g++.dg/ext/timevar2.C  -std=gnu++98 (test for excess errors)

with GCC configured with

../../gcc/configure 
--prefix=/local/skpandey/gccwork/toolwork/gcc-bisect-master/master/r11-3315/usr 
--enable-clocale=gnu --with-system-zlib --with-demangler-in-ld 
--with-fpmath=sse --enable-languages=c,c++,fortran --enable-cet --without-isl 
--enable-libmpx x86_64-linux --disable-bootstrap

To reproduce:

$ cd {build_dir}/gcc && make check RUNTESTFLAGS="dg.exp=g++.dg/ext/timevar2.C 
--target_board='unix{-m64}'"

(Please do not reply to this email; for questions about this report, contact me 
at skpgkp2 at gmail dot com)


Re: [PING 2][PATCH 2/5] C front end support to detect out-of-bounds accesses to array parameters

2020-09-21 Thread Vaseeharan Vinayagamoorthy
After this patch, I am seeing this -Warray-parameter error:

In file included from ../include/pthread.h:1,
 from ../sysdeps/nptl/thread_db.h:25,
 from ../nptl/descr.h:32,
 from ../sysdeps/aarch64/nptl/tls.h:44,
 from ../include/errno.h:25,
 from ../sysdeps/unix/sysv/linux/sysdep.h:23,
 from ../sysdeps/unix/sysv/linux/generic/sysdep.h:22,
 from ../sysdeps/unix/sysv/linux/aarch64/sysdep.h:24,
 from :1:
../sysdeps/nptl/pthread.h:734:47: error: argument 1 of type ‘struct 
__jmp_buf_tag *’ declared as a pointer [-Werror=array-parameter=]
  734 | extern int __sigsetjmp (struct __jmp_buf_tag *__env, int __savemask) 
__THROWNL;
  | ~~^
In file included from ../include/setjmp.h:2,
 from ../nptl/descr.h:24,
 from ../sysdeps/aarch64/nptl/tls.h:44,
 from ../include/errno.h:25,
 from ../sysdeps/unix/sysv/linux/sysdep.h:23,
 from ../sysdeps/unix/sysv/linux/generic/sysdep.h:22,
 from ../sysdeps/unix/sysv/linux/aarch64/sysdep.h:24,
 from :1:
../setjmp/setjmp.h:54:46: note: previously declared as an array ‘struct 
__jmp_buf_tag[1]’
   54 | extern int __sigsetjmp (struct __jmp_buf_tag __env[1], int __savemask) 
__THROWNL;
  | ~^~~~
cc1: all warnings being treated as errors


The build/host/target setup is:
Build: x86_64-linux-gnu (Ubuntu 18.04)
Host: x86_64-linux-gnu
Target: aarch64-none-linux-gnu, aarch64_be-none-linux-gnu, 
arm-none-linux-gnueabi, arm-none-linux-gnueabihf



Kind regards
Vasee



On 20/09/2020, 01:02, "Gcc-patches on behalf of Martin Sebor via Gcc-patches" 
 wrote:

On 9/17/20 4:38 PM, Joseph Myers wrote:
> On Wed, 16 Sep 2020, Martin Sebor via Gcc-patches wrote:
> 
>> Attached is an updated revision of the patch.  Besides the tweaks
>> above it also contains a cosmetic change to the warning issued
>> for mismatches in unspecified VLA bounds: it points at the decl
>> with more of them to guide the user to specify them rather than
>> make them all unspecified.
> 
> The previous version of the patch had a while loop as previously discussed
> to handle skipping multiple consecutive cdk_attrs.
> 
> +  next = pd->declarator;
> +  while (next && next->kind == cdk_attrs)
> +   next = next->declarator;
> 
> This version is back to an "if", but I don't see anything else in the new
> version of that function that actually means the "if" would skip multiple
> consecutive cdk_attrs as desired.
> 
> The patch is OK with the "while" restored there.  If for some reason the
> "while" breaks something, we'll need to look in more detail at exactly
> what case isn't being handled correctly by "while".

I guess it was the result of an experiment, trying to see if I could
break it with the 'if'.  I (hope I) put it back and pushed the whole
series.  I had to squash patches 1 and 2 because of a dependency that
I had missed.
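
To make the "if" versus "while" point concrete, here is a minimal sketch.
The declarator struct and both helper functions are hypothetical stand-ins,
not the real C front end types; only the cdk_attrs name and the shape of the
loop come from the quoted patch.

```cpp
#include <cassert>

// Hypothetical stand-ins for the front end's declarator chain.
enum declarator_kind { cdk_id, cdk_array, cdk_attrs };

struct declarator {
  declarator_kind kind;
  declarator *next;   // plays the role of ->declarator in the patch
};

// Loop form: skips any number of consecutive cdk_attrs nodes.
inline declarator *skip_attrs_while(declarator *d) {
  while (d && d->kind == cdk_attrs)
    d = d->next;
  return d;
}

// "if" form: skips at most one cdk_attrs node.
inline declarator *skip_attrs_if(declarator *d) {
  if (d && d->kind == cdk_attrs)
    d = d->next;
  return d;
}
```

With two attribute nodes in a row, the loop form lands on the node after
both of them, while the "if" form stops on the second attribute node.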

Thanks for the review, by the way.  I think the signature validation
we've ended up with is quite a bit more comprehensive than the first
attempt.

Martin



[pushed] Darwin, testsuite : Skip a test that requires ELF.

2020-09-21 Thread Iain Sandoe
Hi,

The symver support is only available to ELF targets.

tested on x86_64-darwin16,
pushed to master
thanks
Iain

gcc/testsuite/ChangeLog:

* gcc.dg/ipa/symver1.c: Skip for Darwin.
---
 gcc/testsuite/gcc.dg/ipa/symver1.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/gcc/testsuite/gcc.dg/ipa/symver1.c 
b/gcc/testsuite/gcc.dg/ipa/symver1.c
index 645de7ea259..fca52202cba 100644
--- a/gcc/testsuite/gcc.dg/ipa/symver1.c
+++ b/gcc/testsuite/gcc.dg/ipa/symver1.c
@@ -1,4 +1,5 @@
 /* { dg-do compile } */
+/* { dg-skip-if "only works for ELF targets" { *-*-darwin* } } */
 
 __attribute__ ((__symver__ ("foo@VER_2")))
 __attribute__ ((__symver__ ("foo@VER_3")))
-- 
2.24.1




Re: [PING 2][PATCH 2/5] C front end support to detect out-of-bounds accesses to array parameters

2020-09-21 Thread Martin Sebor via Gcc-patches

On 9/21/20 12:20 PM, Vaseeharan Vinayagamoorthy wrote:

After this patch, I am seeing this -Warray-parameter error:

In file included from ../include/pthread.h:1,
  from ../sysdeps/nptl/thread_db.h:25,
  from ../nptl/descr.h:32,
  from ../sysdeps/aarch64/nptl/tls.h:44,
  from ../include/errno.h:25,
  from ../sysdeps/unix/sysv/linux/sysdep.h:23,
  from ../sysdeps/unix/sysv/linux/generic/sysdep.h:22,
  from ../sysdeps/unix/sysv/linux/aarch64/sysdep.h:24,
  from :1:
../sysdeps/nptl/pthread.h:734:47: error: argument 1 of type ‘struct 
__jmp_buf_tag *’ declared as a pointer [-Werror=array-parameter=]
   734 | extern int __sigsetjmp (struct __jmp_buf_tag *__env, int __savemask) 
__THROWNL;
   | ~~^
In file included from ../include/setjmp.h:2,
  from ../nptl/descr.h:24,
  from ../sysdeps/aarch64/nptl/tls.h:44,
  from ../include/errno.h:25,
  from ../sysdeps/unix/sysv/linux/sysdep.h:23,
  from ../sysdeps/unix/sysv/linux/generic/sysdep.h:22,
  from ../sysdeps/unix/sysv/linux/aarch64/sysdep.h:24,
  from :1:
../setjmp/setjmp.h:54:46: note: previously declared as an array ‘struct 
__jmp_buf_tag[1]’
54 | extern int __sigsetjmp (struct __jmp_buf_tag __env[1], int __savemask) 
__THROWNL;
   | ~^~~~
cc1: all warnings being treated as errors


The warning flags differences between the forms of array parameters
in redeclarations of the same function, including pointers vs arrays
as in this instance.  It needs to be suppressed in glibc, either by
making the function declarations consistent, or by #pragma GCC diagnostic.
(IIRC, the pointer declaration comes before struct __jmp_buf_tag has
been defined so simply using the array form there doesn't work without
defining the type first.)

I would expect the warning to be suppressed when using the installed
library thanks to -Wno-system-headers.
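
A minimal sketch of the situation and of the pragma-based suppression
follows.  The names buf_tag and save_mask are made up for illustration and
are not the glibc declarations; the code is written so that it also builds
as C++, although the warning in this thread was reported for C.  Compilers
that do not know -Warray-parameter may warn about the pragma itself.

```cpp
#include <cassert>

struct buf_tag;   // still incomplete here, as in the case described above

// Pointer form: the only spelling that works while buf_tag is incomplete.
int save_mask(struct buf_tag *env, int savemask);

struct buf_tag { int mask; };

// Array form of the same parameter.  Both lines declare the same function;
// the difference in spelling is what -Warray-parameter reports, so the
// warning is disabled just around the redeclaration.
#pragma GCC diagnostic push
#pragma GCC diagnostic ignored "-Warray-parameter"
int save_mask(struct buf_tag env[1], int savemask);
#pragma GCC diagnostic pop

int save_mask(struct buf_tag *env, int savemask) {
  env->mask = savemask;
  return 0;
}
```

The alternative mentioned above, making both declarations use the same
form, avoids the pragma entirely when the type can be defined first.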

Martin









[PATCH] libstdc++: Remove overzealous static_asserts from std::span

2020-09-21 Thread Patrick Palka via Gcc-patches
For a span with empty static extent, we currently model the
preconditions of front(), back(), and operator[] as if they were
mandates, by using a static_assert to verify that extent != 0.  This
causes us to incorrectly reject valid programs that instantiate these
member functions but never call them.
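
As an illustration of "instantiated but never called", here is a small
sketch using a hypothetical fixed_view type (not libstdc++ code).  Explicit
instantiation is one way for every member to be instantiated without any
call existing; a static_assert on the extent inside front() would reject
this program, whereas a run-time check does not.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a span with a static extent.
template <typename T, std::size_t N>
struct fixed_view {
  T *ptr;

  constexpr bool empty() const { return N == 0; }

  // The precondition is checked only when front() is actually called.
  // A static_assert(N != 0) here would instead fire as soon as the
  // member is instantiated for N == 0, even if it is never called.
  T &front() const {
    assert(!empty());
    return *ptr;
  }
};

// Explicit instantiation instantiates every member, including front(),
// yet this program contains no call to front() on an empty view.
template struct fixed_view<int, 0>;

// Guarded use: front() is named for N == 0 but never reached at run time.
template <std::size_t N>
int first_or(fixed_view<int, N> v, int fallback) {
  return v.empty() ? fallback : v.front();
}
```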

libstdc++-v3/ChangeLog:

* include/std/span (span::front): Remove static_assert.
(span::back): Likewise.
(span::operator[]): Likewise.
* testsuite/23_containers/span/back_neg.cc: Remove.
* testsuite/23_containers/span/front_neg.cc: Remove.
* testsuite/23_containers/span/index_op_neg.cc: Remove.
---
 libstdc++-v3/include/std/span |  3 --
 .../testsuite/23_containers/span/back_neg.cc  | 29 ---
 .../testsuite/23_containers/span/front_neg.cc | 29 ---
 .../23_containers/span/index_op_neg.cc| 29 ---
 4 files changed, 90 deletions(-)
 delete mode 100644 libstdc++-v3/testsuite/23_containers/span/back_neg.cc
 delete mode 100644 libstdc++-v3/testsuite/23_containers/span/front_neg.cc
 delete mode 100644 libstdc++-v3/testsuite/23_containers/span/index_op_neg.cc

diff --git a/libstdc++-v3/include/std/span b/libstdc++-v3/include/std/span
index f658adb04cf..1cdc0589ddb 100644
--- a/libstdc++-v3/include/std/span
+++ b/libstdc++-v3/include/std/span
@@ -264,7 +264,6 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   constexpr reference
   front() const noexcept
   {
-   static_assert(extent != 0);
__glibcxx_assert(!empty());
return *this->_M_ptr;
   }
@@ -272,7 +271,6 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   constexpr reference
   back() const noexcept
   {
-   static_assert(extent != 0);
__glibcxx_assert(!empty());
return *(this->_M_ptr + (size() - 1));
   }
@@ -280,7 +278,6 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   constexpr reference
   operator[](size_type __idx) const noexcept
   {
-   static_assert(extent != 0);
__glibcxx_assert(__idx < size());
return *(this->_M_ptr + __idx);
   }
diff --git a/libstdc++-v3/testsuite/23_containers/span/back_neg.cc 
b/libstdc++-v3/testsuite/23_containers/span/back_neg.cc
deleted file mode 100644
index c451ed10df8..000
--- a/libstdc++-v3/testsuite/23_containers/span/back_neg.cc
+++ /dev/null
@@ -1,29 +0,0 @@
-// Copyright (C) 2019-2020 Free Software Foundation, Inc.
-//
-// This file is part of the GNU ISO C++ Library.  This library is free
-// software; you can redistribute it and/or modify it under the
-// terms of the GNU General Public License as published by the
-// Free Software Foundation; either version 3, or (at your option)
-// any later version.
-
-// This library is distributed in the hope that it will be useful,
-// but WITHOUT ANY WARRANTY; without even the implied warranty of
-// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
-// GNU General Public License for more details.
-
-// You should have received a copy of the GNU General Public License along
-// with this library; see the file COPYING3.  If not see
-// <http://www.gnu.org/licenses/>.
-
-// { dg-options "-std=gnu++2a" }
-// { dg-do compile { target c++2a } }
-
-#include <span>
-
-void
-test01()
-{
-  std::span<int, 0> s;
-  s.back(); // { dg-error "here" }
-}
-// { dg-error "static assertion failed" "" { target *-*-* } 0 }
diff --git a/libstdc++-v3/testsuite/23_containers/span/front_neg.cc 
b/libstdc++-v3/testsuite/23_containers/span/front_neg.cc
deleted file mode 100644
index 38f87aa2cd5..000
--- a/libstdc++-v3/testsuite/23_containers/span/front_neg.cc
+++ /dev/null
@@ -1,29 +0,0 @@
-// Copyright (C) 2019-2020 Free Software Foundation, Inc.
-//
-// This file is part of the GNU ISO C++ Library.  This library is free
-// software; you can redistribute it and/or modify it under the
-// terms of the GNU General Public License as published by the
-// Free Software Foundation; either version 3, or (at your option)
-// any later version.
-
-// This library is distributed in the hope that it will be useful,
-// but WITHOUT ANY WARRANTY; without even the implied warranty of
-// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
-// GNU General Public License for more details.
-
-// You should have received a copy of the GNU General Public License along
-// with this library; see the file COPYING3.  If not see
-// <http://www.gnu.org/licenses/>.
-
-// { dg-options "-std=gnu++2a" }
-// { dg-do compile { target c++2a } }
-
-#include <span>
-
-void
-test01()
-{
-  std::span<int, 0> s;
-  s.front(); // { dg-error "here" }
-}
-// { dg-error "static assertion failed" "" { target *-*-* } 0 }
diff --git a/libstdc++-v3/testsuite/23_containers/span/index_op_neg.cc 
b/libstdc++-v3/testsuite/23_containers/span/index_op_neg.cc
deleted file mode 100644
index 1e8b2d8724e..000
--- a/libstdc++-v3/testsuite/23_containers/span/index_op_neg.cc
+++ /dev/null
@@ -1,29 +0,0 @@
-// Copyright (C) 2019-2020 Free Software F

c++: ts_lambda is not needed

2020-09-21 Thread Nathan Sidwell
I've been forced[*] to look at the bits of name-lookup I ran away from 
when reimplementing namespace-scope lookup at the beginning of this 
modules project.  Here's the first change in an expected series.


We don't need ts_lambda, as IDENTIFIER_LAMBDA_P is sufficient.  Killed 
thusly.


gcc/cp/
* decl.c (xref_tag_1): Use IDENTIFIER_LAMBDA_P to detect lambdas.
* lambda.c (begin_lambda_type): Use ts_current to push the tag.
* name-lookup.h (enum tag_scope): Drop ts_lambda.

pushed to trunk.

nathan

[*] I can only blame myself :)

--
Nathan Sidwell
diff --git i/gcc/cp/decl.c w/gcc/cp/decl.c
index af796499df7..bbecebe7a62 100644
--- i/gcc/cp/decl.c
+++ w/gcc/cp/decl.c
@@ -14857,10 +14857,10 @@ check_elaborated_type_specifier (enum tag_types tag_code,
   return type;
 }
 
-/* Lookup NAME in elaborate type specifier in scope according to
-   SCOPE and issue diagnostics if necessary.
-   Return *_TYPE node upon success, NULL_TREE when the NAME is not
-   found, and ERROR_MARK_NODE for type error.  */
+/* Lookup NAME of an elaborated type specifier according to SCOPE and
+   issue diagnostics if necessary.  Return *_TYPE node upon success,
+   NULL_TREE when the NAME is not found, and ERROR_MARK_NODE for type
+   error.  */
 
 static tree
 lookup_and_check_tag (enum tag_types tag_code, tree name,
@@ -14997,9 +14997,9 @@ xref_tag_1 (enum tag_types tag_code, tree name,
   /* In case of anonymous name, xref_tag is only called to
  make type node and push name.  Name lookup is not required.  */
   tree t = NULL_TREE;
-  if (scope != ts_lambda && !IDENTIFIER_ANON_P (name))
+  if (!IDENTIFIER_ANON_P (name))
 t = lookup_and_check_tag  (tag_code, name, scope, template_header_p);
-  
+
   if (t == error_mark_node)
 return error_mark_node;
 
@@ -15052,19 +15052,14 @@ xref_tag_1 (enum tag_types tag_code, tree name,
 	  error ("use of enum %q#D without previous declaration", name);
 	  return error_mark_node;
 	}
-  else
-	{
-	  t = make_class_type (code);
-	  TYPE_CONTEXT (t) = context;
-	  if (scope == ts_lambda)
-	{
-	  /* Mark it as a lambda type.  */
-	  CLASSTYPE_LAMBDA_EXPR (t) = error_mark_node;
-	  /* And push it into current scope.  */
-	  scope = ts_current;
-	}
-	  t = pushtag (name, t, scope);
-	}
+
+  t = make_class_type (code);
+  TYPE_CONTEXT (t) = context;
+  if (IDENTIFIER_LAMBDA_P (name))
+	/* Mark it as a lambda type right now.  Our caller will
+	   correct the value.  */
+	CLASSTYPE_LAMBDA_EXPR (t) = error_mark_node;
+  t = pushtag (name, t, scope);
 }
   else
 {
diff --git i/gcc/cp/lambda.c w/gcc/cp/lambda.c
index c94fe8edb8e..364a3e9f6b9 100644
--- i/gcc/cp/lambda.c
+++ w/gcc/cp/lambda.c
@@ -135,7 +135,7 @@ begin_lambda_type (tree lambda)
 
   /* Create the new RECORD_TYPE for this lambda.  */
   tree type = xref_tag (/*tag_code=*/record_type, name,
-			/*scope=*/ts_lambda, /*template_header_p=*/false);
+			/*scope=*/ts_current, /*template_header_p=*/false);
   if (type == error_mark_node)
 return error_mark_node;
 
diff --git i/gcc/cp/name-lookup.h w/gcc/cp/name-lookup.h
index 723fbb0008c..a0815e1a0ac 100644
--- i/gcc/cp/name-lookup.h
+++ w/gcc/cp/name-lookup.h
@@ -139,7 +139,6 @@ enum tag_scope {
 	   only, for friend class lookup
 	   according to [namespace.memdef]/3
 	   and [class.friend]/9.  */
-  ts_lambda = 3			/* Declaring a lambda closure.  */
 };
 
 struct GTY(()) cp_class_binding {


libsanitizer patch committed: Update for libbacktrace changes

2020-09-21 Thread Ian Lance Taylor via Gcc-patches
Recent changes to libbacktrace have introduced a few more global
symbols.  These then need to be renamed in the libsanitizer copy.
This patch does that.  Tested by configuring
--with-build-config=bootstrap-asan and running a bootstrap.  Committed
to mainline as obvious.
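
For readers unfamiliar with the renaming header, the mechanism can be
sketched as below.  The names helper_lookup and vendor_helper_lookup are
hypothetical; vendor_ merely stands in for the __asan_ prefix.  The #undef
is only there so both copies can be shown in one file; the real header does
not need it.

```cpp
#include <cassert>

// While this macro is in effect, every use of helper_lookup is rewritten
// by the preprocessor to the prefixed name.
#define helper_lookup vendor_helper_lookup

// This therefore defines vendor_helper_lookup, so the embedded copy cannot
// clash with another copy of helper_lookup at link time.
int helper_lookup(int key) { return key * 2; }

#undef helper_lookup

// The unprefixed name is still free; this could be the "other" copy of
// the library linked into the same program.
int helper_lookup(int key) { return key + 1; }
```

Any global symbol that is added to the library without a matching #define
keeps its plain name in the embedded copy, which is why the header needs
updating when new symbols appear.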

Ian

* libbacktrace/backtrace-rename.h (backtrace_uncompress_lzma):
Define.
(backtrace_syminfo_to_full_callback): Define.
(backtrace_syminfo_to_full_error_callback): Define.
diff --git a/libsanitizer/libbacktrace/backtrace-rename.h 
b/libsanitizer/libbacktrace/backtrace-rename.h
index 2555fe508c2..2ec37a3307f 100644
--- a/libsanitizer/libbacktrace/backtrace-rename.h
+++ b/libsanitizer/libbacktrace/backtrace-rename.h
@@ -11,10 +11,13 @@
 #define backtrace_qsort __asan_backtrace_qsort
 #define backtrace_release_view __asan_backtrace_release_view
 #define backtrace_syminfo __asan_backtrace_syminfo
+#define backtrace_uncompress_lzma __asan_backtrace_uncompress_lzma
 #define backtrace_uncompress_zdebug __asan_backtrace_uncompress_zdebug
 #define backtrace_vector_finish __asan_backtrace_vector_finish
 #define backtrace_vector_grow __asan_backtrace_vector_grow
 #define backtrace_vector_release __asan_backtrace_vector_release
+#define backtrace_syminfo_to_full_callback __asan_backtrace_syminfo_to_full_callback
+#define backtrace_syminfo_to_full_error_callback __asan_backtrace_syminfo_to_full_error_callback
 
 #define cplus_demangle_builtin_types __asan_cplus_demangle_builtin_types
 #define cplus_demangle_fill_ctor __asan_cplus_demangle_fill_ctor


Re: libsanitizer patch committed: Update for libbacktrace changes

2020-09-21 Thread Ian Lance Taylor via Gcc-patches
On Mon, Sep 21, 2020 at 12:04 PM Ian Lance Taylor  wrote:
>
> Recent changes to libbacktrace have introduced a few more global
> symbols.  These then need to be renamed in the libsanitizer copy.
> This patch does that.  Tested by configuring
> --with-build-config=bootstrap-asan and running a bootstrap.  Committed
> to mainline as obvious.
>
> Ian
>
> * libbacktrace/backtrace-rename.h (backtrace_uncompress_lzma):
> Define.
> (backtrace_syminfo_to_full_callback): Define.
> (backtrace_syminfo_to_full_error_callback): Define.


I forgot to mention that this fixes PR 97136.

Ian


Re: [PATCH] libstdc++: Remove overzealous static_asserts from std::span

2020-09-21 Thread Patrick Palka via Gcc-patches
On Mon, 21 Sep 2020, Patrick Palka wrote:

> For a span with empty static extent, we currently model the
> preconditions of front(), back(), and operator[] as if they were
> mandates, by using a static_assert to verify that extent != 0.  This
> causes us to incorrectly reject valid programs that instantiate these
> member functions but never call them.
> 
> libstdc++-v3/ChangeLog:
> 
>   * include/std/span (span::front): Remove static_assert.
>   (span::back): Likewise.
>   (span::operator[]): Likewise.
>   * testsuite/23_containers/span/back_neg.cc: Remove.
>   * testsuite/23_containers/span/front_neg.cc: Remove.
>   * testsuite/23_containers/span/index_op_neg.cc: Remove.

Here's a version that rewrites rather than removes the testcases:

-- >8 --

Subject: [PATCH] libstdc++: Remove overzealous static_asserts from std::span

For a span with empty static extent, we currently model the
preconditions of front(), back(), and operator[] as if they were
mandates, by using a static_assert to verify that extent != 0.  This
causes us to incorrectly reject valid programs that instantiate these
member functions but would never call them at runtime.
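
The shape of the rewritten testcases can be sketched with a hypothetical
tiny_view type (not libstdc++ code; it throws where libstdc++ would use its
own assertion macro).  The guarded call compiles and is accepted in a
constant expression as long as the precondition failure is never reached.
This assumes C++14 or later for the relaxed constexpr rules.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>

// Hypothetical view type; it only mimics the shape of the testcase.
template <typename T, std::size_t N>
struct tiny_view {
  const T *ptr;

  constexpr bool empty() const { return N == 0; }

  constexpr const T &back() const {
    // Run-time style precondition: reaching the throw during constant
    // evaluation makes the enclosing expression non-constant.
    if (empty())
      throw std::logic_error("back() on empty view");
    return ptr[N - 1];
  }
};

constexpr bool check_back(bool force) {
  tiny_view<int, 0> v{nullptr};
  if (force || !v.empty())
    v.back();
  return true;
}

// back() is instantiated for N == 0 but not evaluated, so this is accepted.
static_assert(check_back(false), "guarded call must be accepted");
// static_assert(check_back(true), "");  // rejected: not a constant expression
```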

libstdc++-v3/ChangeLog:

* include/std/span (span::front): Remove static_assert.
(span::back): Likewise.
(span::operator[]): Likewise.
* testsuite/23_containers/span/back_neg.cc: Rewrite to verify
that we check the preconditions of back() only when it's called.
* testsuite/23_containers/span/front_neg.cc: Likewise for
front().
* testsuite/23_containers/span/index_op_neg.cc: Likewise for
operator[].
---
 libstdc++-v3/include/std/span  |  3 ---
 .../testsuite/23_containers/span/back_neg.cc   | 14 ++
 .../testsuite/23_containers/span/front_neg.cc  | 14 ++
 .../testsuite/23_containers/span/index_op_neg.cc   | 14 ++
 4 files changed, 30 insertions(+), 15 deletions(-)

diff --git a/libstdc++-v3/include/std/span b/libstdc++-v3/include/std/span
index f658adb04cf..1cdc0589ddb 100644
--- a/libstdc++-v3/include/std/span
+++ b/libstdc++-v3/include/std/span
@@ -264,7 +264,6 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   constexpr reference
   front() const noexcept
   {
-   static_assert(extent != 0);
__glibcxx_assert(!empty());
return *this->_M_ptr;
   }
@@ -272,7 +271,6 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   constexpr reference
   back() const noexcept
   {
-   static_assert(extent != 0);
__glibcxx_assert(!empty());
return *(this->_M_ptr + (size() - 1));
   }
@@ -280,7 +278,6 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   constexpr reference
   operator[](size_type __idx) const noexcept
   {
-   static_assert(extent != 0);
__glibcxx_assert(__idx < size());
return *(this->_M_ptr + __idx);
   }
diff --git a/libstdc++-v3/testsuite/23_containers/span/back_neg.cc 
b/libstdc++-v3/testsuite/23_containers/span/back_neg.cc
index c451ed10df8..f777edfa20c 100644
--- a/libstdc++-v3/testsuite/23_containers/span/back_neg.cc
+++ b/libstdc++-v3/testsuite/23_containers/span/back_neg.cc
@@ -20,10 +20,16 @@
 
 #include <span>
 
-void
-test01()
+constexpr bool
+test01(bool b)
 {
   std::span<int, 0> s;
-  s.back(); // { dg-error "here" }
+  if (b || !s.empty())
+s.back();
+  return true;
 }
-// { dg-error "static assertion failed" "" { target *-*-* } 0 }
+
+static_assert(test01(false));
+static_assert(test01(true)); // { dg-error "non-constant" }
+// { dg-error "assert" "" { target *-*-* } 0 }
+// { dg-prune-output "in 'constexpr' expansion" }
diff --git a/libstdc++-v3/testsuite/23_containers/span/front_neg.cc 
b/libstdc++-v3/testsuite/23_containers/span/front_neg.cc
index 38f87aa2cd5..14e5bc1e100 100644
--- a/libstdc++-v3/testsuite/23_containers/span/front_neg.cc
+++ b/libstdc++-v3/testsuite/23_containers/span/front_neg.cc
@@ -20,10 +20,16 @@
 
 #include <span>
 
-void
-test01()
+constexpr bool
+test01(bool b)
 {
   std::span<int, 0> s;
-  s.front(); // { dg-error "here" }
+  if (b || !s.empty())
+s.front();
+  return true;
 }
-// { dg-error "static assertion failed" "" { target *-*-* } 0 }
+
+static_assert(test01(false));
+static_assert(test01(true)); // { dg-error "non-constant" }
+// { dg-error "assert" "" { target *-*-* } 0 }
+// { dg-prune-output "in 'constexpr' expansion" }
diff --git a/libstdc++-v3/testsuite/23_containers/span/index_op_neg.cc 
b/libstdc++-v3/testsuite/23_containers/span/index_op_neg.cc
index 1e8b2d8724e..6a3bb8834b4 100644
--- a/libstdc++-v3/testsuite/23_containers/span/index_op_neg.cc
+++ b/libstdc++-v3/testsuite/23_containers/span/index_op_neg.cc
@@ -20,10 +20,16 @@
 
 #include <span>
 
-void
-test01()
+constexpr bool
+test01(bool b)
 {
   std::span<int, 0> s;
-  s[99]; // { dg-error "here" }
+  if (b || !s.empty())
+s[99];
+  return true;
 }
-// { dg-error "static assertion failed" "" { target *-*-* } 0 }
+
+static_assert(test01(false));
+static_assert(tes

Re: PING [Patch][Middle-end]Add -fzero-call-used-regs=[skip|used-gpr|all-gpr|used|all]

2020-09-21 Thread Richard Sandiford
Qing Zhao  writes:
>> But in cases where there is no underlying concept that can sensibly
>> be extracted out, it's OK if targets need to override the default
>> to get correct behaviour.
>
> Then, on a target where the default code is not right and we haven’t 
> provided an overriding implementation, what should we tell the end user?
> The user might see the documentation for -fzero-call-used-regs in the GCC 
> manual and try it on that target, but the default implementation is not 
> correct there; how do we deal with this?

The point is that we're trying to implement this in a target-independent
way, like for most compiler features.  If the option doesn't work for a
particular target, then that's a bug like any other.  The most we can
reasonably do is:

(a) try to implement the feature in a way that uses all the appropriate
pieces of compiler infrastructure (what we've been discussing)

(b) add tests for the feature that run on all targets

It's possible that bugs could slip through even then.  But that's true
of anything.

Targets like x86 support many subtargets, many different compilation
modes, and many different compiler features (register asms, various
fancy function attributes, etc.).  So even after the option is
committed and is supposedly supported on x86, it's possible that
we'll find a bug in the feature on x86 itself.

I don't think anyone would suggest that we should warn the user that the
option might be buggy on x86 (its initial target).  But I also don't
see any reason for believing that a bug on x86 is less likely than
a bug on other targets.

Thanks,
Richard


Re: PING [Patch][Middle-end]Add -fzero-call-used-regs=[skip|used-gpr|all-gpr|used|all]

2020-09-21 Thread Qing Zhao via Gcc-patches



> On Sep 21, 2020, at 2:11 PM, Richard Sandiford  
> wrote:
> 
> Qing Zhao  writes:
>>> But in cases where there is no underlying concept that can sensibly
>>> be extracted out, it's OK if targets need to override the default
>>> to get correct behaviour.
>> 
>> Then, on a target where the default code is not right and we haven’t 
>> provided an overriding implementation, what should we tell the end user?
>> The user might see the documentation for -fzero-call-used-regs in the GCC 
>> manual and try it on that target, but the default implementation is not 
>> correct there; how do we deal with this?
> 
> The point is that we're trying to implement this in a target-independent
> way, like for most compiler features.  If the option doesn't work for a
> particular target, then that's a bug like any other.  The most we can
> reasonably do is:
> 
> (a) try to implement the feature in a way that uses all the appropriate
>pieces of compiler infrastructure (what we've been discussing)
> 
> (b) add tests for the feature that run on all targets
> 
> It's possible that bugs could slip through even then.  But that's true
> of anything.
> 
> Targets like x86 support many subtargets, many different compilation
> modes, and many different compiler features (register asms, various
> fancy function attributes, etc.).  So even after the option is
> committed and is supposedly supported on x86, it's possible that
> we'll find a bug in the feature on x86 itself.
> 
> I don't think anyone would suggest that we should warn the user that the
> option might be buggy on x86 (its initial target).  But I also don't
> see any reason for believing that a bug on x86 is less likely than
> a bug on other targets.

Okay, then I will add the default implementation as you suggested, and also 
provide the overridden, optimized implementation on x86.

Let me know if you have any further suggestions.

Qing
> 
> Thanks,
> Richard



Re: [r11-3315 Regression] FAIL: g++.dg/ext/timevar2.C -std=gnu++98 (test for excess errors) on Linux/x86_64 (-m64 -march=cascadelake)

2020-09-21 Thread Marek Polacek via Gcc-patches
On Mon, Sep 21, 2020 at 10:41:19AM -0700, sunil.k.pandey via Gcc-patches wrote:
> On Linux/x86_64,
> 
> 79f4e20dd1280e6a44736070b0d5213f9a8f85d4 is the first bad commit
> commit 79f4e20dd1280e6a44736070b0d5213f9a8f85d4
> Author: Martin Liska 
> Date:   Wed Sep 2 14:30:16 2020 +0200
> 
> Use SIZE_AMOUNT macro for GGC memory allocation numbers.
> 
> caused
> 
> FAIL: g++.dg/ext/timevar2.C  -std=gnu++14 (test for excess errors)
> FAIL: g++.dg/ext/timevar2.C  -std=gnu++98 (test for excess errors)

We should also prune N% from the output, I think.

Ok for trunk?  Tested by running timevar2.C a couple of dozen times.
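
To show what the new pattern does, here is a rough C++ sketch of pruning:
every output line that matches the regular expression is dropped before the
remaining text is checked.  The helper prune_lines is hypothetical and only
approximates the testsuite's behaviour; in the directive itself the
backslashes before the brackets are Tcl escaping, so the effective pattern
is [0-9]+%.

```cpp
#include <cassert>
#include <regex>
#include <sstream>
#include <string>

// Drop every line of OUTPUT that contains a match for PATTERN.
std::string prune_lines(const std::string &output, const std::string &pattern) {
  const std::regex re(pattern);
  std::istringstream in(output);
  std::string line, kept;
  while (std::getline(in, line))
    if (!std::regex_search(line, re))
      kept += line + '\n';
  return kept;
}
```

Lines that contain a percentage are removed, while a genuine diagnostic
without one survives and is still treated as an excess error.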

* g++.dg/ext/timevar2.C: Also prune N%.
---
 gcc/testsuite/g++.dg/ext/timevar2.C | 1 +
 1 file changed, 1 insertion(+)

diff --git a/gcc/testsuite/g++.dg/ext/timevar2.C 
b/gcc/testsuite/g++.dg/ext/timevar2.C
index 46c3e1b4794..7d3f1218314 100644
--- a/gcc/testsuite/g++.dg/ext/timevar2.C
+++ b/gcc/testsuite/g++.dg/ext/timevar2.C
@@ -4,6 +4,7 @@
 // { dg-prune-output "k" }
 // { dg-prune-output " 0 " }
 // { dg-prune-output "checks" }
+// { dg-prune-output "\[0-9\]+%" }
 
 namespace detail {
 namespace indirect_traits {}

base-commit: 762c16eba6b815090c56564a293cd059aea2e1d6
-- 
2.26.2



Re: [PATCH v2] c++: Implement -Wctad-maybe-unsupported.

2020-09-21 Thread Marek Polacek via Gcc-patches
On Mon, Sep 21, 2020 at 01:04:27AM -0400, Jason Merrill via Gcc-patches wrote:
> On 9/19/20 5:34 PM, Marek Polacek wrote:
> > I noticed that clang++ has this CTAD warning and thought that it might
> > be useful to have it.  From clang++: "Some style guides want to allow
> > using CTAD only on types that "opt-in"; i.e. on types that are designed
> > to support it and not just types that *happen* to work with it."
> 
> That's a weird name for the warning, but I guess if that's what clang calls
> it then we shouldn't change it.

Yes.  Naming is hard, but this seems like a particularly bad name.  I think
-Wctad-maybe-unintended would have been better.  But diverging would be worse
for users.

> > +  /* If CTAD succeeded but the type doesn't have any explicit deduction
> > + guides, this deduction might not be what the user intended.  */
> > +  if (call != error_mark_node
> > +  && !any_dguides_p
> > +  && warning (OPT_Wctad_maybe_unsupported,
> > + "%qT may not intend to support class template argument "
> > + "deduction", type))
> > +inform (input_location, "add a deduction guide to suppress this 
> > warning");
> 
> I think you want to avoid warning for types defined in a system header
> without -Wsystem-headers.

Ack, fixed.

> > +@item -Wctad-maybe-unsupported @r{(C++ and Objective-C++ only)}
> > +@opindex Wctad-maybe-unsupported
> > +@opindex Wno-ctad-maybe-unsupported
> > +Warn when performing class template argument deduction on a type with no
> > +deduction guides.  This warning will point out cases where CTAD succeeded
>^explicitly written
> > +only because the compiler synthesized the implicit deduction guides, which
> > +might not be what the programmer intended.  This warning can be suppressed
> > +with the following pattern:
> > +
> > +@smallexample
> > +struct allow_ctad_t;
> > +template <typename T> struct S @{
> > +  S(T) @{ @}
> > +@};
> > +S(allow_ctad_t) -> S<void>; // will never be considered
> > +@end smallexample
> 
> This should mention the style guide motivation, and clarify that the
> suppression doesn't require the "allow_ctad_t" name.

Fixed both things.

Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?

-- >8 --
I noticed that clang++ has this CTAD warning and thought that it might
be useful to have it.  From clang++: "Some style guides want to allow
using CTAD only on types that "opt-in"; i.e. on types that are designed
to support it and not just types that *happen* to work with it."

So this warning warns when CTAD deduced a type, but the type does not
define any deduction guides.  In that case CTAD worked only because the
compiler synthesized the implicit deduction guides.  That might not be
intended.

It can be suppressed by adding a deduction guide that will never be
considered:

  struct allow_ctad_t;
  template <typename T> struct S { S(T) {} };
  S(allow_ctad_t) -> S<void>;

This warning is off by default.  It doesn't warn when the type comes
from a system header unless -Wsystem-headers.
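
As a small illustration of the opt-in pattern described above, here is a
sketch that should compile as C++17.  The names NoGuide and OptIn are made up
for this example; allow_ctad_t is just the tag name used in the commit message
and any other incomplete type should work the same way.

```cpp
#include <type_traits>

// Tag type used only to write a deduction guide that can never be chosen.
// It is declared but deliberately never defined (the name is arbitrary).
struct allow_ctad_t;

// No explicitly written deduction guide: CTAD works only through the
// implicit guide synthesized from the constructor, which is the case
// -Wctad-maybe-unsupported is meant to point out.
template <typename T>
struct NoGuide {
  NoGuide(T) {}
};

// Same shape, but the author opts in by declaring one explicit guide.
// The guide itself is never selected because nothing converts to the
// incomplete type allow_ctad_t.
template <typename T>
struct OptIn {
  OptIn(T) {}
};
OptIn(allow_ctad_t) -> OptIn<void>;

// Deduction gives the same kind of result for both templates; only the
// diagnostic differs.
static_assert(std::is_same_v<decltype(NoGuide(1)), NoGuide<int>>);
static_assert(std::is_same_v<decltype(OptIn(1)), OptIn<int>>);
```

With the patch applied, compiling this with -Wctad-maybe-unsupported would be
expected to warn for the NoGuide deductions only.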

gcc/c-family/ChangeLog:

* c.opt (Wctad-maybe-unsupported): New option.

gcc/cp/ChangeLog:

* pt.c (deduction_guides_for): Add a bool parameter.  Set it.
(do_class_deduction): Warn when CTAD succeeds but the type doesn't
have any explicit deduction guides.

gcc/ChangeLog:

* doc/invoke.texi: Document -Wctad-maybe-unsupported.

gcc/testsuite/ChangeLog:

* g++.dg/warn/Wctad-maybe-unsupported.C: New test.
* g++.dg/warn/Wctad-maybe-unsupported2.C: New test.
* g++.dg/warn/Wctad-maybe-unsupported3.C: New test.
* g++.dg/warn/Wctad-maybe-unsupported.h: New file.
---
 gcc/c-family/c.opt|  5 ++
 gcc/cp/pt.c   | 28 +-
 gcc/doc/invoke.texi   | 22 -
 .../g++.dg/warn/Wctad-maybe-unsupported.C | 88 +++
 .../g++.dg/warn/Wctad-maybe-unsupported.h |  4 +
 .../g++.dg/warn/Wctad-maybe-unsupported2.C|  6 ++
 .../g++.dg/warn/Wctad-maybe-unsupported3.C|  6 ++
 7 files changed, 154 insertions(+), 5 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/warn/Wctad-maybe-unsupported.C
 create mode 100644 gcc/testsuite/g++.dg/warn/Wctad-maybe-unsupported.h
 create mode 100644 gcc/testsuite/g++.dg/warn/Wctad-maybe-unsupported2.C
 create mode 100644 gcc/testsuite/g++.dg/warn/Wctad-maybe-unsupported3.C

diff --git a/gcc/c-family/c.opt b/gcc/c-family/c.opt
index 7a61351bf84..da6c3e1a224 100644
--- a/gcc/c-family/c.opt
+++ b/gcc/c-family/c.opt
@@ -475,6 +475,11 @@ Wcpp
 C ObjC C++ ObjC++ CppReason(CPP_W_WARNING_DIRECTIVE)
 ; Documented in common.opt
 
+Wctad-maybe-unsupported
+C++ ObjC++ Var(warn_ctad_maybe_unsupported) Warning
+Warn when performing class template argument deduction on a type with no
+deduction guides.
+
 Wctor-dtor-privacy
 C++ ObjC++ Var(warn_ctor_dtor_privacy) Warning
 Warn when all constructors and destructors are private.
diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c
index fe45de8d796..97d0c245f7e

[PATCH] c++: DR 1722: Make lambda to function pointer conv noexcept [PR90583]

2020-09-21 Thread Marek Polacek via Gcc-patches
DR 1722 clarifies that the conversion function from lambda to pointer to
function should be noexcept(true).
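
For readers who want to check the behaviour on their own compiler, the
following is a minimal sketch, assuming C++17 (for lambdas in constexpr
functions).  The function names are hypothetical and not part of the patch.

```cpp
// Reports whether converting a captureless lambda to a function pointer is
// a non-throwing expression.  On a compiler that implements DR 1722 this is
// expected to return true; older compilers may return false.
constexpr bool lambda_conv_is_noexcept() {
  auto l = [](int) { return 42; };
  return noexcept(static_cast<int (*)(int)>(l));
}

// Shows the conversion function in ordinary use: the lambda is converted to
// a plain function pointer and then called through it.
constexpr int call_through_pointer() {
  auto l = [](int x) { return x + 1; };
  int (*fp)(int) = l;  // implicit lambda-to-function-pointer conversion
  return fp(41);
}
```

The new test in the patch makes the same noexcept query with a C-style cast
inside a static_assert.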

Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?

gcc/cp/ChangeLog:

PR c++/90583
DR 1722
* lambda.c (maybe_add_lambda_conv_op): Mark the conversion function
as noexcept.

gcc/testsuite/ChangeLog:

PR c++/90583
DR 1722
* g++.dg/cpp0x/lambda/lambda-conv14.C: New test.
---
 gcc/cp/lambda.c   |  2 ++
 gcc/testsuite/g++.dg/cpp0x/lambda/lambda-conv14.C | 10 ++
 2 files changed, 12 insertions(+)
 create mode 100644 gcc/testsuite/g++.dg/cpp0x/lambda/lambda-conv14.C

diff --git a/gcc/cp/lambda.c b/gcc/cp/lambda.c
index c94fe8edb8e..c34d68d3da3 100644
--- a/gcc/cp/lambda.c
+++ b/gcc/cp/lambda.c
@@ -1189,6 +1189,8 @@ maybe_add_lambda_conv_op (tree type)
   tree name = make_conv_op_name (rettype);
   tree thistype = cp_build_qualified_type (type, TYPE_QUAL_CONST);
   tree fntype = build_method_type_directly (thistype, rettype, void_list_node);
+  /* DR 1722: The conversion function should be noexcept.  */
+  fntype = build_exception_variant (fntype, noexcept_true_spec);
   tree convfn = build_lang_decl (FUNCTION_DECL, name, fntype);
   SET_DECL_LANGUAGE (convfn, lang_cplusplus);
   tree fn = convfn;
diff --git a/gcc/testsuite/g++.dg/cpp0x/lambda/lambda-conv14.C b/gcc/testsuite/g++.dg/cpp0x/lambda/lambda-conv14.C
new file mode 100644
index 000..869e0d51d2b
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp0x/lambda/lambda-conv14.C
@@ -0,0 +1,10 @@
+// PR c++/90583
+// DR 1722: Lambda to function pointer conversion should be noexcept.
+// { dg-do compile { target c++11 } }
+
+void
+foo ()
+{
+  auto l = [](int){ return 42; };
+  static_assert(noexcept((int (*)(int))(l)), "");
+}

base-commit: b0c990f2661a2979c68c840781847efe27a0779b
-- 
2.26.2



Re: PING [Patch][Middle-end]Add -fzero-call-used-regs=[skip|used-gpr|all-gpr|used|all]

2020-09-21 Thread Qing Zhao via Gcc-patches



> On Sep 21, 2020, at 2:22 PM, Qing Zhao via Gcc-patches 
>  wrote:
> 
> 
> 
>> On Sep 21, 2020, at 2:11 PM, Richard Sandiford  
>> wrote:
>> 
>> Qing Zhao  writes:
>>>> But in cases where there is no underlying concept that can sensibly
>>>> be extracted out, it's OK if targets need to override the default
>>>> to get correct behaviour.
>>> 
>>> Then, on a target where the default code is not right and we haven’t 
>>> provided an overriding implementation, what should we tell the end user?
>>> The user might see the documentation for -fzero-call-used-regs in the GCC 
>>> manual and try it on that specific target, where the default 
>>> implementation is not correct.  How should we deal with this?
>> 
>> The point is that we're trying to implement this in a target-independent
>> way, like for most compiler features.  If the option doesn't work for a
>> particular target, then that's a bug like any other.  The most we can
>> reasonably do is:
>> 
>> (a) try to implement the feature in a way that uses all the appropriate
>>   pieces of compiler infrastructure (what we've been discussing)
>> 
>> (b) add tests for the feature that run on all targets
>> 
>> It's possible that bugs could slip through even then.  But that's true
>> of anything.
>> 
>> Targets like x86 support many subtargets, many different compilation
>> modes, and many different compiler features (register asms, various
>> fancy function attributes, etc.).  So even after the option is
>> committed and is supposedly supported on x86, it's possible that
>> we'll find a bug in the feature on x86 itself.
>> 
>> I don't think anyone would suggest that we should warn the user that the
>> option might be buggy on x86 (its initial target).  But I also don't
>> see any reason for believing that a bug on x86 is less likely than
>> a bug on other targets.
> 
> Okay, then I will add the default implementation as you suggested, and also 
> provide the overridden, optimized implementation on X86. 

For X86, it looks like, in addition to the stack registers (st0 to st7), the mask 
registers (k0 to k7) do not need to be zeroed, and the MMX registers (mm0 to mm7) 
should not be zeroed either.

As far as I can tell, MASK_REG_P and MMX_REG_P are x86-specific macros.  Can I use 
them in the middle end in the same way as “STACK_REG_P”?

Qing
> 
> Let me know if you have further suggestion.
> 
> Qing
>> 
>> Thanks,
>> Richard



Re: PING [Patch][Middle-end]Add -fzero-call-used-regs=[skip|used-gpr|all-gpr|used|all]

2020-09-21 Thread Segher Boessenkool
On Mon, Sep 21, 2020 at 09:13:58AM -0500, Qing Zhao wrote:
> > On Sep 18, 2020, at 5:51 PM, Segher Boessenkool 
> >  wrote:
> >> B.  Will provide a default definition in middle end to generate the 
> >> zeroing insn for selected registers. Then need to add a new target hook 
> >> “ZERO_CALL_USED_REGNO_P(REGNO, GPR_ONLY)”, same as A, X86 implementation 
> >> will be provided in my patch. 
> > 
> > Is this just to make the xor thing work?  i386 has a peephole to
> > transform the mov to a xor for this (and the backend could just handle
> > it in its mov patterns, maybe a peephole was easier for i386, no
> > idea).
> 
> You mean what’s the purpose of the new target hook 
> “ZERO_CALL_USED_REGNO_P(REGNO, GPR_ONLY)”?
> 
> The purpose of this new target hook is for the target to delete some of the 
> call_used registers that should not be zeroed, for example, the stack 
> registers in X86. (St0-st7). 

Oh, I didn't see the _P.  Maybe give it a better name?  Also, a better
interface altogether, a call per hard register is a bit much (and easily
avoidable).

> For other platforms, there might be other call_used registers that should not 
> be zeroed. 

But you cannot *add* anything with this interface, and it cannot return
different results depending on which return insn this is.  It is not a
good abstraction IMO.


Segher


Re: PING [Patch][Middle-end]Add -fzero-call-used-regs=[skip|used-gpr|all-gpr|used|all]

2020-09-21 Thread Qing Zhao via Gcc-patches



> On Sep 21, 2020, at 3:34 PM, Segher Boessenkool  
> wrote:
> 
> On Mon, Sep 21, 2020 at 09:13:58AM -0500, Qing Zhao wrote:
>>> On Sep 18, 2020, at 5:51 PM, Segher Boessenkool 
>>>  wrote:
>>>> B.  Will provide a default definition in middle end to generate the 
>>>> zeroing insn for selected registers. Then need to add a new target hook 
>>>> “ZERO_CALL_USED_REGNO_P(REGNO, GPR_ONLY)”, same as A, X86 implementation 
>>>> will be provided in my patch. 
>>> 
>>> Is this just to make the xor thing work?  i386 has a peephole to
>>> transform the mov to a xor for this (and the backend could just handle
>>> it in its mov patterns, maybe a peephole was easier for i386, no
>>> idea).
>> 
>> You mean what’s the purpose of the new target hook 
>> “ZERO_CALL_USED_REGNO_P(REGNO, GPR_ONLY)”?
>> 
>> The purpose of this new target hook is for the target to delete some of the 
>> call_used registers that should not be zeroed, for example, the stack 
>> registers in X86. (St0-st7). 
> 
> Oh, I didn't see the _P.  Maybe give it a better name?  Also, a better
> interface altogether, a call per hard register is a bit much (and easily
> avoidable).
> 
>> For other platforms, there might be other call_used registers that should 
>> not be zeroed. 
> 
> But you cannot *add* anything with this interface, and it cannot return
> different results depending on which return insn this is.  It is not a
> good abstraction IMO.

This hook will not depend on which return insn this is.  It just checks whether the 
specified register can be zeroed on this target; for example, on X86 it will exclude 
the stack registers (st0 to st7), the MMX registers (mm0 to mm7) and the mask 
registers (k0 to k7) from zeroing. 

The information that depends on the return insn should be reflected in the data flow 
information, which we can easily get from the middle end’s data flow analysis. 
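
To make the two interface shapes in this discussion concrete, here is a toy
C++ sketch.  It is not GCC code: RegKind, may_zero, kToyRegs and
filter_zeroable are hypothetical names, and the six-register file is invented
purely for illustration.  The first function is a per-register predicate in
the spirit of the proposed hook; the second answers the same question for a
whole candidate set in one call.

```cpp
#include <cstdint>

// Hypothetical register classes for a made-up target.
enum class RegKind { Gpr, Sse, X87Stack, Mmx, Mask };

// Per-register predicate: may a register of this kind be zeroed at all?
// Stack, MMX and mask registers are always excluded, mirroring the x86
// exclusions mentioned above.
constexpr bool may_zero(RegKind kind, bool gpr_only) {
  switch (kind) {
    case RegKind::Gpr:
      return true;
    case RegKind::Sse:
      return !gpr_only;
    case RegKind::X87Stack:
    case RegKind::Mmx:
    case RegKind::Mask:
      return false;
  }
  return false;
}

// Toy register file: register number -> kind.
constexpr RegKind kToyRegs[] = {RegKind::Gpr,      RegKind::Gpr,
                                RegKind::Sse,      RegKind::X87Stack,
                                RegKind::Mmx,      RegKind::Mask};
constexpr unsigned kToyRegCount = sizeof(kToyRegs) / sizeof(kToyRegs[0]);

// Set-based variant: takes a bit mask of candidate registers (for example
// the call-used registers found by data flow analysis) and returns the
// subset that may be zeroed.
constexpr std::uint32_t filter_zeroable(std::uint32_t candidates,
                                        bool gpr_only) {
  std::uint32_t result = 0;
  for (unsigned regno = 0; regno < kToyRegCount; ++regno) {
    const bool is_candidate = ((candidates >> regno) & 1u) != 0;
    if (is_candidate && may_zero(kToyRegs[regno], gpr_only))
      result |= (1u << regno);
  }
  return result;
}
```

The set-based form is only one possible reading of the request for fewer
hook calls; the interface that was eventually adopted may differ.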

I have added such target hook in the previous patch as: 

https://gcc.gnu.org/pipermail/gcc-patches/2020-July/550018.html 


However, early in the discussion I got several comments that it exposed too many 
target-specific details unnecessarily. 

Still, if we want to add a default implementation in the middle end as 
Richard suggested, this target hook might be necessary.

Qing

> 
> 
> Segher



[committed] adjust ipa-sra tests to avoid using array parameters

2020-09-21 Thread Martin Sebor via Gcc-patches

The SRA pass relies on the absence of function type attributes
to enable optimization like unused argument elision.  The intent
appears to be to avoid messing with the positions of arguments
that may be relied on by some type attributes.

The recent enhancement to detect out-of-bounds accesses by array
(including VLA) function arguments relies on the C front end
implicitly adding attribute access to the types of functions that
take such arguments.  This in turn interferes with a few SRA tests
that make use of the array notation in the declaration of the argv
array in main().
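
The source-level equivalence that the test change relies on can be checked
directly.  The sketch below uses C++17 only for the static_assert; the same
parameter adjustment applies in C.  The function names are hypothetical, and
the implicit attribute itself is internal to GCC's C front end, so it is not
visible here.

```cpp
#include <type_traits>

// Array notation in a parameter list is adjusted to a pointer, so these two
// declarations have the same function type as far as the language is
// concerned.
int takes_array(int argc, char *argv[]);
int takes_pointer(int argc, char **argv);

static_assert(
    std::is_same_v<decltype(takes_array), decltype(takes_pointer)>);
```

Since the types are identical, switching the tests to the pointer spelling
changes nothing about what they exercise apart from the implicit attribute.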

Since the use of the array notation is incidental to the purpose
of the SRA tests, to clean up the failures in r11- I have
committed the attached patch to avoid incidental failures due
to implicit attribute access.

In case it's important to preserve the SRA optimization in the case
of array arguments that don't depend on their positions in the argument
list (only a subset of VLAs do so it seems like it would be nice to
keep it for the rest) I'm testing a followup enhancement to let SRA
recognize the new access attribute and let the optimization take
effect even in its presence.  I'll post this patch once I'm done
testing it.

Martin
commit 05193687dde2e5a6337164182a1946b584acfada
Author: Martin Sebor 
Date:   Mon Sep 21 14:33:29 2020 -0600

Avoid incidental failures due to implicit attribute access.

gcc/testsuite/ChangeLog:

PR c/50584
* gcc.dg/ipa/ipa-sra-1.c: Use a plain pointer for argv instead of array.
* gcc.dg/ipa/ipa-sra-12.c: Same.
* gcc.dg/ipa/ipa-sra-13.c: Same.
* gcc.dg/ipa/ipa-sra-14.c: Same.
* gcc.dg/ipa/ipa-sra-15.c: Same.

diff --git a/gcc/testsuite/gcc.dg/ipa/ipa-sra-1.c b/gcc/testsuite/gcc.dg/ipa/ipa-sra-1.c
index 4a22e3978f9..df7e356daf3 100644
--- a/gcc/testsuite/gcc.dg/ipa/ipa-sra-1.c
+++ b/gcc/testsuite/gcc.dg/ipa/ipa-sra-1.c
@@ -24,7 +24,7 @@ ox (struct bovid cow)
 }
 
 int
-main (int argc, char *argv[])
+main (int argc, char **argv)
 {
   struct bovid cow;
 
diff --git a/gcc/testsuite/gcc.dg/ipa/ipa-sra-12.c b/gcc/testsuite/gcc.dg/ipa/ipa-sra-12.c
index 4d9057e6353..0cc76bde319 100644
--- a/gcc/testsuite/gcc.dg/ipa/ipa-sra-12.c
+++ b/gcc/testsuite/gcc.dg/ipa/ipa-sra-12.c
@@ -34,7 +34,7 @@ bar (struct S s)
 }
 
 int
-main (int argc, char *argv[])
+main (int argc, char **argv)
 {
   struct S s;
 
diff --git a/gcc/testsuite/gcc.dg/ipa/ipa-sra-13.c b/gcc/testsuite/gcc.dg/ipa/ipa-sra-13.c
index 4d4ed74cfd6..e8751dad67f 100644
--- a/gcc/testsuite/gcc.dg/ipa/ipa-sra-13.c
+++ b/gcc/testsuite/gcc.dg/ipa/ipa-sra-13.c
@@ -33,7 +33,7 @@ bar (struct S *s)
 }
 
 int
-main (int argc, char *argv[])
+main (int argc, char **argv)
 {
   struct S s;
 
diff --git a/gcc/testsuite/gcc.dg/ipa/ipa-sra-14.c b/gcc/testsuite/gcc.dg/ipa/ipa-sra-14.c
index 3ca302c77e2..75619c67b09 100644
--- a/gcc/testsuite/gcc.dg/ipa/ipa-sra-14.c
+++ b/gcc/testsuite/gcc.dg/ipa/ipa-sra-14.c
@@ -43,7 +43,7 @@ bar (struct S s)
 }
 
 int
-main (int argc, char *argv[])
+main (int argc, char **argv)
 {
   struct S s;
 
diff --git a/gcc/testsuite/gcc.dg/ipa/ipa-sra-15.c b/gcc/testsuite/gcc.dg/ipa/ipa-sra-15.c
index 6c57c7bcebc..aa13a94c7c0 100644
--- a/gcc/testsuite/gcc.dg/ipa/ipa-sra-15.c
+++ b/gcc/testsuite/gcc.dg/ipa/ipa-sra-15.c
@@ -45,7 +45,7 @@ bar (struct S *s, int rec)
 volatile int g;
 
 int
-main (int argc, char *argv[])
+main (int argc, char **argv)
 {
   struct S s;
 


[PATCH] config/i386/t-rtems: Change from mtune to march for multilibs

2020-09-21 Thread joel
From: Joel Sherrill 

* config/i386/t-rtems: Change from mtune to march when building
multilibs.  The mtune argument tunes or optimizes for a specific
CPU model but does not ensure the generated code is appropriate
for the CPU model. Prior to this patch, i386 compatible code
was always generated but tuned for later models.

Author: Michael Davidsaver 
---
 gcc/config/i386/t-rtems | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/gcc/config/i386/t-rtems b/gcc/config/i386/t-rtems
index 7626970..5f078c6 100644
--- a/gcc/config/i386/t-rtems
+++ b/gcc/config/i386/t-rtems
@@ -17,10 +17,10 @@
 # .
 #
 
-MULTILIB_OPTIONS = mtune=i486/mtune=pentium/mtune=pentiumpro msoft-float
+MULTILIB_OPTIONS = march=i486/march=pentium/march=pentiumpro msoft-float
 MULTILIB_DIRNAMES= m486 mpentium mpentiumpro soft-float
 MULTILIB_MATCHES = msoft-float=mno-80387
-MULTILIB_MATCHES += mtune?pentium=mtune?k6 mtune?pentiumpro=mtune?athlon
+MULTILIB_MATCHES += march?pentium=march?k6 march?pentiumpro=march?athlon
 MULTILIB_EXCEPTIONS = \
-mtune=pentium/*msoft-float* \
-mtune=pentiumpro/*msoft-float*
+march=pentium/*msoft-float* \
+march=pentiumpro/*msoft-float*
-- 
1.8.3.1



[PING][PATCH] correct handling of indices into arrays with elements larger than 1 (PR c++/96511)

2020-09-21 Thread Martin Sebor via Gcc-patches

Ping: https://gcc.gnu.org/pipermail/gcc-patches/2020-September/553906.html

(I'm working on rebasing the patch on top of the latest trunk, which
has changed some of the same code, but it'd be helpful to get a
go-ahead on the substance of the changes.  I don't expect the rebase to
require any substantive modifications.)

Martin

On 9/14/20 4:01 PM, Martin Sebor wrote:

On 9/4/20 11:14 AM, Jason Merrill wrote:

On 9/3/20 2:44 PM, Martin Sebor wrote:

On 9/1/20 1:22 PM, Jason Merrill wrote:

On 8/11/20 12:19 PM, Martin Sebor via Gcc-patches wrote:

-Wplacement-new handles array indices and pointer offsets the same:
by adjusting them by the size of the element.  That's correct for
the latter but wrong for the former, causing false positives when
the element size is greater than one.
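
As a reminder of the arithmetic the warning has to get right, here is a toy
C++ model.  It is not GCC's compute_objsize; the function names are
hypothetical and all sizes are passed in explicitly so the example does not
depend on the platform.  For an array a, both &a[i] and a + i sit i elements
(i times the element size in bytes) past the start.

```cpp
#include <cstddef>

// Bytes remaining in the array from element INDEX to the end.
constexpr std::ptrdiff_t bytes_available(std::ptrdiff_t array_bytes,
                                         std::ptrdiff_t elem_bytes,
                                         std::ptrdiff_t index) {
  return array_bytes - index * elem_bytes;
}

// Would an object of OBJECT_BYTES fit if constructed at element INDEX?
// A negative index points before the array and never fits.
constexpr bool placement_fits(std::ptrdiff_t array_bytes,
                              std::ptrdiff_t elem_bytes,
                              std::ptrdiff_t index,
                              std::ptrdiff_t object_bytes) {
  return index >= 0 &&
         object_bytes <= bytes_available(array_bytes, elem_bytes, index);
}

// A 16-byte array of 4-byte elements: a 4-byte object fits at index 3,
// an 8-byte object does not.
static_assert(placement_fits(16, 4, 3, 4));
static_assert(!placement_fits(16, 4, 3, 8));
```

Scaling the index by the element size more or less than once gives the wrong
remaining size, which is how mistakes in this area turn into false positives.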

In addition, the warning doesn't even attempt to handle arrays of
arrays.  I'm not sure if I forgot or if I simply didn't think of
it.

The attached patch corrects these oversights by replacing most
of the -Wplacement-new code with a call to compute_objsize which
handles all this correctly (plus more), and is also better tested.
But even compute_objsize has bugs: it trips up while converting
wide_int to offset_int for some pointer offset ranges.  Since
handling the C++ IL required changes in this area the patch also
fixes that.

For review purposes, the patch affects just the middle end.
The C++ diff pretty much just removes code from the front end.


The C++ changes are OK.


Thank you for looking at the rest as well.




-compute_objsize (tree ptr, int ostype, access_ref *pref,
-    bitmap *visited, const vr_values *rvals /* = NULL */)
+compute_objsize (tree ptr, int ostype, access_ref *pref, bitmap *visited,
+    const vr_values *rvals)


This reformatting seems unnecessary, and I prefer to keep the 
comment about the default argument.


This overload doesn't take a default argument.  (There was a stray
declaration of a similar function at the top of the file that had
one.  I've removed it.)


Ah, true.


-  if (!size || TREE_CODE (size) != INTEGER_CST)
-   return false;

 >...

You change some failure cases in compute_objsize to return success 
with a maximum range, while others continue to return failure.  This 
needs commentary about the design rationale.


This is too much for a comment in the code but the background is
this: compute_objsize initially returned the object size as a constant.
Recently, I have enhanced it to return a range to improve warnings for
allocated objects.  With that, a failure can be turned into success by
having the function set the range to that of the largest object.  That
should simplify the function's callers and could even improve
the detection of some invalid accesses.  Once this change is made
it might even be possible to change its return type to void.

The change that caught your eye is necessary to make the function
a drop-in replacement for the C++ front end code which makes this
same assumption.  Without it, a number of test cases that exercise
VLAs fail in g++.dg/warn/Wplacement-new-size-5.C.  For example:

   void f (int n)
   {
 char a[n];
 new (a - 1) int ();
   }

Changing any of the other places isn't necessary for existing tests
to pass (and I didn't want to introduce too much churn).  But I do
want to change the rest of the function along the same lines at some
point.


Please do change the other places to be consistent; better to have 
more churn than to leave the function half-updated.  That can be a 
separate patch if you prefer, but let's do it now rather than later.


I've made most of these changes in the other patch (also attached).
I'm quite happy with the result but it turned out to be a lot more
work than either of us expected, mostly due to the amount of testing.

I've left a couple of failing cases in place mainly as reminders
to handle them better (which means I also didn't change the caller
to avoid testing for failures).  I've also added TODO notes with
reminders to handle some of the new codes more completely.




+  special_array_member sam{ };


sam is always set by component_ref_size, so I don't think it's 
necessary to initialize it at the declaration.


I find initializing pass-by-pointer local variables helpful but
I don't insist on it.




@@ -187,7 +187,7 @@ decl_init_size (tree decl, bool min)
   tree last_type = TREE_TYPE (last);
   if (TREE_CODE (last_type) != ARRAY_TYPE
   || TYPE_SIZE (last_type))
-    return size;
+    return size ? size : TYPE_SIZE_UNIT (type);


This change seems to violate the comment for the function.


By my reading (and writing) the change is covered by the first
sentence:

    Returns the size of the object designated by DECL considering
    its initializer if it either has one or if it would not affect
    its size, ...


OK, I see it now.


It handles a number of cases in Wplacement-new-size.C that would otherwise
fail, which construct a larger object in an extern declaration of a template,
like this:

   tem

Re: [PATCH] libstdc++: Remove overzealous static_asserts from std::span

2020-09-21 Thread Jonathan Wakely via Gcc-patches

On 21/09/20 15:07 -0400, Patrick Palka via Libstdc++ wrote:

On Mon, 21 Sep 2020, Patrick Palka wrote:


For a span with empty static extent, we currently model the
preconditions of front(), back(), and operator[] as if they were
mandates, by using a static_assert to verify that extent != 0.  This
causes us to incorrectly reject valid programs that instantiate these
member functions but never call them.
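
The difference between a mandate and a precondition can be seen with a
stripped-down stand-in for std::span.  This is only a sketch: tiny_span and
first_of_three are hypothetical names, and the real libstdc++ code is more
involved.  An explicit instantiation instantiates every member function, so
a static_assert on the extent inside front() would reject it even though
front() is never called.

```cpp
#include <cstddef>

template <typename T, std::size_t Extent>
struct tiny_span {
  T* ptr;

  constexpr T& front() const {
    // Treating the requirement as a mandate would look like this and would
    // make the explicit instantiation below ill-formed:
    //   static_assert(Extent != 0);
    // As a precondition, calling front() on an empty span is simply
    // undefined behaviour, and nothing is diagnosed unless it is called.
    return *ptr;
  }

  constexpr std::size_t size() const { return Extent; }
};

// Valid: instantiates front() for an empty extent but never calls it.
template struct tiny_span<int, 0>;

// Ordinary use with a non-empty extent.
constexpr int first_of_three() {
  int data[3] = {7, 8, 9};
  tiny_span<int, 3> s{data};
  return s.front();
}
```

Uncommenting the static_assert shows the kind of rejection the patch removes.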

libstdc++-v3/ChangeLog:

* include/std/span (span::front): Remove static_assert.
(span::back): Likewise.
(span::operator[]): Likewise.
* testsuite/23_containers/span/back_neg.cc: Remove.
* testsuite/23_containers/span/front_neg.cc: Remove.
* testsuite/23_containers/span/index_op_neg.cc: Remove.


Here's a version that rewrites rather than removes the testcases:


OK for trunk, thanks.

We might want to backport this too.




[Patch 0/5] rs6000, 128-bit Binary Integer Operations

2020-09-21 Thread Carl Love via Gcc-patches
Will, Segher:

The following is the updated patch set for the 128-bit Binary Integer
Operation.  I am reposting the entire set for completeness.  I have
noted in each patch the changes made since the previous version.  

The patches have been tested on Power 8 and Power 9 to ensure there are
no regression errors.  The new tests have been manually compiled and
run on mambo to ensure they work correctly.

Please review the patches and let me know if they are acceptable for
mainline.  Thanks.

   Carl Love



Re: [PATCH] c++: Return only in-scope tparms in keep_template_parm [PR95310]

2020-09-21 Thread Jason Merrill via Gcc-patches

On 9/19/20 3:49 PM, Patrick Palka wrote:

In the testcase below, the dependent specializations iter_reference_t<F>
and iter_reference_t<Out> share the same tree due to specialization
caching.  So when find_template_parameters walks through the
requires-expression (as part of normalization), it sees and includes the
out-of-scope template parameter F in the list of template parameters
it found within the requires-expression (along with Out and N).

 From a correctness perspective this is harmless since the parameter mapping
routines only care about the level and index of each parameter, so F is
no different from Out in this sense.  (And it's also harmless that two
parameters in the parameter mapping have the same level and index.)

But having both Out and F in the parameter mapping is extra work for
hash_atomic_constraint, tsubst_parameter_mapping and get_mapped_args; and
it also means we print this irrelevant template parameter in the
testcase's diagnostics (via pp_cxx_parameter_mapping):

   in requirements with ‘Out o’ [with N = (const int&)&a; F = const int*; Out = 
const int*]

This patch makes keep_template_parm return only in-scope template
parameters by looking into ctx_parms for the corresponding in-scope one.

(That we sometimes print irrelevant template parameters in diagnostics is
also the subject of PR99 and PR66968, so the above diagnostic issue
could likely be fixed in a more general way, but this targeted fix to
keep_template_parm is perhaps worthwhile on its own.)

Bootstrapped and regtested on x86_64-pc-linux-gnu, and also tested on
cmcstl2 and range-v3.  Does this look OK for trunk?

gcc/cp/ChangeLog:

PR c++/95310
* pt.c (keep_template_parm): Adjust the given template parameter
to the corresponding in-scope one from ctx_parms.

gcc/testsuite/ChangeLog:

PR c++/95310
* g++.dg/concepts/diagnostic15.C: New test.
* g++.dg/cpp2a/concepts-ttp2.C: New test.
---
  gcc/cp/pt.c  | 19 +++
  gcc/testsuite/g++.dg/concepts/diagnostic15.C | 16 
  2 files changed, 35 insertions(+)
  create mode 100644 gcc/testsuite/g++.dg/concepts/diagnostic15.C

diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c
index fe45de8d796..c2c70ff02b9 100644
--- a/gcc/cp/pt.c
+++ b/gcc/cp/pt.c
@@ -10550,6 +10550,25 @@ keep_template_parm (tree t, void* data)
 BOUND_TEMPLATE_TEMPLATE_PARM itself.  */
  t = TREE_TYPE (TEMPLATE_TEMPLATE_PARM_TEMPLATE_DECL (t));
  
+  /* This template parameter might be an argument to a cached dependent
+ specialization that was formed earlier inside some other template, in which
+ case the parameter is not among the ones that are in-scope.  Look in
+ CTX_PARMS to find the corresponding in-scope template parameter and
+ always return that instead.  */
+  tree cparms = ftpi->ctx_parms;
+  while (TMPL_PARMS_DEPTH (cparms) > level)
+cparms = TREE_CHAIN (cparms);
+  gcc_assert (TMPL_PARMS_DEPTH (cparms) == level);
+  if (TREE_VEC_LENGTH (TREE_VALUE (cparms)))
+{
+  t = TREE_VALUE (TREE_VEC_ELT (TREE_VALUE (cparms), index));
+  /* As in template_parm_to_arg.  */
+  if (TREE_CODE (t) == TYPE_DECL || TREE_CODE (t) == TEMPLATE_DECL)
+   t = TREE_TYPE (t);
+  else
+   t = DECL_INITIAL (t);
+}


This seems like a useful separate function: given a parmlist and a 
single template parm (or index+level), return the corresponding parm 
from the parmlist.  Basically the reverse of canonical_type_parameter.


Jason


/* Arguments like const T yield parameters like const T. This means that
   a template-id like X would yield two distinct parameters:
   T and const T. Adjust types to their unqualified versions.  */
diff --git a/gcc/testsuite/g++.dg/concepts/diagnostic15.C b/gcc/testsuite/g++.dg/concepts/diagnostic15.C
new file mode 100644
index 000..3acd9f67968
--- /dev/null
+++ b/gcc/testsuite/g++.dg/concepts/diagnostic15.C
@@ -0,0 +1,16 @@
+// PR c++/95310
+// { dg-do compile { target concepts } }
+
+template 
+using iter_reference_t = decltype(*T{});
+
+template 
+struct result { using type = iter_reference_t; };
+
+template 
+concept indirectly_writable = requires(Out o) { // { dg-bogus "F =" }
+  iter_reference_t(*o) = N;
+};
+
+const int a = 0;
+static_assert(indirectly_writable); // { dg-error "assert" }





Re: [PATCH v2] c++: Implement -Wctad-maybe-unsupported.

2020-09-21 Thread Jason Merrill via Gcc-patches

On 9/21/20 3:57 PM, Marek Polacek wrote:

On Mon, Sep 21, 2020 at 01:04:27AM -0400, Jason Merrill via Gcc-patches wrote:

On 9/19/20 5:34 PM, Marek Polacek wrote:

I noticed that clang++ has this CTAD warning and thought that it might
be useful to have it.  From clang++: "Some style guides want to allow
using CTAD only on types that "opt-in"; i.e. on types that are designed
to support it and not just types that *happen* to work with it."


That's a weird name for the warning, but I guess if that's what clang calls
it then we shouldn't change it.


Yes.  Naming is hard, but this seems like a particularly bad name.  I think
-Wctad-maybe-unintended would have been better.  But diverging would be worse
for users.


+  /* If CTAD succeeded but the type doesn't have any explicit deduction
+ guides, this deduction might not be what the user intended.  */
+  if (call != error_mark_node
+  && !any_dguides_p
+  && warning (OPT_Wctad_maybe_unsupported,
+ "%qT may not intend to support class template argument "
+ "deduction", type))
+inform (input_location, "add a deduction guide to suppress this warning");


I think you want to avoid warning for types defined in a system header
without -Wsystem-headers.


Ack, fixed.


+@item -Wctad-maybe-unsupported @r{(C++ and Objective-C++ only)}
+@opindex Wctad-maybe-unsupported
+@opindex Wno-ctad-maybe-unsupported
+Warn when performing class template argument deduction on a type with no
+deduction guides.  This warning will point out cases where CTAD succeeded

^explicitly written

+only because the compiler synthesized the implicit deduction guides, which
+might not be what the programmer intended.  This warning can be suppressed
+with the following pattern:
+
+@smallexample
+struct allow_ctad_t;
+template <typename T> struct S @{
+  S(T) @{ @}
+@};
+S(allow_ctad_t) -> S<void>; // will never be considered
+@end smallexample


This should mention the style guide motivation, and clarify that the
suppression doesn't require the "allow_ctad_t" name.


Fixed both things.

Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?


OK.


-- >8 --
I noticed that clang++ has this CTAD warning and thought that it might
be useful to have it.  From clang++: "Some style guides want to allow
using CTAD only on types that "opt-in"; i.e. on types that are designed
to support it and not just types that *happen* to work with it."

So this warning warns when CTAD deduced a type, but the type does not
define any deduction guides.  In that case CTAD worked only because the
compiler synthesized the implicit deduction guides.  That might not be
intended.

It can be suppressed by adding a deduction guide that will never be
considered:

   struct allow_ctad_t;
   template <typename T> struct S { S(T) {} };
   S(allow_ctad_t) -> S<void>;

This warning is off by default.  It doesn't warn when the type comes
from a system header unless -Wsystem-headers.

gcc/c-family/ChangeLog:

* c.opt (Wctad-maybe-unsupported): New option.

gcc/cp/ChangeLog:

* pt.c (deduction_guides_for): Add a bool parameter.  Set it.
(do_class_deduction): Warn when CTAD succeeds but the type doesn't
have any explicit deduction guides.

gcc/ChangeLog:

* doc/invoke.texi: Document -Wctad-maybe-unsupported.

gcc/testsuite/ChangeLog:

* g++.dg/warn/Wctad-maybe-unsupported.C: New test.
* g++.dg/warn/Wctad-maybe-unsupported2.C: New test.
* g++.dg/warn/Wctad-maybe-unsupported3.C: New test.
* g++.dg/warn/Wctad-maybe-unsupported.h: New file.
---
  gcc/c-family/c.opt|  5 ++
  gcc/cp/pt.c   | 28 +-
  gcc/doc/invoke.texi   | 22 -
  .../g++.dg/warn/Wctad-maybe-unsupported.C | 88 +++
  .../g++.dg/warn/Wctad-maybe-unsupported.h |  4 +
  .../g++.dg/warn/Wctad-maybe-unsupported2.C|  6 ++
  .../g++.dg/warn/Wctad-maybe-unsupported3.C|  6 ++
  7 files changed, 154 insertions(+), 5 deletions(-)
  create mode 100644 gcc/testsuite/g++.dg/warn/Wctad-maybe-unsupported.C
  create mode 100644 gcc/testsuite/g++.dg/warn/Wctad-maybe-unsupported.h
  create mode 100644 gcc/testsuite/g++.dg/warn/Wctad-maybe-unsupported2.C
  create mode 100644 gcc/testsuite/g++.dg/warn/Wctad-maybe-unsupported3.C

diff --git a/gcc/c-family/c.opt b/gcc/c-family/c.opt
index 7a61351bf84..da6c3e1a224 100644
--- a/gcc/c-family/c.opt
+++ b/gcc/c-family/c.opt
@@ -475,6 +475,11 @@ Wcpp
  C ObjC C++ ObjC++ CppReason(CPP_W_WARNING_DIRECTIVE)
  ; Documented in common.opt
  
+Wctad-maybe-unsupported

+C++ ObjC++ Var(warn_ctad_maybe_unsupported) Warning
+Warn when performing class template argument deduction on a type with no
+deduction guides.
+
  Wctor-dtor-privacy
  C++ ObjC++ Var(warn_ctor_dtor_privacy) Warning
  Warn when all constructors and destructors are private.
diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c
index fe45de8d796..97d0c245f7e 100644
--- a/gcc/cp/pt.c
+++ b/gcc/cp/pt.c
@@

Re: [PATCH] c++: DR 1722: Make lambda to function pointer conv noexcept [PR90583]

2020-09-21 Thread Jason Merrill via Gcc-patches

On 9/21/20 3:57 PM, Marek Polacek wrote:

DR 1722 clarifies that the conversion function from lambda to pointer to
function should be noexcept(true).

Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?


OK.


gcc/cp/ChangeLog:

PR c++/90583
DR 1722
* lambda.c (maybe_add_lambda_conv_op): Mark the conversion function
as noexcept.

gcc/testsuite/ChangeLog:

PR c++/90583
DR 1722
* g++.dg/cpp0x/lambda/lambda-conv14.C: New test.
---
  gcc/cp/lambda.c   |  2 ++
  gcc/testsuite/g++.dg/cpp0x/lambda/lambda-conv14.C | 10 ++
  2 files changed, 12 insertions(+)
  create mode 100644 gcc/testsuite/g++.dg/cpp0x/lambda/lambda-conv14.C

diff --git a/gcc/cp/lambda.c b/gcc/cp/lambda.c
index c94fe8edb8e..c34d68d3da3 100644
--- a/gcc/cp/lambda.c
+++ b/gcc/cp/lambda.c
@@ -1189,6 +1189,8 @@ maybe_add_lambda_conv_op (tree type)
tree name = make_conv_op_name (rettype);
tree thistype = cp_build_qualified_type (type, TYPE_QUAL_CONST);
tree fntype = build_method_type_directly (thistype, rettype, 
void_list_node);
+  /* DR 1722: The conversion function should be noexcept.  */
+  fntype = build_exception_variant (fntype, noexcept_true_spec);
tree convfn = build_lang_decl (FUNCTION_DECL, name, fntype);
SET_DECL_LANGUAGE (convfn, lang_cplusplus);
tree fn = convfn;
diff --git a/gcc/testsuite/g++.dg/cpp0x/lambda/lambda-conv14.C 
b/gcc/testsuite/g++.dg/cpp0x/lambda/lambda-conv14.C
new file mode 100644
index 00000000000..869e0d51d2b
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp0x/lambda/lambda-conv14.C
@@ -0,0 +1,10 @@
+// PR c++/90583
+// DR 1722: Lambda to function pointer conversion should be noexcept.
+// { dg-do compile { target c++11 } }
+
+void
+foo ()
+{
+  auto l = [](int){ return 42; };
+  static_assert(noexcept((int (*)(int))(l)), "");
+}

base-commit: b0c990f2661a2979c68c840781847efe27a0779b





libgo patch committed: Don't put golang.org packages in zstdpkglist.go

2020-09-21 Thread Ian Lance Taylor via Gcc-patches
This patch to libgo avoids putting golang.org packages in
zstdpkglist.go.  This ensures that internal/goroot.IsStandardPackage
does not treat golang.org packages as being in the standard library.
This fixes https://golang.org/issue/41499.  Committed to mainline and
GCC 10 branch.

Ian
507f392ade582742d2895cd8aea070f1a5b796a3
diff --git a/gcc/go/gofrontend/MERGE b/gcc/go/gofrontend/MERGE
index 6b590f6fd94..f79a1f04201 100644
--- a/gcc/go/gofrontend/MERGE
+++ b/gcc/go/gofrontend/MERGE
@@ -1,4 +1,4 @@
-6fd6418efb983827717f648a11bb5ca6fe93af30
+f2706d92d9560657333682a3de548f1f98e9f9b0
 
 The first line of this file holds the git revision number of the last
 merge done from the gofrontend repository.
diff --git a/libgo/Makefile.am b/libgo/Makefile.am
index 9ce0cab9d6a..e3f08a2df39 100644
--- a/libgo/Makefile.am
+++ b/libgo/Makefile.am
@@ -620,7 +620,7 @@ s-zstdpkglist: Makefile
echo 'package goroot' > zstdpkglist.go.tmp
echo "" >> zstdpkglist.go.tmp
echo 'var stdpkg = map[string]bool{' >> zstdpkglist.go.tmp
-   echo $(libgo_go_objs) 'unsafe.lo' 'runtime/cgo.lo' | sed 
's|[a-z0-9_./]*_c\.lo||g' | sed 's|\([a-z0-9_./]*\)\.lo|"\1": true,|g' >> 
zstdpkglist.go.tmp
+   echo $(libgo_go_objs) 'unsafe.lo' 'runtime/cgo.lo' | sed 
's|[a-z0-9_./]*_c\.lo||g' | sed 's|golang\.org/[a-z0-9_./]*\.lo||g' | sed 
's|\([a-z0-9_./]*\)\.lo|"\1": true,|g' >> zstdpkglist.go.tmp
echo '}' >> zstdpkglist.go.tmp
$(SHELL) $(srcdir)/mvifdiff.sh zstdpkglist.go.tmp zstdpkglist.go
$(STAMP) $@
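
For readers who do not read sed fluently, the three-stage filter in the rule above can be sketched in C++ roughly as follows.  This is only an illustration of what the pipeline does to the object list; the function name std_package_entries is made up here and is not part of libgo.

```cpp
#include <cassert>   // only for the usage checks described after the block
#include <regex>
#include <string>

// Hypothetical helper mirroring the three sed stages of the s-zstdpkglist rule.
std::string std_package_entries(const std::string& objects) {
  // Stage 1: drop the *_c.lo objects.
  static const std::regex c_objects(R"([a-z0-9_./]*_c\.lo)");
  // Stage 2 (added by the patch): drop anything under golang.org/.
  static const std::regex golang_org(R"(golang\.org/[a-z0-9_./]*\.lo)");
  // Stage 3: turn every remaining NAME.lo into a map entry "NAME": true,
  static const std::regex remaining(R"(([a-z0-9_./]*)\.lo)");

  std::string out = std::regex_replace(objects, c_objects, "");
  out = std::regex_replace(out, golang_org, "");
  return std::regex_replace(out, remaining, "\"$1\": true,");
}
```

Feeding it a list such as "fmt.lo golang.org/x/net/dns/dnsmessage.lo os_c.lo unsafe.lo" should leave entries for fmt and unsafe only, which is the effect the extra sed stage is meant to have on zstdpkglist.go.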


libgo patch committed: Recognize aixbigafMagic archives

2020-09-21 Thread Ian Lance Taylor via Gcc-patches
This libgo patch by Clément Chigot recognizes aixbigafMagic archives.
Bootstrapped and ran Go testsuite on x86_64-pc-linux-gnu.  Committed
to mainline.

Ian
1b785ecdc817ee14417beb1fd7389622fd8d035f
diff --git a/gcc/go/gofrontend/MERGE b/gcc/go/gofrontend/MERGE
index f79a1f04201..d8db888e4b6 100644
--- a/gcc/go/gofrontend/MERGE
+++ b/gcc/go/gofrontend/MERGE
@@ -1,4 +1,4 @@
-f2706d92d9560657333682a3de548f1f98e9f9b0
+6f309797e4f7eed635950687e902a294126e6fc6
 
 The first line of this file holds the git revision number of the last
 merge done from the gofrontend repository.
diff --git a/libgo/go/go/internal/gccgoimporter/importer.go 
b/libgo/go/go/internal/gccgoimporter/importer.go
index ff484a72fc9..391477d5a73 100644
--- a/libgo/go/go/internal/gccgoimporter/importer.go
+++ b/libgo/go/go/internal/gccgoimporter/importer.go
@@ -198,7 +198,7 @@ func GetImporter(searchpaths []string, initmap 
map[*types.Package]InitData) Impo
return
}
 
-   if magics == archiveMagic {
+   if magics == archiveMagic || magics == aixbigafMagic {
reader, err = arExportData(reader)
if err != nil {
return


[committed] libstdc++: Fix constraints for drop_view::begin() const [LWG 3482]

2020-09-21 Thread Jonathan Wakely via Gcc-patches
libstdc++-v3/ChangeLog:

* include/std/ranges (drop_view::begin()): Adjust constraints
to match the correct condition for O(1) ranges::next (LWG 3482).
* testsuite/std/ranges/adaptors/drop.cc: Check that iterator is
cached for non-sized_range.

Tested powerpc64le-linux. Committed to trunk.

This should be backported to gcc-10.
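
To make the changed condition concrete, here is a small C++17 sketch of the old and new caching rules.  It does not use the real C++20 concepts: is_random_access_v and has_size are rough stand-ins invented for this note, and c_string_range is a made-up range with random-access iterators but no size().

```cpp
#include <cassert>   // only for the usage checks described after the block
#include <forward_list>
#include <iterator>
#include <type_traits>
#include <utility>
#include <vector>

// Rough C++17 stand-in for ranges::random_access_range (assumption).
template <typename R>
using iterator_category_t = typename std::iterator_traits<
    decltype(std::begin(std::declval<R&>()))>::iterator_category;

template <typename R>
constexpr bool is_random_access_v =
    std::is_base_of_v<std::random_access_iterator_tag, iterator_category_t<R>>;

// Rough stand-in for ranges::sized_range: "has a size() member" (assumption).
template <typename R, typename = void>
struct has_size : std::false_type {};
template <typename R>
struct has_size<R, std::void_t<decltype(std::declval<const R&>().size())>>
    : std::true_type {};

// Old rule: cache begin() unless the range is random access.
template <typename R>
constexpr bool old_rule_needs_cache_v = !is_random_access_v<R>;

// New rule (LWG 3482): next(begin, count, end) is only O(1) when the range
// is both random access and sized, so cache in every other case.
template <typename R>
constexpr bool new_rule_needs_cache_v =
    !(is_random_access_v<R> && has_size<R>::value);

// Hypothetical range: pointer iterators, sentinel end, no size().
struct c_string_range {
  struct sentinel {};
  const char* p;
  const char* begin() const { return p; }
  sentinel end() const { return {}; }
};
```

Under these assumptions std::vector<int> needs no cache under either rule, std::forward_list<int> needs one under both, and c_string_range is the interesting case: the old rule skips the cache even though finding the end costs a linear walk, while the new rule caches.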


commit aecea4158f4e547af349657a3d16cb031a30ec3b
Author: Jonathan Wakely 
Date:   Mon Sep 21 23:43:25 2020

libstdc++: Fix constraints for drop_view::begin() const [LWG 3482]

libstdc++-v3/ChangeLog:

* include/std/ranges (drop_view::begin()): Adjust constraints
to match the correct condition for O(1) ranges::next (LWG 3482).
* testsuite/std/ranges/adaptors/drop.cc: Check that iterator is
cached for non-sized_range.

diff --git a/libstdc++-v3/include/std/ranges b/libstdc++-v3/include/std/ranges
index 005e89f94b2..1bf894dd570 100644
--- a/libstdc++-v3/include/std/ranges
+++ b/libstdc++-v3/include/std/ranges
@@ -2238,7 +2238,10 @@ namespace views
   _Vp _M_base = _Vp();
   range_difference_t<_Vp> _M_count = 0;
 
-  static constexpr bool _S_needs_cached_begin = !random_access_range<_Vp>;
+  // ranges::next(begin(base), count, end(base)) is O(1) if _Vp satisfies
+  // both random_access_range and sized_range. Otherwise, cache its result.
+  static constexpr bool _S_needs_cached_begin
+   = !(random_access_range<const _Vp> && sized_range<const _Vp>);
   [[no_unique_address]]
__detail::__maybe_present_t<_S_needs_cached_begin,
__detail::_CachedPosition<_Vp>>
@@ -2260,9 +2263,12 @@ namespace views
   base() &&
   { return std::move(_M_base); }
 
+  // This overload is disabled for simple views with constant-time begin().
   constexpr auto
-  begin() requires (!(__detail::__simple_view<_Vp>
- && random_access_range<_Vp>))
+  begin()
+   requires (!(__detail::__simple_view<_Vp>
+   && random_access_range<const _Vp>
+   && sized_range<const _Vp>))
   {
if constexpr (_S_needs_cached_begin)
  if (_M_cached_begin._M_has_value())
@@ -2275,8 +2281,11 @@ namespace views
return __it;
   }
 
+  // _GLIBCXX_RESOLVE_LIB_DEFECTS
+  // 3482. drop_view's const begin should additionally require sized_range
   constexpr auto
-  begin() const requires random_access_range<const _Vp>
+  begin() const
+   requires random_access_range<const _Vp> && sized_range<const _Vp>
   {
return ranges::next(ranges::begin(_M_base), _M_count,
ranges::end(_M_base));
diff --git a/libstdc++-v3/testsuite/std/ranges/adaptors/drop.cc 
b/libstdc++-v3/testsuite/std/ranges/adaptors/drop.cc
index 3c82caea772..5fe94b67507 100644
--- a/libstdc++-v3/testsuite/std/ranges/adaptors/drop.cc
+++ b/libstdc++-v3/testsuite/std/ranges/adaptors/drop.cc
@@ -26,6 +26,7 @@
 using __gnu_test::test_range;
 using __gnu_test::forward_iterator_wrapper;
 using __gnu_test::bidirectional_iterator_wrapper;
+using __gnu_test::random_access_iterator_wrapper;
 
 namespace ranges = std::ranges;
 namespace views = ranges::views;
@@ -123,21 +124,6 @@ struct test_wrapper : forward_iterator_wrapper<T>
 forward_iterator_wrapper<T>::operator++();
 return *this;
   }
-
-  test_wrapper
-  operator--(int)
-  {
-auto tmp = *this;
---*this;
-return tmp;
-  }
-
-  test_wrapper&
-  operator--()
-  {
-forward_iterator_wrapper<T>::operator--();
-return *this;
-  }
 };
 
 void
@@ -146,11 +132,132 @@ test07()
   int x[] = {1,2,3,4,5};
   test_range<int, test_wrapper> rx(x);
   auto v = rx | views::drop(3);
+  VERIFY( test_wrapper<int>::increment_count == 0 );
+  (void) v.begin();
+  VERIFY( test_wrapper<int>::increment_count == 3 );
+  (void) v.begin();
+  VERIFY( test_wrapper<int>::increment_count == 3 );
   VERIFY( ranges::equal(v, (int[]){4,5}) );
+  VERIFY( test_wrapper<int>::increment_count == 5 );
   VERIFY( ranges::equal(v, (int[]){4,5}) );
   VERIFY( test_wrapper<int>::increment_count == 7 );
 }
 
+template<typename T>
+struct ra_test_wrapper : random_access_iterator_wrapper<T>
+{
+  static inline int increment_count = 0;
+
+  using random_access_iterator_wrapper<T>::random_access_iterator_wrapper;
+
+  ra_test_wrapper() : random_access_iterator_wrapper<T>()
+  { }
+
+  ra_test_wrapper
+  operator++(int)
+  {
+auto tmp = *this;
+++*this;
+return tmp;
+  }
+
+  ra_test_wrapper&
+  operator++()
+  {
+++increment_count;
+random_access_iterator_wrapper<T>::operator++();
+return *this;
+  }
+
+  ra_test_wrapper
+  operator--(int)
+  {
+auto tmp = *this;
+--*this;
+return tmp;
+  }
+
+  ra_test_wrapper&
+  operator--()
+  {
+random_access_iterator_wrapper<T>::operator--();
+return *this;
+  }
+
+  ra_test_wrapper&
+  operator+=(std::ptrdiff_t n)
+  {
+random_access_iterator_wrapper<T>::operator+=(n);
+return *this;
+  }
+
+  ra_test_wrapper&
+  operator-=(std::ptrdiff_t n)
+  { return *this += -n; }
+
+  ra_test_

[committed] libstdc++: Use __builtin_expect in __glibcxx_assert

2020-09-21 Thread Jonathan Wakely via Gcc-patches
libstdc++-v3/ChangeLog:

* include/bits/c++config (__replacement_assert): Add noreturn
attribute.
(__glibcxx_assert_impl): Use __builtin_expect to hint that the
assertion is expected to pass.

Tested powerpc64le-linux. Committed to trunk.
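
As a rough illustration of the idiom (not the libstdc++ code itself), an assertion macro with the same two ingredients might look like the sketch below.  SKETCH_ASSERT, report_assert_failure and checked_divide are names made up for this note; __builtin_expect is the GCC/Clang builtin.

```cpp
#include <cassert>   // only for the usage check described after the block
#include <cstdio>
#include <cstdlib>

// Marked noreturn so the compiler knows control never continues past a
// failed assertion (hypothetical stand-in for __replacement_assert).
[[noreturn]] inline void report_assert_failure(const char* file, int line,
                                               const char* function,
                                               const char* condition) {
  std::fprintf(stderr, "%s:%d: %s: Assertion '%s' failed.\n",
               file, line, function, condition);
  std::abort();
}

// __builtin_expect tells the optimizer the failure branch is unlikely, so
// the passing path can stay straight-line code.
#define SKETCH_ASSERT(cond)                                            \
  do {                                                                 \
    if (__builtin_expect(!bool(cond), false))                          \
      report_assert_failure(__FILE__, __LINE__, __func__, #cond);      \
  } while (false)

inline int checked_divide(int numerator, int denominator) {
  SKETCH_ASSERT(denominator != 0);
  return numerator / denominator;
}
```

Calling checked_divide (10, 2) takes the expected path and returns 5; a zero denominator would print the message and abort.  Whether the hint changes generated code in any given case depends on the target and optimization level.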

commit 7db5967f1050eb2b45e920b13d495d92ba4f16f4
Author: Jonathan Wakely 
Date:   Mon Sep 21 23:43:25 2020

libstdc++: Use __builtin_expect in __glibcxx_assert

libstdc++-v3/ChangeLog:

* include/bits/c++config (__replacement_assert): Add noreturn
attribute.
(__glibcxx_assert_impl): Use __builtin_expect to hint that the
assertion is expected to pass.

diff --git a/libstdc++-v3/include/bits/c++config 
b/libstdc++-v3/include/bits/c++config
index badf9d01a04..860bf6dbcb3 100644
--- a/libstdc++-v3/include/bits/c++config
+++ b/libstdc++-v3/include/bits/c++config
@@ -468,7 +468,8 @@ namespace std
 {
   // Avoid the use of assert, because we're trying to keep the <cassert>
   // include out of the mix.
-  extern "C++" inline void
+  extern "C++" _GLIBCXX_NORETURN
+  inline void
   __replacement_assert(const char* __file, int __line,
   const char* __function, const char* __condition)
   {
@@ -478,7 +479,7 @@ namespace std
   }
 }
 #define __glibcxx_assert_impl(_Condition) \
-  if (!bool(_Condition))  \
+  if (__builtin_expect(!bool(_Condition), false)) \
 std::__replacement_assert(__FILE__, __LINE__, __PRETTY_FUNCTION__, \
  #_Condition)
 #endif


[committed] analyzer: decls are not on the heap

2020-09-21 Thread David Malcolm via Gcc-patches
Whilst debugging the remaining state explosion in PR analyzer/93355
I noticed that half of the states at an exploding program point had:
  'malloc': {'&buf': 'non-heap'}
whereas the other half didn't, presumably depending on whether the path
to each enode had used this local buffer:
  char buf[400];

This patch tweaks malloc_state_machine::get_default_state to be smarter
about this, so that we can implicitly treat pointers to decls as
non-heap, preventing pointless differences between sm_state_map
instances.  With that, all of the states in question have equal (empty)
malloc sm-state - though the state explosion continues for other reasons.
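
The idea can be sketched on a toy model as follows.  None of these types are the analyzer's real classes: region, region_kind, malloc_state and default_state_for_pointee are simplified stand-ins invented for this note, and get_base_region here simply walks every parent link.

```cpp
#include <cassert>   // only for the usage checks described after the block

// Toy stand-ins for the analyzer's region kinds (assumption).
enum class region_kind { decl, string, heap_allocated, field, element };

struct region {
  region_kind kind;
  const region* parent;  // nullptr for a base region
};

// Walk up to the outermost enclosing region, e.g. from buf[3] to buf.
inline const region* get_base_region(const region* reg) {
  while (reg->parent != nullptr)
    reg = reg->parent;
  return reg;
}

enum class malloc_state { start, non_heap };

// Default state for a pointer to POINTEE when the state map has no entry:
// anything that lives inside a decl or a string literal cannot be heap.
inline malloc_state default_state_for_pointee(const region* pointee) {
  const region* base = get_base_region(pointee);
  if (base->kind == region_kind::decl || base->kind == region_kind::string)
    return malloc_state::non_heap;
  return malloc_state::start;
}
```

With this, a pointer to an element of a local buffer gets non_heap implicitly, so two paths that differ only in whether they touched the buffer no longer produce different state maps.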

Successfully bootstrapped & regtested on x86_64-pc-linux-gnu.
Pushed to master as 15e7b93ba4256884c90198c678ed7eded4e73464.

gcc/analyzer/ChangeLog:
PR analyzer/93355
* sm-malloc.cc (malloc_state_machine::get_default_state): Look at
the base region when considering pointers.  Treat pointers to
decls as being non-heap.
---
 gcc/analyzer/sm-malloc.cc | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/gcc/analyzer/sm-malloc.cc b/gcc/analyzer/sm-malloc.cc
index 90d1da14586..12b2383e4a7 100644
--- a/gcc/analyzer/sm-malloc.cc
+++ b/gcc/analyzer/sm-malloc.cc
@@ -183,7 +183,9 @@ public:
 if (const region_svalue *ptr = sval->dyn_cast_region_svalue ())
   {
const region *reg = ptr->get_pointee ();
-   if (reg->get_kind () == RK_STRING)
+   const region *base_reg = reg->get_base_region ();
+   if (base_reg->get_kind () == RK_DECL
+   || base_reg->get_kind () == RK_STRING)
  return m_non_heap;
   }
 return m_start;
-- 
2.26.2



[committed] analyzer: fix ICE on bogus decl of memset [PR97130]

2020-09-21 Thread David Malcolm via Gcc-patches
Verify that arguments are pointers before calling handling code
that calls deref_rvalue on them.

Successfully bootstrapped & regtested on x86_64-pc-linux-gnu.
Pushed to master as r11-3341-g1e19ecd79b45af6df87a6869d1936b857c9f71fc.

gcc/analyzer/ChangeLog:
PR analyzer/97130
* region-model-impl-calls.cc (call_details::get_arg_type): New.
* region-model.cc (region_model::on_call_pre): Check that the
initial arg is a pointer before calling impl_call_memset and
impl_call_strlen.
* region-model.h (call_details::get_arg_type): New decl.

gcc/testsuite/ChangeLog:
PR analyzer/97130
* gcc.dg/analyzer/pr97130.c: New test.
---
 gcc/analyzer/region-model-impl-calls.cc |  8 
 gcc/analyzer/region-model.cc|  6 --
 gcc/analyzer/region-model.h |  1 +
 gcc/testsuite/gcc.dg/analyzer/pr97130.c | 10 ++
 4 files changed, 23 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/analyzer/pr97130.c

diff --git a/gcc/analyzer/region-model-impl-calls.cc 
b/gcc/analyzer/region-model-impl-calls.cc
index 6582ffb3c95..423f74a4152 100644
--- a/gcc/analyzer/region-model-impl-calls.cc
+++ b/gcc/analyzer/region-model-impl-calls.cc
@@ -103,6 +103,14 @@ call_details::get_arg_tree (unsigned idx) const
   return gimple_call_arg (m_call, idx);
 }
 
+/* Get the type of argument IDX.  */
+
+tree
+call_details::get_arg_type (unsigned idx) const
+{
+  return TREE_TYPE (gimple_call_arg (m_call, idx));
+}
+
 /* Get argument IDX at the callsite as an svalue.  */
 
 const svalue *
diff --git a/gcc/analyzer/region-model.cc b/gcc/analyzer/region-model.cc
index 1312391557d..6f04904a74e 100644
--- a/gcc/analyzer/region-model.cc
+++ b/gcc/analyzer/region-model.cc
@@ -737,12 +737,14 @@ region_model::on_call_pre (const gcall *call, 
region_model_context *ctxt)
  /* No side-effects (tracking stream state is out-of-scope
 for the analyzer).  */
}
-  else if (is_named_call_p (callee_fndecl, "memset", call, 3))
+  else if (is_named_call_p (callee_fndecl, "memset", call, 3)
+  && POINTER_TYPE_P (cd.get_arg_type (0)))
{
  impl_call_memset (cd);
  return false;
}
-  else if (is_named_call_p (callee_fndecl, "strlen", call, 1))
+  else if (is_named_call_p (callee_fndecl, "strlen", call, 1)
+  && POINTER_TYPE_P (cd.get_arg_type (0)))
{
  if (impl_call_strlen (cd))
return false;
diff --git a/gcc/analyzer/region-model.h b/gcc/analyzer/region-model.h
index 1bb9798ae58..4859df369cf 100644
--- a/gcc/analyzer/region-model.h
+++ b/gcc/analyzer/region-model.h
@@ -2482,6 +2482,7 @@ public:
   bool maybe_set_lhs (const svalue *result) const;
 
   tree get_arg_tree (unsigned idx) const;
+  tree get_arg_type (unsigned idx) const;
   const svalue *get_arg_svalue (unsigned idx) const;
 
   void dump_to_pp (pretty_printer *pp, bool simple) const;
diff --git a/gcc/testsuite/gcc.dg/analyzer/pr97130.c 
b/gcc/testsuite/gcc.dg/analyzer/pr97130.c
new file mode 100644
index 00000000000..f437b763c94
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/analyzer/pr97130.c
@@ -0,0 +1,10 @@
+/* { dg-additional-options "-Wno-builtin-declaration-mismatch" } */
+
+void *
+memset (int, int, __SIZE_TYPE__);
+
+void
+mp (int xl)
+{
+  memset (xl, 0, sizeof xl);
+}
-- 
2.26.2



Go patch committed: Finalize methods for type aliases of struct types

2020-09-21 Thread Ian Lance Taylor via Gcc-patches
This patch to the Go frontend finalizes methods for type aliases of
struct types.  Previously we would finalize the methods of the alias
type itself, but since it's a type alias we really need to finalize the
methods of the aliased type.

This patch also handles method expressions of unnamed struct types.

The test case for both is https://golang.org/cl/251168.

This fixes https://golang.org/issue/38125.

Bootstrapped and tested on x86_64-pc-linux-gnu.  Committed to mainline.

Ian
4e4b1f97342d9aa032591fb868318d8bead8dafa
diff --git a/gcc/go/gofrontend/MERGE b/gcc/go/gofrontend/MERGE
index d8db888e4b6..e4f8fac5ab3 100644
--- a/gcc/go/gofrontend/MERGE
+++ b/gcc/go/gofrontend/MERGE
@@ -1,4 +1,4 @@
-6f309797e4f7eed635950687e902a294126e6fc6
+a59167c29d6ad2ddf533b3a12b365f72df0e1476
 
 The first line of this file holds the git revision number of the last
 merge done from the gofrontend repository.
diff --git a/gcc/go/gofrontend/expressions.cc b/gcc/go/gofrontend/expressions.cc
index 8bbc557c65f..0350e51d3a6 100644
--- a/gcc/go/gofrontend/expressions.cc
+++ b/gcc/go/gofrontend/expressions.cc
@@ -14529,21 +14529,19 @@ Selector_expression::lower_method_expression(Gogo* 
gogo)
   is_pointer = true;
   type = type->points_to();
 }
-  Named_type* nt = type->named_type();
-  if (nt == NULL)
-{
-  go_error_at(location,
-  ("method expression requires named type or "
-   "pointer to named type"));
-  return Expression::make_error(location);
-}
 
+  Named_type* nt = type->named_type();
+  Struct_type* st = type->struct_type();
   bool is_ambiguous;
-  Method* method = nt->method_function(name, &is_ambiguous);
+  Method* method = NULL;
+  if (nt != NULL)
+method = nt->method_function(name, &is_ambiguous);
+  else if (st != NULL)
+method = st->method_function(name, &is_ambiguous);
   const Typed_identifier* imethod = NULL;
   if (method == NULL && !is_pointer)
 {
-  Interface_type* it = nt->interface_type();
+  Interface_type* it = type->interface_type();
   if (it != NULL)
imethod = it->find_method(name);
 }
@@ -14551,16 +14549,28 @@ Selector_expression::lower_method_expression(Gogo* 
gogo)
   if ((method == NULL && imethod == NULL)
   || (left_type->named_type() != NULL && left_type->points_to() != NULL))
 {
-  if (!is_ambiguous)
-   go_error_at(location, "type %<%s%s%> has no method %<%s%>",
-is_pointer ? "*" : "",
-nt->message_name().c_str(),
-Gogo::message_name(name).c_str());
+  if (nt != NULL)
+   {
+ if (!is_ambiguous)
+   go_error_at(location, "type %<%s%s%> has no method %<%s%>",
+   is_pointer ? "*" : "",
+   nt->message_name().c_str(),
+   Gogo::message_name(name).c_str());
+ else
+   go_error_at(location, "method %<%s%s%> is ambiguous in type %<%s%>",
+   Gogo::message_name(name).c_str(),
+   is_pointer ? "*" : "",
+   nt->message_name().c_str());
+   }
   else
-   go_error_at(location, "method %<%s%s%> is ambiguous in type %<%s%>",
-Gogo::message_name(name).c_str(),
-is_pointer ? "*" : "",
-nt->message_name().c_str());
+   {
+ if (!is_ambiguous)
+   go_error_at(location, "type has no method %<%s%>",
+   Gogo::message_name(name).c_str());
+ else
+   go_error_at(location, "method %<%s%> is ambiguous",
+   Gogo::message_name(name).c_str());
+   }
   return Expression::make_error(location);
 }
 
@@ -14657,7 +14667,7 @@ Selector_expression::lower_method_expression(Gogo* gogo)
   Expression* ve = Expression::make_var_reference(vno, location);
   Expression* bm;
   if (method != NULL)
-bm = Type::bind_field_or_method(gogo, nt, ve, name, location);
+bm = Type::bind_field_or_method(gogo, type, ve, name, location);
   else
 bm = Expression::make_interface_field_reference(ve, name, location);
 
diff --git a/gcc/go/gofrontend/gogo.cc b/gcc/go/gofrontend/gogo.cc
index 82d4c1fd54d..aef1c47d26e 100644
--- a/gcc/go/gofrontend/gogo.cc
+++ b/gcc/go/gofrontend/gogo.cc
@@ -3508,6 +3508,10 @@ Finalize_methods::type(Type* t)
 case Type::TYPE_NAMED:
   {
Named_type* nt = t->named_type();
+
+   if (nt->is_alias())
+ return TRAVERSE_CONTINUE;
+
Type* rt = nt->real_type();
if (rt->classification() != Type::TYPE_STRUCT)
  {


Re: [PATCH] gcov: fix TOPN streaming from shared libraries

2020-09-21 Thread Sergei Trofimovich via Gcc-patches
On Mon, 21 Sep 2020 20:38:07 +0300 (MSK)
Alexander Monakov  wrote:

> On Mon, 21 Sep 2020, Martin Liška wrote:
> 
> > On 9/6/20 1:24 PM, Sergei Trofimovich wrote:  
> > > From: Sergei Trofimovich 
> > > 
> > > Before the change gcc did not stream correctly TOPN counters
> > > if counters belonged to a non-local shared object.
> > > 
> > > As a result zero-section optimization generated TOPN sections
> > > in a form not recognizable by '__gcov_merge_topn'.
> > > 
> > > The problem happens because in a case of multiple shared objects
> > > '__gcov_merge_topn' function is present in address space multiple
> > > times (once per each object).
> > > 
> > > The fix is to never rely on function address and predicate on TOPN
> > > counter types.  
> > 
> > Hello.
> > 
> > Thank you for the analysis! I think it's the correct fix and it's probably
> > similar to what we used to see for indirect_call_tuple.
> > 
> > @Alexander: Am I right?  
> 
> Yes, analysis presented by Sergei in Bugzilla looks correct. Pedantically I
> wouldn't say the indirect call issue was similar: it's a different gotcha
> arising from mixing static and dynamic linking. There we had some symbols
> preempted by the main executable (but not all symbols), here we have lack
> of preemption/unification as relevant libgcov symbol is hidden.
> 
> I cannot judge if the fix is correct (don't know the code that well) but it
> looks reasonable. If you could come up with a clearer wording for the new
> comment it would be nice, I struggled to understand it.

Yeah, I agree the comment is very misleading. The code is already very clear
about special casing of TOPN counters. How about dropping the comment?

v2:

From 300585164f0a719a3a283c8da3a4061615f6da3a Mon Sep 17 00:00:00 2001
From: Sergei Trofimovich 
Date: Sun, 6 Sep 2020 12:13:54 +0100
Subject: [PATCH v2] gcov: fix TOPN streaming from shared libraries

Before the change gcc did not stream correctly TOPN counters
if counters belonged to a non-local shared object.

As a result zero-section optimization generated TOPN sections
in a form not recognizable by '__gcov_merge_topn'.

The problem happens because in a case of multiple shared objects
'__gcov_merge_topn' function is present in address space multiple
times (once per each object).

The fix is to never rely on function address and predicate on TOPN
counter types.

libgcc/ChangeLog:

PR gcov-profile/96913
* libgcov-driver.c (write_one_data): Avoid function pointer
comparison in TOP streaming decision.
---
 libgcc/libgcov-driver.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/libgcc/libgcov-driver.c b/libgcc/libgcov-driver.c
index 58914268d4e..e53e4dc392a 100644
--- a/libgcc/libgcov-driver.c
+++ b/libgcc/libgcov-driver.c
@@ -424,7 +424,7 @@ write_one_data (const struct gcov_info *gi_ptr,

  n_counts = ci_ptr->num;

- if (gi_ptr->merge[t_ix] == __gcov_merge_topn)
+ if (t_ix == GCOV_COUNTER_V_TOPN || t_ix == GCOV_COUNTER_V_INDIR)
write_top_counters (ci_ptr, t_ix, n_counts);
  else
{
-- 

  Sergei
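
A single-file sketch of why the address comparison is fragile.  The two namespaces below stand in for two shared objects that each carry their own hidden copy of the merge function; all names (lib_a, lib_b, counter_kind and the two predicates) are invented for this note and are not the libgcov identifiers.

```cpp
#include <cassert>   // only for the usage checks described after the block

using merge_fn = void (*)(long* counters, unsigned n_counters);

// Each "library" has its own copy of the same merge routine, hence its own
// address, just like a hidden libgcov symbol linked into several DSOs.
namespace lib_a { inline void merge_topn(long*, unsigned) {} }
namespace lib_b { inline void merge_topn(long*, unsigned) {} }

// Hypothetical counter indices standing in for the GCOV_COUNTER_* values.
enum counter_kind { COUNTER_ARCS, COUNTER_V_TOPN, COUNTER_V_INDIR };

// Old-style test, as evaluated by the writer that lives in lib_a: it only
// recognises lib_a's copy of the function.
inline bool is_top_counter_by_address(merge_fn merge) {
  return merge == lib_a::merge_topn;
}

// New-style test: depends only on which counter is being written.
inline bool is_top_counter_by_kind(unsigned t_ix) {
  return t_ix == COUNTER_V_TOPN || t_ix == COUNTER_V_INDIR;
}
```

Here is_top_counter_by_address (lib_b::merge_topn) is false even though the counters are TOPN counters, which is the mis-streaming case described above; the index-based predicate gives the same answer regardless of which object registered the data.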


Re: New modref/ipa_modref optimization passes

2020-09-21 Thread David Malcolm via Gcc-patches
On Sun, 2020-09-20 at 19:30 +0200, Jan Hubicka wrote:
> > On Sun, 2020-09-20 at 00:32 +0200, Jan Hubicka wrote:
> > > Hi,
> > > this is cleaned up version of the patch.  I removed unfinished
> > > bits,
> > > fixed
> > > propagation, cleaned it up and fixed fallout.
> > 
> > [...]
> > 
> > > While there are several areas for improvements but I think it is
> > > not
> > > in shape
> > > for mainline and rest can be dealt with incrementally.
> > 
> > FWIW I think you typoed:
> >   "not in shape for mainline"
> > when you meant:
> >   "now in shape for mainline"
> > given...
> 
> Yep, sorry for that :)

I've started seeing crashes in the jit testsuite even with trivial
inputs, which are happening at pass_modref::~pass_modref at:

772 ggc_delete (summaries);

on the first in-process iteration of the code, with:

(gdb) p summaries
$3 = (fast_function_summary<modref_summary*, va_gc> *) 0x0

I'm still investigating (but may have to call halt for the night), but
this could be an underlying issue with the new passes; the jit
testsuite runs with the equivalent of:

--param=ggc-min-expand=0 --param=ggc-min-heapsize=0

throughout to shake out GC issues (to do a full collection at each GC
opportunity).

Was this code tested with the jit?  Do you see issues in cc1 if you set
those params?  Anyone else seeing "random" crashes?

Thanks
Dave




Re: [PATCH] aarch64: Add extend-as-extract-with-shift pattern [PR96998]

2020-09-21 Thread Segher Boessenkool
Hi!

So, I tested this patch.  The test builds Linux for all targets, and the
number reported here is just binary size (usually a good indicator for
combine effectiveness).  C0 is the unmodified compiler, C1 is with your
patch.  A size of 0 means it did not build.

                    C0        C1
       alpha   6403469  100.000%
         arc         0         0
         arm  10196358  100.000%
       arm64         0  20228766
       armhf  15042594  100.000%
         c6x   2496218  100.000%
        csky         0         0
       h8300   1217198  100.000%
        i386  11966700  100.000%
        ia64  18814277  100.000%
        m68k   3856350  100.000%
  microblaze   5864258  100.000%
        mips   9142108  100.000%
      mips64   7344744  100.000%
       nds32         0         0
       nios2   3909477  100.000%
    openrisc   4554446  100.000%
      parisc   7721195  100.000%
    parisc64         0         0
     powerpc  10447477  100.000%
   powerpc64  22257111  100.000%
 powerpc64le  19292786  100.000%
     riscv32   1630934  100.000%
     riscv64   7628058  100.000%
        s390  15173928  100.000%
          sh   3410671  100.000%
     shnommu   1685616  100.000%
       sparc   4737096  100.000%
     sparc64   7167122  100.000%
      x86_64  19718928  100.000%
      xtensa   2639363  100.000%

So, there is no difference for most targets (I checked some targets and
there really is no difference).  The only exception is aarch64 (which
the kernel calls "arm64"): the unpatched compiler ICEs!  (At least three
times, even).

during RTL pass: reload
/home/segher/src/kernel/kernel/cgroup/cgroup.c: In function 'rebind_subsystems':
/home/segher/src/kernel/kernel/cgroup/cgroup.c:1777:1: internal compiler error: 
in lra_set_insn_recog_data, at lra.c:1004
 1777 | }
  | ^
0x1096215f lra_set_insn_recog_data(rtx_insn*)
/home/segher/src/gcc/gcc/lra.c:1004
0x109625d7 lra_get_insn_recog_data
/home/segher/src/gcc/gcc/lra-int.h:488
0x109625d7 lra_update_insn_regno_info(rtx_insn*)
/home/segher/src/gcc/gcc/lra.c:1625
0x10962d03 lra_update_insn_regno_info(rtx_insn*)
/home/segher/src/gcc/gcc/lra.c:1623
0x10962d03 lra_push_insn_1
/home/segher/src/gcc/gcc/lra.c:1780
[etc]

This means LRA found an unrecognised insn; and that insn is

(insn 817 804 818 21 (set (reg:DI 324)
(sign_extract:DI (ashift:DI (subreg:DI (reg:SI 232) 0)
(const_int 3 [0x3]))
(const_int 35 [0x23])
(const_int 0 [0]))) 
"/home/segher/src/kernel/kernel/cgroup/cgroup.c":1747:3 -1
 (nil))

LRA created that as a reload for

(insn 347 819 348 21 (parallel [
(set (mem/f:DI (reg:DI 324) [233 *__p_84+0 S8 A64])
(asm_operands/v:DI ("stlr %1, %0") ("=Q") 0 [
(reg:DI 325 [orig:106 prephitmp_18 ] [106])
]
 [
(asm_input:DI ("r") 
/home/segher/src/kernel/kernel/cgroup/cgroup.c:1747)
]
 [] /home/segher/src/kernel/kernel/cgroup/cgroup.c:1747))
(clobber (mem:BLK (scratch) [0  A8]))
]) "/home/segher/src/kernel/kernel/cgroup/cgroup.c":1747:3 -1
 (expr_list:REG_DEAD (reg:SI 232)
(expr_list:REG_DEAD (reg:DI 106 [ prephitmp_18 ])
(nil))))

as

  347: {[r324:DI]=asm_operands;clobber [scratch];}
  REG_DEAD r232:SI
  REG_DEAD r106:DI
Inserting insn reload before:
  817: r324:DI=sign_extract(r232:SI#0<<0x3,0x23,0)
  818: r324:DI=r324:DI+r284:DI
  819: r325:DI=r106:DI

(and then it died).
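
For reference, the value insn 817 is meant to compute can be checked with ordinary integer arithmetic.  The sketch below assumes bit 0 is the least significant bit (as for aarch64) and two's complement conversions; sign_extract, extract_form and extend_form are helper names made up for this note.

```cpp
#include <cassert>   // only for the usage checks described after the block
#include <cstdint>

// Extract WIDTH bits of VALUE starting at bit POS and sign-extend the field.
// Requires 0 < width < 64.
inline std::int64_t sign_extract(std::uint64_t value, unsigned width,
                                 unsigned pos) {
  const std::uint64_t field =
      (value >> pos) & ((std::uint64_t{1} << width) - 1);
  const std::uint64_t sign_bit = std::uint64_t{1} << (width - 1);
  return static_cast<std::int64_t>((field ^ sign_bit) - sign_bit);
}

// (sign_extract:DI (ashift:DI (subreg:DI (reg:SI)) 3) 35 0).  The upper half
// of the paradoxical subreg is unspecified, modelled here as UPPER_GARBAGE.
inline std::int64_t extract_form(std::uint32_t low,
                                 std::uint64_t upper_garbage) {
  const std::uint64_t reg = (upper_garbage << 32) | low;
  return sign_extract(reg << 3, 35, 0);
}

// The equivalent "sign-extend the SImode value, then shift left by 3".
inline std::int64_t extend_form(std::uint32_t low) {
  return static_cast<std::int64_t>(static_cast<std::int32_t>(low)) * 8;
}
```

Only bits 0..31 of the register reach the 35-bit field, so both forms agree whatever the upper half holds.  That is the sense in which the insn is an extend with a shift, and why one would expect a pattern to match it.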


Can you fix this first?  There probably is something target-specific
wrong related to zero_extract.


Segher


Re: [RS6000] rs6000_rtx_costs cost IOR

2020-09-21 Thread Alan Modra via Gcc-patches
On Mon, Sep 21, 2020 at 10:49:17AM -0500, Segher Boessenkool wrote:
> Hi!
> 
> On Thu, Sep 17, 2020 at 01:12:19PM +0930, Alan Modra wrote:
> > On Wed, Sep 16, 2020 at 07:02:06PM -0500, Segher Boessenkool wrote:
> > > > + /* Test both regs even though the one in the mask is
> > > > +constrained to be equal to the output.  Increasing
> > > > +cost may well result in rejecting an invalid insn
> > > > +earlier.  */
> > > 
> > > Is that ever actually useful?
> > 
> > Possibly not in this particular case, but I did see cases where
> > invalid insns were rejected early by costing non-reg sub-expressions.
> 
> But does that ever change generated code?
> 
> This makes the compiler a lot harder to read and understand.  To the
> point that such micro-optimisations makes worthwhile optimisations hard
> or impossible to do.

Fair enough, here's a revised patch.

* config/rs6000/rs6000.c (rotate_insert_cost): New function.
(rs6000_rtx_costs): Cost IOR.  Tidy break/return.  Tidy AND.
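
A note on the first test in rotate_insert_cost: "UINTVAL (a) + UINTVAL (b) + 1 == 0" is an unsigned-wraparound way of asking whether the two AND masks are exact complements.  The sketch below shows that check and the rotate-and-insert operation it guards; the helper names are made up for this note and the 32-bit rotate is only an example.

```cpp
#include <cassert>   // only for the usage checks described after the block
#include <cstdint>

// a + ~a is all ones, so a + b + 1 wraps to 0 exactly when b == ~a.
constexpr bool masks_are_complementary(std::uint64_t a, std::uint64_t b) {
  return a + b + 1 == 0;
}

constexpr std::uint32_t rotl32(std::uint32_t x, unsigned n) {
  n &= 31;
  return n ? (x << n) | (x >> (32 - n)) : x;
}

// Rotate SRC left by N, keep the bits selected by MASK, and take every other
// bit from DEST: the shape of IOR the rotate-and-insert patterns match.
constexpr std::uint32_t rotate_insert(std::uint32_t src, unsigned n,
                                      std::uint32_t mask, std::uint32_t dest) {
  return (rotl32(src, n) & mask) | (dest & ~mask);
}
```

For example, rotate_insert (0x12, 8, 0xFF00, 0xABCD00EF) yields 0xABCD12EF.  When the masks are not complementary the IOR cannot be a single insert, which is why the function falls through to the other pattern checks in that case.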

diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index 5025e3c30c0..78c33cc8cba 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -21118,6 +21118,91 @@ rs6000_cannot_copy_insn_p (rtx_insn *insn)
 && get_attr_cannot_copy (insn);
 }
 
+/* Handle rtx_costs for scalar integer rotate and insert insns.  */
+
+static bool
+rotate_insert_cost (rtx left, rtx right, machine_mode mode, bool speed,
+   int *total)
+{
+  if (GET_CODE (right) == AND
+  && CONST_INT_P (XEXP (right, 1))
+  && UINTVAL (XEXP (left, 1)) + UINTVAL (XEXP (right, 1)) + 1 == 0)
+{
+  rtx leftop = XEXP (left, 0);
+  rtx rightop = XEXP (right, 0);
+
+  /* rotlsi3_insert_5.  */
+  if (REG_P (leftop)
+ && REG_P (rightop)
+ && mode == SImode
+ && UINTVAL (XEXP (left, 1)) != 0
+ && UINTVAL (XEXP (right, 1)) != 0
+ && rs6000_is_valid_mask (XEXP (left, 1), NULL, NULL, mode))
+   return true;
+  /* rotldi3_insert_6.  */
+  if (REG_P (leftop)
+ && REG_P (rightop)
+ && mode == DImode
+ && exact_log2 (-UINTVAL (XEXP (left, 1))) > 0)
+   return true;
+  /* rotldi3_insert_7.  */
+  if (REG_P (leftop)
+ && REG_P (rightop)
+ && mode == DImode
+ && exact_log2 (-UINTVAL (XEXP (right, 1))) > 0)
+   return true;
+
+  rtx mask = 0;
+  rtx shift = leftop;
+  rtx_code shift_code = GET_CODE (shift);
+  /* rotl3_insert.  */
+  if (shift_code == ROTATE
+ || shift_code == ASHIFT
+ || shift_code == LSHIFTRT)
+   mask = right;
+  else
+   {
+ shift = rightop;
+ shift_code = GET_CODE (shift);
+ /* rotl3_insert_2.  */
+ if (shift_code == ROTATE
+ || shift_code == ASHIFT
+ || shift_code == LSHIFTRT)
+   mask = left;
+   }
+  if (mask
+ && CONST_INT_P (XEXP (shift, 1))
+ && rs6000_is_valid_insert_mask (XEXP (mask, 1), shift, mode))
+   {
+ *total += rtx_cost (XEXP (shift, 0), mode, shift_code, 0, speed);
+ *total += rtx_cost (XEXP (mask, 0), mode, AND, 0, speed);
+ return true;
+   }
+}
+  /* rotl3_insert_3.  */
+  if (GET_CODE (right) == ASHIFT
+  && CONST_INT_P (XEXP (right, 1))
+  && (INTVAL (XEXP (right, 1))
+ == exact_log2 (UINTVAL (XEXP (left, 1)) + 1)))
+{
+  *total += rtx_cost (XEXP (left, 0), mode, AND, 0, speed);
+  *total += rtx_cost (XEXP (right, 0), mode, ASHIFT, 0, speed);
+  return true;
+}
+  /* rotl3_insert_4.  */
+  if (GET_CODE (right) == LSHIFTRT
+  && CONST_INT_P (XEXP (right, 1))
+  && mode == SImode
+  && (INTVAL (XEXP (right, 1))
+ + exact_log2 (-UINTVAL (XEXP (left, 1)))) == 32)
+{
+  *total += rtx_cost (XEXP (left, 0), mode, AND, 0, speed);
+  *total += rtx_cost (XEXP (right, 0), mode, LSHIFTRT, 0, speed);
+  return true;
+}
+  return false;
+}
+
 /* Compute a (partial) cost for rtx X.  Return true if the complete
cost has been computed, and false if subexpressions should be
scanned.  In either case, *TOTAL contains the cost result.
@@ -21165,6 +21250,7 @@ static bool
 rs6000_rtx_costs (rtx x, machine_mode mode, int outer_code,
  int opno ATTRIBUTE_UNUSED, int *total, bool speed)
 {
+  rtx left, right;
   int code = GET_CODE (x);
 
   switch (code)
@@ -21295,7 +21381,7 @@ rs6000_rtx_costs (rtx x, machine_mode mode, int outer_code,
*total = rs6000_cost->fp;
   else
*total = rs6000_cost->dmul;
-  break;
+  return false;
 
 case DIV:
 case MOD:
@@ -21355,32 +21441,37 @@ rs6000_rtx_costs (rtx x, machine_mode mode, int outer_code,
   return false;
 
 case AND:
-  if (CONST_INT_P (XEXP (x, 1)))
+  *total = COSTS_N_INSNS (1);
+  right = XEXP (x, 1);
+  if (CONST_INT
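
For readers following the rotl3_insert_4 test in the helper above, here is a
small standalone sketch of the arithmetic.  It is not part of the patch:
exact_log2_sketch, matches_insert_4 and insert_4 are hypothetical names, and
the sketch assumes the SImode AND constant is held sign-extended in a 64-bit
value, as CONST_INTs are in GCC.

```c
#include <stdint.h>

/* Hypothetical stand-in for GCC's exact_log2: returns log2 (X) when X is
   a power of two, and -1 otherwise.  */
static int
exact_log2_sketch (uint64_t x)
{
  if (x == 0 || (x & (x - 1)) != 0)
    return -1;
  int n = 0;
  while ((x >>= 1) != 0)
    n++;
  return n;
}

/* Mirrors the rotl3_insert_4 condition.  MASK is the AND constant,
   sign-extended from 32 bits; SHIFT is the LSHIFTRT count.  */
static int
matches_insert_4 (int64_t mask, int shift)
{
  return shift + exact_log2_sketch (-(uint64_t) mask) == 32;
}

/* The value the matched (ior (and a mask) (lshiftrt b shift)) computes.  */
static uint32_t
insert_4 (uint32_t a, uint32_t b, uint32_t mask, int shift)
{
  return (a & mask) | (b >> shift);
}
```

With a mask of 0xffffff00 (held as -256) and a shift count of 24, the negated
mask is 256, its log2 is 8, and 24 + 8 == 32.  The AND then keeps the high 24
bits of one operand and the shifted operand fills the low 8, so the two halves
never overlap.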

[PATCH 3/5] Add TI to TD (128-bit DFP) and TD to TI support

2020-09-21 Thread Carl Love via Gcc-patches
Segher, Will:

Add support for converting between 128-bit integers and the 128-bit
decimal floating point format.

The updates from the previous version of the patch:

Removed stray ";; carll" comment.  

Removed #if 1 and #endif in the test case.
 
Replaced TARGET_TI_VECTOR_OPS with POWER10.

The patch has been tested on

  powerpc64le-unknown-linux-gnu (Power 9 LE)

with no regression errors.

The P10 test was run by hand on Mambo.


 Carl Love


---

gcc/ChangeLog

2020-09-21  Carl Love  
* config/rs6000/dfp.md (floattitd2, fixtdti2): New define_insns.

gcc/testsuite/ChangeLog

2020-09-21  Carl Love  
* gcc.target/powerpc/int_128bit-runnable.c:  Update test.
---
 gcc/config/rs6000/dfp.md  | 14 +
 gcc/config/rs6000/rs6000-call.c   |  4 ++
 .../gcc.target/powerpc/int_128bit-runnable.c  | 62 +++
 3 files changed, 80 insertions(+)

diff --git a/gcc/config/rs6000/dfp.md b/gcc/config/rs6000/dfp.md
index 8f822732bac..0e82e315fee 100644
--- a/gcc/config/rs6000/dfp.md
+++ b/gcc/config/rs6000/dfp.md
@@ -222,6 +222,13 @@
   "dcffixq %0,%1"
   [(set_attr "type" "dfp")])
 
+(define_insn "floattitd2"
+  [(set (match_operand:TD 0 "gpc_reg_operand" "=d")
+   (float:TD (match_operand:TI 1 "gpc_reg_operand" "v")))]
+  "TARGET_POWER10"
+  "dcffixqq %0,%1"
+  [(set_attr "type" "dfp")])
+
 ;; Convert a decimal64/128 to a decimal64/128 whose value is an integer.
 ;; This is the first stage of converting it to an integer type.
 
@@ -241,6 +248,13 @@
   "TARGET_DFP"
   "dctfix %0,%1"
   [(set_attr "type" "dfp")])
+
+(define_insn "fixtdti2"
+  [(set (match_operand:TI 0 "gpc_reg_operand" "=v")
+   (fix:TI (match_operand:TD 1 "gpc_reg_operand" "d")))]
+  "TARGET_POWER10"
+  "dctfixqq %0,%1"
+  [(set_attr "type" "dfp")])
 
 ;; Decimal builtin support
 
diff --git a/gcc/config/rs6000/rs6000-call.c b/gcc/config/rs6000/rs6000-call.c
index e1d9c2e8729..9c50cd3c5a7 100644
--- a/gcc/config/rs6000/rs6000-call.c
+++ b/gcc/config/rs6000/rs6000-call.c
@@ -4967,6 +4967,8 @@ const struct altivec_builtin_types altivec_overloaded_builtins[] = {
 RS6000_BTI_bool_V2DI, 0 },
   { P9V_BUILTIN_VEC_VCMPNE_P, P10V_BUILTIN_VCMPNET_P,
 RS6000_BTI_INTSI, RS6000_BTI_V1TI, RS6000_BTI_V1TI, 0 },
+  { P9V_BUILTIN_VEC_VCMPNE_P, P10V_BUILTIN_VCMPNET_P,
+RS6000_BTI_INTSI, RS6000_BTI_unsigned_V1TI, RS6000_BTI_unsigned_V1TI, 0 },
 
   { P9V_BUILTIN_VEC_VCMPNE_P, P9V_BUILTIN_VCMPNEFP_P,
 RS6000_BTI_INTSI, RS6000_BTI_V4SF, RS6000_BTI_V4SF, 0 },
@@ -5074,6 +5076,8 @@ const struct altivec_builtin_types altivec_overloaded_builtins[] = {
 RS6000_BTI_bool_V2DI, 0 },
   { P9V_BUILTIN_VEC_VCMPAE_P, P10V_BUILTIN_VCMPAET_P,
 RS6000_BTI_INTSI, RS6000_BTI_V1TI, RS6000_BTI_V1TI, 0 },
+  { P9V_BUILTIN_VEC_VCMPAE_P, P10V_BUILTIN_VCMPAET_P,
+RS6000_BTI_INTSI, RS6000_BTI_unsigned_V1TI, RS6000_BTI_unsigned_V1TI, 0 },
   { P9V_BUILTIN_VEC_VCMPAE_P, P9V_BUILTIN_VCMPAEFP_P,
 RS6000_BTI_INTSI, RS6000_BTI_V4SF, RS6000_BTI_V4SF, 0 },
   { P9V_BUILTIN_VEC_VCMPAE_P, P9V_BUILTIN_VCMPAEDP_P,
diff --git a/gcc/testsuite/gcc.target/powerpc/int_128bit-runnable.c b/gcc/testsuite/gcc.target/powerpc/int_128bit-runnable.c
index 85ad544e22b..ec3dcf3dff1 100644
--- a/gcc/testsuite/gcc.target/powerpc/int_128bit-runnable.c
+++ b/gcc/testsuite/gcc.target/powerpc/int_128bit-runnable.c
@@ -38,6 +38,7 @@
 #if DEBUG
 #include 
 #include 
+#include 
 
 
 void print_i128(__int128_t val)
@@ -59,6 +60,13 @@ int main ()
   __int128_t arg1, result;
   __uint128_t uarg2;
 
+  _Decimal128 arg1_dfp128, result_dfp128, expected_result_dfp128;
+
+  struct conv_t {
+__uint128_t u128;
+_Decimal128 d128;
+  } conv, conv2;
+
   vector signed long long int vec_arg1_di, vec_arg2_di;
   vector unsigned long long int vec_uarg1_di, vec_uarg2_di, vec_uarg3_di;
   vector unsigned long long int vec_uresult_di;
@@ -2249,6 +2257,60 @@ int main ()
 abort();
 #endif
   }
+  
+  /* DFP to __int128 and __int128 to DFP conversions */
+  /* Can't get printing of DFP values to work.  Print the DFP value as an
+ unsigned int so we can see the bit patterns.  */
+  conv.u128 = 0x2208ULL;
+  conv.u128 = (conv.u128 << 64) | 0x4ULL;   //DFP bit pattern for integer 4
+  expected_result_dfp128 = conv.d128;
 
+  arg1 = 4;
+
+  conv.d128 = (_Decimal128) arg1;
+
+  result_dfp128 = (_Decimal128) arg1;
+  if (((conv.u128 >>64) != 0x2208ULL) &&
+  ((conv.u128 & 0x) != 0x4ULL)) {
+#if DEBUG
+printf("ERROR:  convert int128 value ");
+print_i128 (arg1);
+conv.d128 = result_dfp128;
+printf("\nto DFP value 0x%llx %llx (printed as hex bit string) ",
+  (unsigned long long)((conv.u128) >>64),
+  (unsigned long long)((conv.u128) & 0x));
+
+conv.d128 = expected_result_dfp128;
+printf("\ndoes not match expected_result = 0x%llx %llx\n\n",
+  (unsigned long long)
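
As a cross-check on the "DFP bit pattern for integer 4" constant used in the
test above, the sketch below decodes the two 64-bit halves by hand.  It
assumes the IEEE 754-2008 densely packed decimal (DPD) layout for decimal128
(1 sign bit, 5-bit combination field, 12-bit exponent continuation, bias
6176).  The helper names are hypothetical and belong neither to the patch nor
to any library, and only the simple cases noted in the comments are handled.

```c
#include <assert.h>
#include <stdint.h>

/* Unbiased exponent of a DPD-encoded decimal128, given its upper 64 bits.
   Only valid when the combination field does not start with binary 11,
   i.e. for a finite value whose leading coefficient digit is 0..7.  */
static int
dpd128_exponent (uint64_t hi)
{
  unsigned comb = (unsigned) (hi >> 58) & 0x1f;   /* combination field */
  unsigned cont = (unsigned) (hi >> 46) & 0xfff;  /* exponent continuation */
  assert ((comb >> 3) != 3);
  /* The top two combination bits are the top two exponent bits.  */
  return (int) (((comb >> 3) << 12) | cont) - 6176;
}

/* Decimal value of the lowest declet of the coefficient, given the lower
   64 bits.  Only valid when all three digits are 0..7, which DPD marks
   by a zero in bit 3 of the declet.  */
static unsigned
dpd_low_declet_small (uint64_t lo)
{
  unsigned declet = (unsigned) (lo & 0x3ff);
  assert ((declet & 0x8) == 0);
  return ((declet >> 7) & 7) * 100 + ((declet >> 4) & 7) * 10 + (declet & 7);
}
```

For the halves 0x2208000000000000 and 0x4 this gives an exponent of 0 and a
low declet of 4, which is consistent with the value 4 x 10^0.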

[PATCH 1/5] RS6000 Add 128-bit Binary Integer sign extend operations

2020-09-21 Thread Carl Love via Gcc-patches
Segher, Will:

Patch 1 adds support for the 128-bit sign extension instructions and
the corresponding builtins.
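
For anyone unfamiliar with these operations, here is a portable scalar model
of what the byte-to-word and word-to-doubleword forms are expected to compute
per element.  This is only a sketch of my reading of the ISA 3.0 description
(the low-order part of each element is sign-extended to the full element
width).  It does not use the new builtins, and the function names are
hypothetical.

```c
#include <stdint.h>

/* Sign-extend the low-order byte of one 32-bit element.  The xor/subtract
   form avoids relying on implementation-defined narrowing casts.  */
static int32_t
signext_byte_to_word (uint32_t lane)
{
  return (int32_t) ((lane & 0xff) ^ 0x80) - 0x80;
}

/* Sign-extend the low-order 32 bits of one 64-bit element.  */
static int64_t
signext_word_to_dword (uint64_t lane)
{
  return (int64_t) ((lane & 0xffffffffu) ^ 0x80000000u) - 0x80000000LL;
}

/* Apply the byte-to-word form to all four word elements of a vector.  */
static void
model_signext_b2w (const uint32_t in[4], int32_t out[4])
{
  for (int i = 0; i < 4; i++)
    out[i] = signext_byte_to_word (in[i]);
}
```

The upper bits of each input element are ignored, so an element holding
0x123456f0 and one holding 0x000000f0 both produce -16.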

No changes from the previous version.

The patch has been tested on 

  powerpc64le-unknown-linux-gnu (Power 9 LE)

with no regression errors.

Fixed the issues in the ChangeLog noted by Will.

 Carl Love

---

gcc/ChangeLog

2020-09-21  Carl Love  
* config/rs6000/altivec.h (vec_signextll, vec_signexti): Add define
for new builtins.
* config/rs6000/rs6000-builtin.def (VSIGNEXTI, VSIGNEXTLL):  Add
overloaded builtin definitions.
(VSIGNEXTSB2W, VSIGNEXTSB2D, VSIGNEXTSH2D, VSIGNEXTSW2D): Add builtin
expansions.
* config/rs6000/rs6000-call.c (P9V_BUILTIN_VEC_VSIGNEXTI,
P9V_BUILTIN_VEC_VSIGNEXTLL): Add overloaded argument definitions.
* config/rs6000/vsx.md: Make define_insn vsx_sign_extend_si_v2di
visible.
* doc/extend.texi:  Add documentation for the vec_signexti and
vec_signextll builtins.

gcc/testsuite/ChangeLog

2020-09-21  Carl Love  
* gcc.target/powerpc/p9-sign_extend-runnable.c:  New test case.
---
 gcc/config/rs6000/altivec.h   |   3 +
 gcc/config/rs6000/rs6000-builtin.def  |   9 ++
 gcc/config/rs6000/rs6000-call.c   |  13 ++
 gcc/config/rs6000/vsx.md  |   2 +-
 gcc/doc/extend.texi   |  15 ++
 .../powerpc/p9-sign_extend-runnable.c | 128 ++
 6 files changed, 169 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.target/powerpc/p9-sign_extend-runnable.c

diff --git a/gcc/config/rs6000/altivec.h b/gcc/config/rs6000/altivec.h
index 8a2dcda0144..acc365612be 100644
--- a/gcc/config/rs6000/altivec.h
+++ b/gcc/config/rs6000/altivec.h
@@ -494,6 +494,9 @@
 
 #define vec_xlx __builtin_vec_vextulx
 #define vec_xrx __builtin_vec_vexturx
+#define vec_signexti  __builtin_vec_vsignexti
+#define vec_signextll __builtin_vec_vsignextll
+
 #endif
 
 /* Predicates.
diff --git a/gcc/config/rs6000/rs6000-builtin.def b/gcc/config/rs6000/rs6000-builtin.def
index e91a48ddf5f..4c2e9460949 100644
--- a/gcc/config/rs6000/rs6000-builtin.def
+++ b/gcc/config/rs6000/rs6000-builtin.def
@@ -2715,6 +2715,8 @@ BU_P9V_OVERLOAD_1 (VPRTYBD,   "vprtybd")
 BU_P9V_OVERLOAD_1 (VPRTYBQ,"vprtybq")
 BU_P9V_OVERLOAD_1 (VPRTYBW,"vprtybw")
 BU_P9V_OVERLOAD_1 (VPARITY_LSBB,   "vparity_lsbb")
+BU_P9V_OVERLOAD_1 (VSIGNEXTI,  "vsignexti")
+BU_P9V_OVERLOAD_1 (VSIGNEXTLL, "vsignextll")
 
 /* 2 argument functions added in ISA 3.0 (power9).  */
 BU_P9_2 (CMPRB,"byte_in_range",CONST,  cmprb)
@@ -2726,6 +2728,13 @@ BU_P9_OVERLOAD_2 (CMPRB, "byte_in_range")
 BU_P9_OVERLOAD_2 (CMPRB2,  "byte_in_either_range")
 BU_P9_OVERLOAD_2 (CMPEQB,  "byte_in_set")
 
+/* Sign extend builtins that work on ISA 3.0, but not defined until ISA 3.1.  */
+BU_P9V_AV_1 (VSIGNEXTSB2W, "vsignextsb2w", CONST,  vsx_sign_extend_qi_v4si)
+BU_P9V_AV_1 (VSIGNEXTSH2W, "vsignextsh2w", CONST,  vsx_sign_extend_hi_v4si)
+BU_P9V_AV_1 (VSIGNEXTSB2D, "vsignextsb2d", CONST,  vsx_sign_extend_qi_v2di)
+BU_P9V_AV_1 (VSIGNEXTSH2D, "vsignextsh2d", CONST,  vsx_sign_extend_hi_v2di)
+BU_P9V_AV_1 (VSIGNEXTSW2D, "vsignextsw2d", CONST,  vsx_sign_extend_si_v2di)
+
 /* Builtins for scalar instructions added in ISA 3.1 (power10).  */
 BU_P10_MISC_2 (CFUGED, "cfuged", CONST, cfuged)
 BU_P10_MISC_2 (CNTLZDM, "cntlzdm", CONST, cntlzdm)
diff --git a/gcc/config/rs6000/rs6000-call.c b/gcc/config/rs6000/rs6000-call.c
index a8b520834c7..9e514a01012 100644
--- a/gcc/config/rs6000/rs6000-call.c
+++ b/gcc/config/rs6000/rs6000-call.c
@@ -5527,6 +5527,19 @@ const struct altivec_builtin_types altivec_overloaded_builtins[] = {
 RS6000_BTI_unsigned_V2DI, RS6000_BTI_unsigned_V2DI,
 RS6000_BTI_INTSI, RS6000_BTI_INTSI },
 
+  /* Sign extend builtins that work on ISA 3.0, not added until ISA 3.1 */
+  { P9V_BUILTIN_VEC_VSIGNEXTI, P9V_BUILTIN_VSIGNEXTSB2W,
+RS6000_BTI_V4SI, RS6000_BTI_V16QI, 0, 0 },
+  { P9V_BUILTIN_VEC_VSIGNEXTI, P9V_BUILTIN_VSIGNEXTSH2W,
+RS6000_BTI_V4SI, RS6000_BTI_V8HI, 0, 0 },
+
+  { P9V_BUILTIN_VEC_VSIGNEXTLL, P9V_BUILTIN_VSIGNEXTSB2D,
+RS6000_BTI_V2DI, RS6000_BTI_V16QI, 0, 0 },
+  { P9V_BUILTIN_VEC_VSIGNEXTLL, P9V_BUILTIN_VSIGNEXTSH2D,
+RS6000_BTI_V2DI, RS6000_BTI_V8HI, 0, 0 },
+  { P9V_BUILTIN_VEC_VSIGNEXTLL, P9V_BUILTIN_VSIGNEXTSW2D,
+RS6000_BTI_V2DI, RS6000_BTI_V4SI, 0, 0 },
+
   /* Overloaded built-in functions for ISA3.1 (power10). */
   { P10_BUILTIN_VEC_CLRL, P10V_BUILTIN_VCLRLB,
 RS6000_BTI_V16QI, RS6000_BTI_V16QI, RS6000_BTI_UINTSI, 0 },
diff --git a/gcc/config/rs6000/vsx.md b/gcc/config/rs6000/vsx.md
index 4ff52455fd3..31fcffe8f33 100644
--- a/gcc/config/rs6000/vsx.md
+++ b/gcc/config/rs6000/vsx.md
@@ -4787,7 +4787,7 @@
   "vextsh2 %0,%1"
   [(set_attr "type" "vecexts")])
 
-
