Re: [PATCH] ipa-sra: Do not remove return values needed because of non-call EH (PR 98690)

2021-01-19 Thread Richard Biener via Gcc-patches
On Mon, Jan 18, 2021 at 8:27 PM Martin Jambor  wrote:
>
> Hi,
>
> IPA-SRA already contains a check to figure out that an otherwise dead
> parameter is actually required because of non-call exceptions, but it
> is not present at the equivalent spot where SRA figures out whether
> the return statement is used for anything useful.  This patch adds
> that condition there.
>
> Unfortunately, even though this patch should be good enough for any
> normal (I'd even say reasonable) use of the compiler, it hints that
> when the user manually switches all sorts of DCE, IPA-SRA would
> probably leave behind problematic statements manipulating what
> originally were return values, just like it does for parameters (PR
> 93385).  Fixing this properly might unfortunately be a separate issue
> from the mentioned bug because the LHS of a call is changed during
> call redirection and the caller often is not a clone.  But I'll see
> what I can do.
>
> Meanwhile, the patch below has been bootstrapped and tested on x86_64.
> OK for trunk and then for the gcc-10 branch?

OK.

Thanks,
Richard.

> Thanks,
>
> Martin
>
>
> gcc/ChangeLog:
>
> 2021-01-18  Martin Jambor  
>
> PR ipa/98690
> * ipa-sra.c (ssa_name_only_returned_p): New parameter fun.  Check
> whether non-call exceptions allow removal of a statement.
> (isra_analyze_call): Pass the appropriate function to
> ssa_name_only_returned_p.
>
> gcc/testsuite/ChangeLog:
>
> 2021-01-18  Martin Jambor  
>
> PR ipa/98690
> * g++.dg/ipa/pr98690.C: New test.
> ---
>  gcc/ipa-sra.c  | 20 +++-
>  gcc/testsuite/g++.dg/ipa/pr98690.C | 27 +++
>  2 files changed, 38 insertions(+), 9 deletions(-)
>  create mode 100644 gcc/testsuite/g++.dg/ipa/pr98690.C
>
> diff --git a/gcc/ipa-sra.c b/gcc/ipa-sra.c
> index 5d2c0dfce53..1571921cb48 100644
> --- a/gcc/ipa-sra.c
> +++ b/gcc/ipa-sra.c
> @@ -1952,13 +1952,13 @@ scan_function (cgraph_node *node, struct function *fun)
>  }
>  }
>
> -/* Return true if SSA_NAME NAME is only used in return statements, or if
> -   results of any operations it is involved in are only used in return
> -   statements.  ANALYZED is a bitmap that tracks which SSA names we have
> -   already started investigating.  */
> +/* Return true if SSA_NAME NAME of function described by FUN is only used in
> +   return statements, or if results of any operations it is involved in are
> +   only used in return statements.  ANALYZED is a bitmap that tracks which SSA
> +   names we have already started investigating.  */
>
>  static bool
> -ssa_name_only_returned_p (tree name, bitmap analyzed)
> +ssa_name_only_returned_p (function *fun, tree name, bitmap analyzed)
>  {
>bool res = true;
>imm_use_iterator imm_iter;
> @@ -1978,8 +1978,9 @@ ssa_name_only_returned_p (tree name, bitmap analyzed)
>   break;
> }
> }
> -  else if ((is_gimple_assign (stmt) && !gimple_has_volatile_ops (stmt))
> -  || gimple_code (stmt) == GIMPLE_PHI)
> +  else if (!stmt_unremovable_because_of_non_call_eh_p (fun, stmt)
> +  && ((is_gimple_assign (stmt) && !gimple_has_volatile_ops (stmt))
> +  || gimple_code (stmt) == GIMPLE_PHI))
> {
>   /* TODO: And perhaps for const function calls too?  */
>   tree lhs;
> @@ -1995,7 +1996,7 @@ ssa_name_only_returned_p (tree name, bitmap analyzed)
> }
>   gcc_assert (!gimple_vdef (stmt));
>   if (bitmap_set_bit (analyzed, SSA_NAME_VERSION (lhs))
> - && !ssa_name_only_returned_p (lhs, analyzed))
> + && !ssa_name_only_returned_p (fun, lhs, analyzed))
> {
>   res = false;
>   break;
> @@ -2049,7 +2050,8 @@ isra_analyze_call (cgraph_edge *cs)
>if (TREE_CODE (lhs) == SSA_NAME)
> {
>   bitmap analyzed = BITMAP_ALLOC (NULL);
> - if (ssa_name_only_returned_p (lhs, analyzed))
> + if (ssa_name_only_returned_p (DECL_STRUCT_FUNCTION (cs->caller->decl),
> +   lhs, analyzed))
> csum->m_return_returned = true;
>   BITMAP_FREE (analyzed);
> }
> diff --git a/gcc/testsuite/g++.dg/ipa/pr98690.C b/gcc/testsuite/g++.dg/ipa/pr98690.C
> new file mode 100644
> index 000..004418e5b40
> --- /dev/null
> +++ b/gcc/testsuite/g++.dg/ipa/pr98690.C
> @@ -0,0 +1,27 @@
> +/* { dg-do run } */
> +/* { dg-options "-O2 -fnon-call-exceptions" } */
> +
> +int g;
> +volatile int v;
> +
> +static int * __attribute__((noinline))
> +almost_useless_return (void)
> +{
> +  v = 1;
> +  return &g;
> +}
> +
> +static void __attribute__((noinline))
> +foo (void)
> +{
> +  int *p = almost_useless_return ();
> +  int i = *p;
> +  v = 2;
> +}
> +
> +int
> +main (int argc, char *argv[])
> +{
> +  foo ();
> +  return 0;
> +}
> --
> 2.29.2
>


Re: [PATCH] fwprop: Allow (subreg (mem)) simplifications

2021-01-19 Thread Richard Biener via Gcc-patches
On Mon, Jan 18, 2021 at 11:04 PM Ilya Leoshkevich via Gcc-patches
 wrote:
>
> Bootstrapped and regtested on x86_64-redhat-linux, ppc64le-redhat-linux
> and s390x-redhat-linux.  I realize it might be too late for a change
> like this, but it's desirable to have this in conjunction with the
> https://gcc.gnu.org/pipermail/gcc-patches/2021-January/563799.html s390
> regression fix, which otherwise produces unnecessary store/load
> sequences in certain glibc routines, e.g. __ieee754_sqrtl.  Ok for
> master?
>
>
>
> Suppose we have:
>
> (set (reg/v:TF 63) (mem/c:TF (reg/v:DI 62)))
> (set (reg:FPRX2 66) (subreg:FPRX2 (reg/v:TF 63) 0))
>
> It is clearly profitable to propagate the first insn into the second
> one and get:
>
> (set (reg:FPRX2 66) (mem/c:FPRX2 (reg/v:DI 62)))
>
> fwprop actually manages to perform this, but doesn't think the result is
> worth it, which results in unnecessary store/load sequences on s390.
> Improve the situation by classifying SUBREG -> MEM changes as
> profitable.

IIRC fwprop also propagates into multiple uses and replacing a non-MEM
with a MEM is only good when the original MEM goes away - is that properly
dealt with here?

Richard.

> gcc/ChangeLog:
>
> 2021-01-15  Ilya Leoshkevich  
>
> * fwprop.c (fwprop_propagation::classify_result): Allow
> (subreg (mem)) simplifications.
>
> gcc/testsuite/ChangeLog:
>
> 2021-01-15  Ilya Leoshkevich  
>
> * gcc.target/s390/vector/long-double-to-i64.c: Expect that
> float-vector moves do *not* happen.
> ---
>  gcc/fwprop.c  | 5 +
>  gcc/testsuite/gcc.target/s390/vector/long-double-to-i64.c | 3 +--
>  2 files changed, 6 insertions(+), 2 deletions(-)
>
> diff --git a/gcc/fwprop.c b/gcc/fwprop.c
> index eff8f7cc141..46b8ec7eccf 100644
> --- a/gcc/fwprop.c
> +++ b/gcc/fwprop.c
> @@ -262,6 +262,11 @@ fwprop_propagation::classify_result (rtx old_rtx, rtx new_rtx)
>&& GET_MODE (new_rtx) == GET_MODE_INNER (GET_MODE (from)))
>  return PROFITABLE;
>
> +  /* Allow (subreg (mem)) -> (mem) simplifications.  However, do not allow
> + creating new (mem/v)s, since DCE will not remove the old ones.  */
> +  if (SUBREG_P (old_rtx) && MEM_P (new_rtx) && !MEM_VOLATILE_P (new_rtx))
> +return PROFITABLE;
> +
>return 0;
>  }
>
> diff --git a/gcc/testsuite/gcc.target/s390/vector/long-double-to-i64.c b/gcc/testsuite/gcc.target/s390/vector/long-double-to-i64.c
> index 2dbbb5d1c03..8f4e377ed72 100644
> --- a/gcc/testsuite/gcc.target/s390/vector/long-double-to-i64.c
> +++ b/gcc/testsuite/gcc.target/s390/vector/long-double-to-i64.c
> @@ -10,8 +10,7 @@ long_double_to_i64 (long double x)
>return x;
>  }
>
> -/* { dg-final { scan-assembler-times {\n\tvpdi\t%v\d+,%v\d+,%v\d+,1\n} 1 } } */
> -/* { dg-final { scan-assembler-times {\n\tvpdi\t%v\d+,%v\d+,%v\d+,5\n} 1 } } */
> +/* { dg-final { scan-assembler-not {\n\tvpdi\t} } } */
>  /* { dg-final { scan-assembler-times {\n\tcgxbr\t} 1 } } */
>
>  int
> --
> 2.26.2
>


Re: [PATCH, v2, OpenMP 5.0, libgomp] Structure element mapping for OpenMP 5.0

2021-01-19 Thread Chung-Lin Tang




On 2021/1/16 5:45 PM, Jakub Jelinek wrote:

> > +/* Unified reference count for structure element siblings, this is used
> > +   when REFCOUNT_STRUCTELEM_FIRST_P(k->refcount) == true, the first sibling
> > +   in a structure element sibling list item sequence.  */
> > +uintptr_t structelem_refcount;
> > +
> > +/* When REFCOUNT_STRUCTELEM_P (k->refcount) == true, this field points
>
> REFCOUNT_STRUCTELEM_P (k->refcount) is true even for
> REFCOUNT_STRUCTELEM_FIRST_P(k->refcount), so shouldn't the description say
> that structelem_refcount_ptr is only used if
> REFCOUNT_STRUCTELEM_P (k->refcount) && !REFCOUNT_STRUCTELEM_FIRST_P (k->refcount)
> ?

Sure, I'll revise the comments a bit.

> > +   into the (above) structelem_refcount field of the _FIRST splay_tree_key,
> > +   the first key in the created sequence. All structure element siblings
> > +   share a single refcount in this manner. Since these two fields won't be
> > +   used at the same time, they are stashed in a union.  */
> > +uintptr_t *structelem_refcount_ptr;
> > +  };
> > struct splay_tree_aux *aux;
> >   };
> >
> >   /* The comparison function.  */
>
> Anyway, most of the patch looks good, but I'd like to understand the
> rationale for choosing a htab over what I've been trying to suggest, which
> was essentially instead of incrementing or decrementing refcounts push them
> into a vector for later incrementing/decrementing, then qsort the vector
> (by the pointers to refcounts) and increment what the elements point to unless
> the same address has been incremented/decremented already.
>
> Jakub

Essentially the requirement is to increment/decrement a refcount only once per
construct, so using a pointer-set (implemented by htab_t here) to track the
processing status seemed to be more intuitive in code, and probably faster than
sorting a vector I think (at least in most cases).

Chung-Lin


Re: [committed] Skip asm goto test fails on hppa

2021-01-19 Thread Andreas Schwab
On Jan 18 2021, Hans-Peter Nilsson wrote:

> On Mon, 18 Jan 2021, John David Anglin wrote:
>> The hppa target is a reload target and asm goto is not supported on reload targets.
>> Skip failing tests on hppa.
>
> IIUC the preferred term is "IRA target" or maybe "non-LRA
> target", as opposed to "LRA target".  The tests fail for
> cris-elf too, another IRA target, so I'd like to use that term
> when adjusting the dg-skip-if, hope you don't mind.

Perhaps there should be a target-supports test that rejects all current
non-lra targets.

Andreas.

-- 
Andreas Schwab, sch...@linux-m68k.org
GPG Key fingerprint = 7578 EB47 D4E5 4D69 2510  2552 DF73 E780 A9DA AEC1
"And now for something completely different."


Re: [PATCH, v2, OpenMP 5.0, libgomp] Structure element mapping for OpenMP 5.0

2021-01-19 Thread Jakub Jelinek via Gcc-patches
On Tue, Jan 19, 2021 at 04:46:36PM +0800, Chung-Lin Tang wrote:
> > > +   into the (above) structelem_refcount field of the _FIRST splay_tree_key,
> > > +   the first key in the created sequence. All structure element siblings
> > > +   share a single refcount in this manner. Since these two fields won't be
> > > +   used at the same time, they are stashed in a union.  */
> > > +uintptr_t *structelem_refcount_ptr;
> > > +  };
> > > struct splay_tree_aux *aux;
> > >   };
> > >   /* The comparison function.  */
> > 
> > Anyway, most of the patch looks good, but I'd like to understand the
> > rationale for choosing a htab over what I've been trying to suggest, which
> > was essentially instead of incrementing or decrementing refcounts push them
> > into a vector for later incrementing/decrementing, then qsort the vector
> > (by the pointers to refcounts) and increment what the elements point to unless
> > the same address has been incremented/decremented already.
> > 
> > Jakub
> 
> Essentially the requirement is to increment/decrement a refcount only once per
> construct, so using a pointer-set (implemented by htab_t here) to track the
> processing status seemed to be more intuitive in code, and probably faster than
> sorting a vector I think (at least in most cases).

I agree about the more intuitive, but think it will be actually slower, and
performance is what we care about most here, the mapping is already too
slow.
The common case is only a few mappings and no repeated mappings (e.g. the
compiler ought to help there and just remove mappings that are provably
duplicate if possible).  E.g. with one mapping, no qsort is needed at all,
and generally should be O(n log n).  The hash set needs larger memory
allocation than the vector and needs it cleared, plus it is a hash table
without chains, so there is some cost on collisions and if ever the hash
table needs to be expanded.  But I'll be happy to be proven wrong.

Jakub
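
For readers following the htab-versus-vector question, the deferred approach described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not code from the patch or from libgomp: the helper name increment_each_once and the plain array standing in for the vector are hypothetical.

```c
#include <stdint.h>
#include <stdlib.h>

/* qsort comparator: order queued refcount pointers by address.  The
   addresses are compared as integers so that pointers into unrelated
   objects can be ordered safely.  */
static int
cmp_refcount_ptr (const void *a, const void *b)
{
  uintptr_t pa = (uintptr_t) *(uintptr_t *const *) a;
  uintptr_t pb = (uintptr_t) *(uintptr_t *const *) b;
  return (pa > pb) - (pa < pb);
}

/* Hypothetical helper: bump every distinct refcount queued in PTRS[0..N)
   exactly once, even if the same address was queued several times.
   PTRS is sorted in place.  With N <= 1 no sort is needed at all.  */
static void
increment_each_once (uintptr_t **ptrs, size_t n)
{
  if (n > 1)
    qsort (ptrs, n, sizeof (ptrs[0]), cmp_refcount_ptr);
  for (size_t i = 0; i < n; i++)
    /* After sorting, duplicates are adjacent; skip all but the first.  */
    if (i == 0 || ptrs[i] != ptrs[i - 1])
      ++*ptrs[i];
}
```

The common single-mapping case takes the n <= 1 path and never calls qsort, which is the point made above about the expected cost.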



Re: [committed] Skip asm goto test fails on hppa

2021-01-19 Thread Jakub Jelinek via Gcc-patches
On Mon, Jan 18, 2021 at 11:50:56PM -0500, Hans-Peter Nilsson wrote:
> On Mon, 18 Jan 2021, John David Anglin wrote:
> > The hppa target is a reload target and asm goto is not supported on reload targets.
> > Skip failing tests on hppa.
> 
> IIUC the preferred term is "IRA target" or maybe "non-LRA
> target", as opposed to "LRA target".  The tests fail for
> cris-elf too, another IRA target, so I'd like to use that term
> when adjusting the dg-skip-if, hope you don't mind.
> 
> But also, I'd like to xfail it instead for cris-elf, which adds
> a caveat: people might then think a "reload target" is not the
> same as an "IRA target", what with the different adjustments.

I think "IRA target" is not the right term, all targets are IRA targets,
but only some are using LRA and others use the old reload.

Jakub



[PATCH] sparc,rtems: add __FIX_LEON3FT_TN0018 for affected targets

2021-01-19 Thread Daniel Hellstrom
Add a built-in define __FIX_LEON3FT_TN0018 for the LEON3FT targets affected
by the GRLIB-TN-0018 erratum described here:
  https://www.gaisler.com/notes


gcc/

* config/sparc/rtemself.h (TARGET_OS_CPP_BUILTINS): Add built-in define
__FIX_LEON3FT_TN0018.

---
 gcc/config/sparc/rtemself.h | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/gcc/config/sparc/rtemself.h b/gcc/config/sparc/rtemself.h
index 6570590..ddec98c 100644
--- a/gcc/config/sparc/rtemself.h
+++ b/gcc/config/sparc/rtemself.h
@@ -33,6 +33,8 @@
builtin_assert ("system=rtems");\
if (sparc_fix_b2bst)\
  builtin_define ("__FIX_LEON3FT_B2BST"); \
+   if (sparc_fix_gr712rc || sparc_fix_ut700 || sparc_fix_ut699) \
+ builtin_define ("__FIX_LEON3FT_TN0018"); \
 }  \
   while (0)
 
-- 
2.7.4
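
As a rough illustration of how code built for these targets might consume the new define, here is a minimal sketch. The function name tn0018_workaround_enabled is hypothetical; only the macro name comes from the patch, and per the hunk above it is defined when any of sparc_fix_gr712rc, sparc_fix_ut700 or sparc_fix_ut699 is set.

```c
/* Hypothetical helper: returns 1 when the compiler advertised the
   GRLIB-TN-0018 workaround through the built-in define, 0 otherwise.
   On any compiler or target that does not define the macro this simply
   reports 0.  */
static int
tn0018_workaround_enabled (void)
{
#ifdef __FIX_LEON3FT_TN0018
  return 1;
#else
  return 0;
#endif
}
```

In practice the check would more likely guard an assembly sequence directly with #ifdef, in the same way __FIX_LEON3FT_B2BST is presumably used.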



Fix PR ada/98740

2021-01-19 Thread Eric Botcazou
It's a long-standing GENERIC tree sharing issue.

Tested on x86_64-suse-linux, applied on the mainline, 10 and 9 branches.


2021-01-19  Eric Botcazou  

PR ada/98740
* gcc-interface/trans.c (add_decl_expr): Always mark TYPE_ADA_SIZE.

-- 
Eric Botcazou

diff --git a/gcc/ada/gcc-interface/trans.c b/gcc/ada/gcc-interface/trans.c
index 4ab26d3e2dd..6402c73ded0 100644
--- a/gcc/ada/gcc-interface/trans.c
+++ b/gcc/ada/gcc-interface/trans.c
@@ -8479,15 +8479,16 @@ add_decl_expr (tree gnu_decl, Node_Id gnat_node)
 	  MARK_VISITED (DECL_SIZE_UNIT (gnu_decl));
 	  MARK_VISITED (DECL_INITIAL (gnu_decl));
 	}
-  /* In any case, we have to deal with our own TYPE_ADA_SIZE field.  */
-  else if (TREE_CODE (gnu_decl) == TYPE_DECL
-	   && RECORD_OR_UNION_TYPE_P (type)
-	   && !TYPE_FAT_POINTER_P (type))
-	MARK_VISITED (TYPE_ADA_SIZE (type));
 }
   else
 add_stmt_with_node (gnu_stmt, gnat_node);
 
+  /* Mark our TYPE_ADA_SIZE field now since it will not be gimplified.  */
+  if (TREE_CODE (gnu_decl) == TYPE_DECL
+  && RECORD_OR_UNION_TYPE_P (type)
+  && !TYPE_FAT_POINTER_P (type))
+MARK_VISITED (TYPE_ADA_SIZE (type));
+
   /* If this is a variable and an initializer is attached to it, it must be
  valid for the context.  Similar to init_const in create_var_decl.  */
   if (TREE_CODE (gnu_decl) == VAR_DECL


Re: [PATCH] sparc, rtems: add __FIX_LEON3FT_TN0018 for affected targets

2021-01-19 Thread Eric Botcazou
> * config/sparc/rtemself.h (TARGET_OS_CPP_BUILTINS): Add built-in define
> __FIX_LEON3FT_TN0018.

OK for whichever branch(es) you deem appropriate.

-- 
Eric Botcazou




Re: [Patch] OpenMP/Fortran: Fixes for {use,is}_device_ptr [PR98476]

2021-01-19 Thread Jakub Jelinek via Gcc-patches
On Mon, Jan 18, 2021 at 05:56:21PM +0100, Tobias Burnus wrote:
> OpenMP/Fortran: Fixes for {use,is}_device_ptr
> 
> gcc/fortran/ChangeLog:
> 
>   PR fortran/98476
>   * openmp.c (resolve_omp_clauses): Change use_device_ptr
>   to use_device_addr for unless type(c_ptr); check all
>   list item for is_device_ptr.
> 
> gcc/ChangeLog:
> 
>   PR fortran/98476
>   * omp-low.c (lower_omp_target): Handle nonpointer is_device_ptr.
> 
> libgomp/ChangeLog:
> 
>   PR fortran/98476
>   * testsuite/libgomp.fortran/is_device_ptr-1.f90: New test.
> 
> gcc/testsuite/ChangeLog:
> 
>   PR fortran/98476
>   * gfortran.dg/gomp/map-3.f90: Update expected scan-dump-tree.
>   * gfortran.dg/gomp/is_device_ptr-1.f90: New test.
>   * gfortran.dg/gomp/use_device_ptr-2.f90: New test.

Ok, thanks.

Jakub



Re: [PATCH] sparc,rtems: add __FIX_LEON3FT_TN0018 for affected targets

2021-01-19 Thread Daniel Hellstrom
  


On 2021-01-19 10:52, Eric Botcazou wrote:

> > * config/sparc/rtemself.h (TARGET_OS_CPP_BUILTINS): Add built-in define
> > __FIX_LEON3FT_TN0018.
>
> OK for whichever branch(es) you deem appropriate.


GCC-10, GCC-11 and master.

Thanks,
Daniel



Re: [PATCH 1/4] Remove build dependence on HSA run-time

2021-01-19 Thread Martin Jambor
Hi Thomas,

On Thu, Jan 14 2021, Thomas Schwinge wrote:
> Hi!
>
> I'm raising here an issue with HSA libgomp plugin code changes from a
> while ago.  While HSA is now no longer relevant for GCC master branch,
> the same code has also been copied into the GCN libgomp plugin.
>
> This is commit b8d89b03db5f212919e4571671ebb4f5f8b1e19d (r242749) "Remove
> build dependence on HSA run-time":
>
> On 2016-11-22T14:27:44+0100, Martin Jambor  wrote:
>> --- a/libgomp/plugin/configfrag.ac
>> +++ b/libgomp/plugin/configfrag.ac
>
>> @@ -195,8 +183,8 @@ if test x"$enable_offload_targets" != x; then
>>   tgt_name=hsa
>>   PLUGIN_HSA=$tgt
>>   PLUGIN_HSA_CPPFLAGS=$HSA_RUNTIME_CPPFLAGS
>> - PLUGIN_HSA_LDFLAGS="$HSA_RUNTIME_LDFLAGS $HSA_KMT_LDFLAGS"
>> - PLUGIN_HSA_LIBS="-lhsa-runtime64 -lhsakmt"
>> + PLUGIN_HSA_LDFLAGS="$HSA_RUNTIME_LDFLAGS"
>> + PLUGIN_HSA_LIBS="-ldl"
>
> So this switched from directly linking against 'libhsa-runtime64.so' to a
> 'libdl'-based runtime linking variant.
>
> Previously, 'libhsa-runtime64.so' would've been found at run time via the
> standard search paths.
>
>> +if test "$HSA_RUNTIME_LIB" != ""; then
>> +  HSA_RUNTIME_LIB="$HSA_RUNTIME_LIB/"
>> +fi
>> +
>> +AC_DEFINE_UNQUOTED([HSA_RUNTIME_LIB], ["$HSA_RUNTIME_LIB"],
>> +  [Define path to HSA runtime.])
>
> That's new, to propagate '--with-hsa-runtime'/'--with-hsa-runtime-lib'
> into the HSA plugin source code.
>
>> --- a/libgomp/plugin/plugin-hsa.c
>> +++ b/libgomp/plugin/plugin-hsa.c
>
>> +static const char *hsa_runtime_lib;
>
>>  static void
>>  init_enviroment_variables (void)
>>  {
>
>> +  hsa_runtime_lib = secure_getenv ("HSA_RUNTIME_LIB");
>
> Unless overridden via the 'HSA_RUNTIME_LIB' environment variable...
>
>> +  if (hsa_runtime_lib == NULL)
>> +hsa_runtime_lib = HSA_RUNTIME_LIB "libhsa-runtime64.so";
>
> ... we now default to '[HSA_RUNTIME_LIB]/libhsa-runtime64.so' (note
> 'HSA_RUNTIME_LIB' prefix!)...
>
>> +static bool
>> +init_hsa_runtime_functions (void)
>> +{
>> +  void *handle = dlopen (hsa_runtime_lib, RTLD_LAZY);
>
> ..., which is then 'dlopen'ed here.
>
> That means, contrary to before, the GCC configure-time
> '--with-hsa-runtime' (by definition only valid for GCC configure/build as
> well as build-tree testing) leaks into the installed HSA libgomp plugin.
> That's a problem if your GCC build system (and build-tree testing)
> requires '--with-hsa-runtime' to specify a non-standard location (not in
> default search paths) but that location is not valid on your GCC
> deployment system (but it has leaked into the HSA libgomp plugin),
> meaning that (unless overridden via the 'HSA_RUNTIME_LIB' environment
> variable) 'libhsa-runtime64.so' is now no longer found via the standard
> search paths, because of the 'HSA_RUNTIME_LIB' prefix passed into
> 'dlopen'.
>
> Per my understanding this cannot be intentional, so I suggest to restore
> the previous behavior as per the attached "libgomp HSA/GCN plugins:
> don't

I honestly do not remember, it is quite possible.  I'm not quite sure
what you mean by "previous behavior" (the previous behavior was static
linking, no?) though.


> prepend the 'HSA_RUNTIME_LIB' path to 'libhsa-runtime64.so'".  OK to push
> such changes?  I was tempted to push "as obvious", but maybe I fail to
> see the rationale behind this change?
>
> For avoidance of doubt, this change doesn't affect (build-tree) testsuite
> usage, where we have:
>
> libgomp/testsuite/libgomp-test-support.exp.in:set hsa_runtime_lib "@HSA_RUNTIME_LIB@"
>
> libgomp/testsuite/lib/libgomp.exp:  append always_ld_library_path ":$hsa_runtime_lib"
>
> And, another data point:
>
> gcc/config/gcn/gcn-run.c:#define HSA_RUNTIME_LIB "libhsa-runtime64.so.1"
> [...]
> gcc/config/gcn/gcn-run.c:  void *handle = dlopen (HSA_RUNTIME_LIB, RTLD_LAZY);
>
> Here, 'libhsa-runtime64.so.1' is 'dlopen'ed without prefix, and thus
> found via the standard search paths (as expected).
>

Right.  From what I can tell at the moment, which is not much, the idea
was to be able to load it even from a non-standard path and specify that
path at configure time.  If people think that is not useful and is
actually harmful, I guess it can go.

Martin
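
To make the two behaviours under discussion concrete, here is a small sketch of how the string handed to dlopen is chosen. It is a simplified model, not the plugin's code: pick_runtime_lib and its prepend_prefix flag are hypothetical, and the directory used in the usage note is a made-up example. The relevant fact is that dlopen treats a name containing a slash as a path, while a bare file name is looked up through the dynamic linker's standard search paths.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper modelling the two policies.  ENV_OVERRIDE is the
   value of the HSA_RUNTIME_LIB environment variable (or NULL); PREFIX is
   the configure-time directory with a trailing '/' (or "" when none was
   given).  Writes the name that would be passed to dlopen into BUF and
   returns BUF.  */
static const char *
pick_runtime_lib (const char *env_override, const char *prefix,
                  int prepend_prefix, char *buf, size_t len)
{
  if (env_override != NULL)
    /* An explicit override always wins.  */
    snprintf (buf, len, "%s", env_override);
  else
    /* With the prefix the result contains a '/', so dlopen would use it
       as a path; without it the standard search paths apply.  */
    snprintf (buf, len, "%s%s", prepend_prefix ? prefix : "",
              "libhsa-runtime64.so");
  return buf;
}
```

With prepend_prefix set and a build-time prefix such as "/opt/hsa/lib/", the result is an absolute path that may not exist on the deployment system, which is the leak Thomas describes; with it cleared, the bare name matches what gcn-run.c does.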



[r11-6755 Regression] FAIL: libstdc++-prettyprinters/libfundts.cc print os on Linux/x86_64

2021-01-19 Thread sunil.k.pandey via Gcc-patches
On Linux/x86_64,

3804e937b0e252a7e42632fe6d9f898f1851a49c is the first bad commit
commit 3804e937b0e252a7e42632fe6d9f898f1851a49c
Author: Mark Wielaard 
Date:   Tue Sep 29 15:52:44 2020 +0200

Default to DWARF5

caused

FAIL: libstdc++-prettyprinters/80276.cc whatis p4
FAIL: libstdc++-prettyprinters/libfundts.cc print as
FAIL: libstdc++-prettyprinters/libfundts.cc print os

with GCC configured with



To reproduce:

$ cd {build_dir}/x86_64-linux/libstdc++-v3/testsuite && make check RUNTESTFLAGS="prettyprinters.exp=libstdc++-prettyprinters/80276.cc --target_board='unix{-m32}'"
$ cd {build_dir}/x86_64-linux/libstdc++-v3/testsuite && make check RUNTESTFLAGS="prettyprinters.exp=libstdc++-prettyprinters/80276.cc --target_board='unix{-m32\ -march=cascadelake}'"
$ cd {build_dir}/x86_64-linux/libstdc++-v3/testsuite && make check RUNTESTFLAGS="prettyprinters.exp=libstdc++-prettyprinters/80276.cc --target_board='unix{-m64}'"
$ cd {build_dir}/x86_64-linux/libstdc++-v3/testsuite && make check RUNTESTFLAGS="prettyprinters.exp=libstdc++-prettyprinters/80276.cc --target_board='unix{-m64\ -march=cascadelake}'"
$ cd {build_dir}/x86_64-linux/libstdc++-v3/testsuite && make check RUNTESTFLAGS="prettyprinters.exp=libstdc++-prettyprinters/libfundts.cc --target_board='unix{-m32}'"
$ cd {build_dir}/x86_64-linux/libstdc++-v3/testsuite && make check RUNTESTFLAGS="prettyprinters.exp=libstdc++-prettyprinters/libfundts.cc --target_board='unix{-m32\ -march=cascadelake}'"
$ cd {build_dir}/x86_64-linux/libstdc++-v3/testsuite && make check RUNTESTFLAGS="prettyprinters.exp=libstdc++-prettyprinters/libfundts.cc --target_board='unix{-m64}'"
$ cd {build_dir}/x86_64-linux/libstdc++-v3/testsuite && make check RUNTESTFLAGS="prettyprinters.exp=libstdc++-prettyprinters/libfundts.cc --target_board='unix{-m64\ -march=cascadelake}'"

(Please do not reply to this email.  For questions about this report, contact
me at skpgkp2 at gmail dot com.)


[PATCH] ipa/97673 - fix input_location leak

2021-01-19 Thread Richard Biener
This fixes input_location leaking with an invalid BLOCK from
expand_call_inline to tree_function_versioning via clone
materialization.

Bootstrapped and tested on x86_64-unknown-linux-gnu (with an
extra hunk calling verify_gimple from tree-function-versioning)

Pushed.

2021-01-19  Richard Biener  

PR ipa/97673
* tree-inline.c (tree_function_versioning): Set input_location
to UNKNOWN_LOCATION throughout the function.

* gfortran.dg/pr97673.f90: New testcase.
---
 gcc/testsuite/gfortran.dg/pr97673.f90 | 14 ++
 gcc/tree-inline.c |  7 +++
 2 files changed, 21 insertions(+)
 create mode 100644 gcc/testsuite/gfortran.dg/pr97673.f90

diff --git a/gcc/testsuite/gfortran.dg/pr97673.f90 b/gcc/testsuite/gfortran.dg/pr97673.f90
new file mode 100644
index 000..33b81435806
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/pr97673.f90
@@ -0,0 +1,14 @@
+! { dg-do compile }
+! { dg-options "-O3 -fno-early-inlining --param large-stack-frame=4000" }
+
+subroutine sub3noiso(a, b)
+  use iso_c_binding
+  implicit none
+  character(len=1,kind=c_char) :: a(*), b
+  character(len=1,kind=c_char):: x,z
+  integer(c_int) :: y
+  value :: b
+  print *, a(1:2), b
+entry sub3noisoEntry(x,y,z)
+  x = 'd'
+end subroutine sub3noiso
diff --git a/gcc/tree-inline.c b/gcc/tree-inline.c
index 84f71d9c6cc..3100b845f23 100644
--- a/gcc/tree-inline.c
+++ b/gcc/tree-inline.c
@@ -6215,6 +6215,12 @@ tree_function_versioning (tree old_decl, tree new_decl,
   auto_vec init_stmts;
   tree vars = NULL_TREE;
 
+  /* We can get called recursively from expand_call_inline via clone
+ materialization.  While expand_call_inline maintains input_location
+ we cannot tolerate it to leak into the materialized clone.  */
+  location_t saved_location = input_location;
+  input_location = UNKNOWN_LOCATION;
+
   gcc_assert (TREE_CODE (old_decl) == FUNCTION_DECL
  && TREE_CODE (new_decl) == FUNCTION_DECL);
   DECL_POSSIBLY_INLINED (old_decl) = 1;
@@ -6517,6 +6523,7 @@ tree_function_versioning (tree old_decl, tree new_decl,
 
   gcc_assert (!id.debug_stmts.exists ());
   pop_cfun ();
+  input_location = saved_location;
   return;
 }
 
-- 
2.26.2


Re: [committed] Skip asm goto test fails on hppa

2021-01-19 Thread Hans-Peter Nilsson



On Tue, 19 Jan 2021, Jakub Jelinek wrote:

> On Mon, Jan 18, 2021 at 11:50:56PM -0500, Hans-Peter Nilsson wrote:
> > On Mon, 18 Jan 2021, John David Anglin wrote:
> > > The hppa target is a reload target and asm goto is not supported on reload targets.
> > > Skip failing tests on hppa.
> >
> > IIUC the preferred term is "IRA target" or maybe "non-LRA
> > target", as opposed to "LRA target".  The tests fail for
> > cris-elf too, another IRA target, so I'd like to use that term
> > when adjusting the dg-skip-if, hope you don't mind.
> >
> > But also, I'd like to xfail it instead for cris-elf, which adds
> > a caveat: people might then think a "reload target" is not the
> > same as an "IRA target", what with the different adjustments.
>
> I think "IRA target" is not the right term, all targets are IRA targets,
> but only some are using LRA and others use the old reload.

But, IRA isn't used when...  Whatever; LRA and non-LRA then.
I'll call the suggested testsuite target-supports predicate
"lra" (thanks Andreas S.; of course!), to be used negated here.

brgds, H-P


Re: [PATCH] [PR rtl/optimization/98694] Fix incorrect optimization by cprop_hardreg.

2021-01-19 Thread Richard Sandiford via Gcc-patches
Hongtao Liu  writes:
> On Mon, Jan 18, 2021 at 7:10 PM Richard Sandiford
>  wrote:
>>
>> Hongtao Liu  writes:
>> > On Mon, Jan 18, 2021 at 6:18 PM Richard Sandiford
>> >  wrote:
>> >>
>> >> Hongtao Liu via Gcc-patches  writes:
>> >> > Hi:
>> >> >   If SRC had been assigned a mode narrower than the copy, we can't link
>> >> > DEST into the chain even they have same
>> >> > hard_regno_nregs(i.e. HImode/SImode in i386 backend).
>> >>
>> >> In general, changes between modes within the same hard register are OK.
>> >> Could you explain in more detail what's going wrong?
>> >>
>> >
>> > cprop hardreg change
>> >
>> > (insn 457 499 460 33 (set (reg:SI 39 r11 [orig:86 _11 ] [86])
>> > (reg:SI 37 r9 [orig:86 _11 ] [86])) "test.c":29:36 75 
>> > {*movsi_internal}
>> >  (expr_list:REG_DEAD (reg:SI 37 r9 [orig:86 _11 ] [86])
>> > (nil)))
>> >
>> > to
>> >
>> > (insn 457 499 460 33 (set (reg:SI 39 r11 [orig:86 _11 ] [86])
>> > (reg:SI 22 xmm2 [orig:86 _11 ] [86])) "test.c":29:36 75
>> > {*movsi_internal}
>> >  (expr_list:REG_DEAD (reg:SI 22 xmm2 [orig:86 _11 ] [86])
>> > (nil)))
>> >
>> > since (reg:SI 22 xmm2) and (reg:SI r9) are in the same value chain in
>> > which the oldest regno is k0.
>> >
>> > but with xmm2 defined as
>> >
>> > kmovw %k0, %edi  # 69 [c=4 l=4] *movhi_internal/6- kmovw move the
>> > lower 16bits to %edi, and clear the upper 16 bits.
>> > vmovd %edi, %xmm2 # 489 *movsi_internal  --- vmovd move 32bits from
>> > %edi to %xmm2.
>> >
>> > (insn 69 68 70 12 (set (reg:HI 5 di [orig:96 _52 ] [96])
>> > (reg:HI 68 k0 [orig:82 var_6.0_1 ] [82])) "test.c":21:23 76
>> > {*movhi_internal}
>> >  (nil))
>> >
>> > (insn 489 75 78 12 (set (reg:SI 22 xmm2 [297])
>> > (reg:SI 5 di [orig:96 _52 ] [96])) 75 {*movsi_internal}
>> >  (nil))
>>
>> The sequence is OK in itself, but insn 489 can't make any assumptions
>> about what's in the upper 16 bits of %edi.  In other words, as far as
>> RTL semantics are concerned, insn 489 only leaves bits 0-15 of %xmm2
>> with defined values; the other bits are undefined.
>>
>> If the target wants all 32 bits of %edi to be carried over to insn 489
>> then it needs to make insn 69 an SImode set instead of a HImode set.
>>
>
> actually only the lower 16bits are needed, the original insn is like
>
> .294.r.ira
> (insn 69 68 70 13 (set (reg:HI 96 [ _52 ])
> (subreg:HI (reg:DI 82 [ var_6.0_1 ]) 0)) "test.c":21:23 76
> {*movhi_internal}
>  (nil))
> (insn 78 75 82 13 (set (reg:V4HI 140 [ _283 ])
> (vec_duplicate:V4HI (truncate:HI (subreg:SI (reg:HI 96 [ _52
> ]) 0 1412 {*vec_dupv4hi}
>  (nil))
>
> .295r.reload
> (insn 69 68 70 13 (set (reg:HI 5 di [orig:96 _52 ] [96])
> (reg:HI 68 k0 [orig:82 var_6.0_1 ] [82])) "test.c":21:23 76
> {*movhi_internal}
>  (nil))
> (insn 489 75 78 13 (set (reg:SI 22 xmm2 [297])
> (reg:SI 5 di [orig:96 _52 ] [96])) 75 {*movsi_internal}
>  (nil))
> (insn 78 489 490 13 (set (reg:V4HI 20 xmm0 [orig:140 _283 ] [140])
> (vec_duplicate:V4HI (truncate:HI (reg:SI 22 xmm2 [297]
> 1412 {*vec_dupv4hi}
>  (nil))
>
> and insn 489 is created by lra/reload which seems ok for the sequence,
> but problematic when considering the logic of hardreg_cprop.

It looks OK even with the regcprop behaviour though:

- insn 69 defines only the low 16 bits of di,
- insn 489 defines only the low 16 bits of xmm2, but copies bits 16-31
  too (with unknown contents)
- insn 78 uses only the low 16 bits of xmm2 (the unknown contents
  introduced by insn 489 are truncated away)

So where do bits 16-31 become significant?  What goes wrong if they're
not zero?

Thanks,
Richard
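
The argument in the list above can be checked with a small bit-level model. This is a sketch only: the three functions are hypothetical stand-ins for insns 69, 489 and 78, and the junk parameter models the upper 16 bits that RTL leaves undefined after a HImode set.

```c
#include <stdint.h>

/* insn 69: HImode move into di.  Only bits 0-15 are defined; JUNK stands
   in for whatever the upper 16 bits happen to hold.  */
static uint32_t
himode_move (uint16_t k0, uint16_t junk)
{
  return ((uint32_t) junk << 16) | k0;
}

/* insn 489: SImode copy di -> xmm2.  Carries all 32 bits, junk included.  */
static uint32_t
simode_copy (uint32_t di)
{
  return di;
}

/* insn 78: vec_duplicate:V4HI of truncate:HI.  The truncation drops bits
   16-31 before the value is replicated into four 16-bit lanes.  */
static uint64_t
vec_dup_v4hi (uint32_t xmm2)
{
  uint64_t lane = (uint16_t) xmm2;
  return lane | (lane << 16) | (lane << 32) | (lane << 48);
}
```

In this model the final vector is the same for every value of junk, so the sequence by itself does not depend on bits 16-31; whatever goes wrong would have to come from some other use that reads them.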


Re: [PATCH 1/4] Remove build dependence on HSA run-time

2021-01-19 Thread Martin Liška

On 1/19/21 12:37 PM, Martin Jambor wrote:

> Right.  From what I can tell at the moment, which is not much, the idea
> was to be able to load it even from a non-standard path and specify that
> path at configure time.  If people think that is not useful and is
> actually harmful, I guess it can go.


And if I remember correctly, the dlopen approach was motivated by fact
that we didn't want to have HSA runtime as a build dependency, but rather
a run-time dependency. So it was done for packaging reasons.

Martin


Re: [committed] Skip asm goto test fails on hppa

2021-01-19 Thread Richard Sandiford via Gcc-patches
Hans-Peter Nilsson  writes:
> On Tue, 19 Jan 2021, Jakub Jelinek wrote:
>
>> On Mon, Jan 18, 2021 at 11:50:56PM -0500, Hans-Peter Nilsson wrote:
>> > On Mon, 18 Jan 2021, John David Anglin wrote:
>> > > The hppa target is a reload target and asm goto is not supported on reload targets.
>> > > Skip failing tests on hppa.
>> >
>> > IIUC the preferred term is "IRA target" or maybe "non-LRA
>> > target", as opposed to "LRA target".  The tests fail for
>> > cris-elf too, another IRA target, so I'd like to use that term
>> > when adjusting the dg-skip-if, hope you don't mind.
>> >
>> > But also, I'd like to xfail it instead for cris-elf, which adds
>> > a caveat: people might then think a "reload target" is not the
>> > same as an "IRA target", what with the different adjustments.
>>
>> I think "IRA target" is not the right term, all targets are IRA targets,
>> but only some are using LRA and others use the old reload.
>
> But, IRA isn't used when...

Not sure what was elided here, but the choices are IRA+reload, IRA+LRA,
or no RA at all.

FWIW, Dave's “reload target” sounded fine to me.  It seems a bit more
precise than “non-LRA target”, which if taken literally would include
no-RA targets.

> Whatever; LRA and non-LRA then.
> I'll call the suggested testsuite target-supports predicate
> "lra" (thanks Andreas S.; of course!), to be used negated here.

Sounds good.

Thanks,
Richard


Re: [PATCH] fwprop: Allow (subreg (mem)) simplifications

2021-01-19 Thread Ilya Leoshkevich via Gcc-patches
On Tue, 2021-01-19 at 09:41 +0100, Richard Biener wrote:
> On Mon, Jan 18, 2021 at 11:04 PM Ilya Leoshkevich via Gcc-patches
>  wrote:
> > 
> > Suppose we have:
> > 
> > (set (reg/v:TF 63) (mem/c:TF (reg/v:DI 62)))
> > (set (reg:FPRX2 66) (subreg:FPRX2 (reg/v:TF 63) 0))
> > 
> > It is clearly profitable to propagate the first insn into the
> > second
> > one and get:
> > 
> > (set (reg:FPRX2 66) (mem/c:FPRX2 (reg/v:DI 62)))
> > 
> > fwprop actually manages to perform this, but doesn't think the
> > result is
> > worth it, which results in unnecessary store/load sequences on
> > s390.
> > Improve the situation by classifying SUBREG -> MEM changes as
> > profitable.
> 
> IIRC fwprop also propagates into multiple uses and replacing a non-
> MEM
> with a MEM is only good when the original MEM goes away - is that
> properly
> dealt with here?

This is because of efficiency and not correctness reasons, right?  For
correctness I already check MEM_VOLATILE_P (new_rtx).  For efficiency I
think it would be reasonable to add def_insn->num_uses () == 1 check
(this passes my tests, I'm yet to do a full regtest though).  What do
you think about this?
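
A minimal sketch of the efficiency argument, written as standalone C++ rather than against GCC's real RTL-SSA interfaces.  The struct `MemDef` and the function `mem_propagation_profitable` are hypothetical names used only for illustration; they are not part of fwprop.

```cpp
#include <cstddef>

// Hypothetical model of an insn that loads a register from memory.
struct MemDef {
  bool mem_is_volatile;  // stands in for MEM_VOLATILE_P on the load
  std::size_t num_uses;  // number of insns that read the loaded register
};

// Replacing a register use with the MEM itself only pays off when the
// original load becomes dead afterwards, i.e. when it had a single use.
// With several uses the load would be duplicated rather than removed.
inline bool mem_propagation_profitable (const MemDef &def)
{
  if (def.mem_is_volatile)
    return false;            // correctness: never duplicate a volatile load
  return def.num_uses == 1;  // efficiency: the original MEM goes away
}
```

Under this model a load with two uses is rejected even though the substitution itself would be valid, which is the distinction between correctness and efficiency drawn above.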



[PATCH] middle-end/98638 - avoid SSA reference to stmts after SSA deconstruction

2021-01-19 Thread Richard Biener
Since SSA names do leak into global tree data structures like
TYPE_SIZE or in this case GFC_DECL_SAVED_DESCRIPTOR because of
frontend bugs we have to be careful to wipe references to the
CFG when we deconstruct SSA form because we now do ggc_free that.

Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed.

2021-01-19  Richard Biener  

PR middle-end/98638
* tree-ssanames.c (fini_ssanames): Zero SSA_NAME_DEF_STMT.
---
 gcc/tree-ssanames.c | 8 
 1 file changed, 8 insertions(+)

diff --git a/gcc/tree-ssanames.c b/gcc/tree-ssanames.c
index c293cc44189..51a26d2fce1 100644
--- a/gcc/tree-ssanames.c
+++ b/gcc/tree-ssanames.c
@@ -102,6 +102,14 @@ init_ssanames (struct function *fn, int size)
 void
 fini_ssanames (struct function *fn)
 {
+  unsigned i;
+  tree name;
+  /* Some SSA names leak into global tree data structures so we can't simply
+ ggc_free them.  But make sure to clear references to stmts since we now
+ ggc_free the CFG itself.  */
+  FOR_EACH_VEC_SAFE_ELT (SSANAMES (fn), i, name)
+if (name)
+  SSA_NAME_DEF_STMT (name) = NULL;
   vec_free (SSANAMES (fn));
   vec_free (FREE_SSANAMES (fn));
   vec_free (FREE_SSANAMES_QUEUE (fn));
-- 
2.26.2


[PATCH] ipa/98330 - avoid ICEing on call indirect call

2021-01-19 Thread Richard Biener
The following avoids ICEing on indirect calls with a fnspec
in modref analysis.

Bootstrap & regtest running on x86_64-unknown-linux-gnu - OK?

Thanks,
Richard.

2021-01-19  Richard Biener  

PR ipa/98330
* ipa-modref.c (analyze_stmt): Only record a summary for a
direct call.

* g++.dg/pr98330.C: New testcase.
* gcc.dg/pr98330.c: Likewise.
---
 gcc/ipa-modref.c   | 12 +++-
 gcc/testsuite/g++.dg/pr98330.C |  7 +++
 gcc/testsuite/gcc.dg/pr98330.c |  7 +++
 3 files changed, 21 insertions(+), 5 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/pr98330.C
 create mode 100644 gcc/testsuite/gcc.dg/pr98330.c

diff --git a/gcc/ipa-modref.c b/gcc/ipa-modref.c
index 74ad876cf58..8a5669c7f9b 100644
--- a/gcc/ipa-modref.c
+++ b/gcc/ipa-modref.c
@@ -1247,11 +1247,13 @@ analyze_stmt (modref_summary *summary, modref_summary_lto *summary_lto,
&& (!fnspec.global_memory_read_p ()
|| !fnspec.global_memory_written_p ()))
  {
-   fnspec_summaries->get_create
-(cgraph_node::get (current_function_decl)->get_edge (stmt))
-   ->fnspec = xstrdup (fnspec.get_str ());
-   if (dump_file)
- fprintf (dump_file, "  Recorded fnspec %s\n", fnspec.get_str ());
+   cgraph_edge *e = cgraph_node::get (current_function_decl)->get_edge (stmt);
+   if (e->callee)
+ {
+   fnspec_summaries->get_create (e)->fnspec = xstrdup (fnspec.get_str ());
+   if (dump_file)
+ fprintf (dump_file, "  Recorded fnspec %s\n", fnspec.get_str ());
+ }
  }
   }
  return true;
diff --git a/gcc/testsuite/g++.dg/pr98330.C b/gcc/testsuite/g++.dg/pr98330.C
new file mode 100644
index 000..08bf77b5c4b
--- /dev/null
+++ b/gcc/testsuite/g++.dg/pr98330.C
@@ -0,0 +1,7 @@
+// { dg-do compile }
+// { dg-options -O2 }
+
+float f (float x)
+{
+  return __builtin_pow[1] (x, 2); // { dg-warning "pointer to a function used in arithmetic" }
+}
diff --git a/gcc/testsuite/gcc.dg/pr98330.c b/gcc/testsuite/gcc.dg/pr98330.c
new file mode 100644
index 000..bc68a6fa214
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr98330.c
@@ -0,0 +1,7 @@
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+
+float f (__typeof (__builtin_pow) fn, float x)
+{
+  return fn (x, 2);
+}
-- 
2.26.2


Re: [PATCH] fwprop: Allow (subreg (mem)) simplifications

2021-01-19 Thread Richard Biener via Gcc-patches
On Tue, Jan 19, 2021 at 2:13 PM Ilya Leoshkevich  wrote:
>
> On Tue, 2021-01-19 at 09:41 +0100, Richard Biener wrote:
> > On Mon, Jan 18, 2021 at 11:04 PM Ilya Leoshkevich via Gcc-patches
> >  wrote:
> > >
> > > Suppose we have:
> > >
> > > (set (reg/v:TF 63) (mem/c:TF (reg/v:DI 62)))
> > > (set (reg:FPRX2 66) (subreg:FPRX2 (reg/v:TF 63) 0))
> > >
> > > It is clearly profitable to propagate the first insn into the
> > > second
> > > one and get:
> > >
> > > (set (reg:FPRX2 66) (mem/c:FPRX2 (reg/v:DI 62)))
> > >
> > > fwprop actually manages to perform this, but doesn't think the
> > > result is
> > > worth it, which results in unnecessary store/load sequences on
> > > s390.
> > > Improve the situation by classifying SUBREG -> MEM changes as
> > > profitable.
> >
> > IIRC fwprop also propagates into multiple uses and replacing a non-
> > MEM
> > with a MEM is only good when the original MEM goes away - is that
> > properly
> > dealt with here?
>
> This is because of efficiency and not correctness reasons, right?

Yes.

> For correctness I already check MEM_VOLATILE_P (new_rtx).
> For efficiency I think it would be reasonable to add
> def_insn->num_uses () == 1 check (this passes my tests, I'm yet to do
> a full regtest though).  What do you think about this?

I'm not too familiar with fwprop so will leave that to the actual reviewer.

Richard.


Re: [PATCH] [PR rtl/optimization/98694] Fix incorrect optimization by cprop_hardreg.

2021-01-19 Thread Jakub Jelinek via Gcc-patches
On Tue, Jan 19, 2021 at 12:38:47PM +, Richard Sandiford via Gcc-patches 
wrote:
> > actually only the lower 16bits are needed, the original insn is like
> >
> > .294.r.ira
> > (insn 69 68 70 13 (set (reg:HI 96 [ _52 ])
> > (subreg:HI (reg:DI 82 [ var_6.0_1 ]) 0)) "test.c":21:23 76
> > {*movhi_internal}
> >  (nil))
> > (insn 78 75 82 13 (set (reg:V4HI 140 [ _283 ])
> > (vec_duplicate:V4HI (truncate:HI (subreg:SI (reg:HI 96 [ _52
> > ]) 0 1412 {*vec_dupv4hi}
> >  (nil))
> >
> > .295r.reload
> > (insn 69 68 70 13 (set (reg:HI 5 di [orig:96 _52 ] [96])
> > (reg:HI 68 k0 [orig:82 var_6.0_1 ] [82])) "test.c":21:23 76
> > {*movhi_internal}
> >  (nil))
> > (insn 489 75 78 13 (set (reg:SI 22 xmm2 [297])
> > (reg:SI 5 di [orig:96 _52 ] [96])) 75 {*movsi_internal}
> >  (nil))
> > (insn 78 489 490 13 (set (reg:V4HI 20 xmm0 [orig:140 _283 ] [140])
> > (vec_duplicate:V4HI (truncate:HI (reg:SI 22 xmm2 [297]
> > 1412 {*vec_dupv4hi}
> >  (nil))
> >
> > and insn 489 is created by lra/reload which seems ok for the sequence,
> > but problematic when considering the logic of hardreg_cprop.
> 
> It looks OK even with the regcprop behaviour though:
> 
> - insn 69 defines only the low 16 bits of di,
> - insn 489 defines only the low 16 bits of xmm2, but copies bits 16-31
>   too (with unknown contents)
> - insn 78 uses only the low 16 bits of xmm2 (the unknown contents
>   introduced by insn 489 are truncated away)
> 
> So where do bits 16-31 become significant?  What goes wrong if they're
> not zero?

The k0 register is initialized I believe with
(insn 20 2 21 2 (set (reg:DI 68 k0 [orig:82 var_6.0_1 ] [82])
(mem/c:DI (symbol_ref:DI ("var_6") [flags 0x40]  ) [3 var_6+0 S8 A64])) "pr98694.C":21:10 74 
{*movdi_internal}
 (nil))
and so it contains all 64-bits, and then the code sometimes uses all the
bits, sometimes just the low 16-bits and sometimes low 32-bits of that
value.
(insn 69 68 70 12 (set (reg:HI 5 di [orig:96 _52 ] [96])
(reg:HI 68 k0 [orig:82 var_6.0_1 ] [82])) "pr98694.C":27:23 76 
{*movhi_internal}
 (nil))
(insn 74 73 75 12 (set (reg:SI 36 r8 [orig:149 _52 ] [149])
(zero_extend:SI (reg:HI 68 k0 [orig:82 var_6.0_1 ] [82]))) 144 
{*zero_extendhisi2}
 (nil))
(insn 489 75 78 12 (set (reg:SI 22 xmm2 [297])
(reg:SI 5 di [orig:96 _52 ] [96])) 75 {*movsi_internal}
 (nil))
(insn 78 489 490 12 (set (reg:V4HI 20 xmm0 [orig:140 _283 ] [140])
(vec_duplicate:V4HI (truncate:HI (reg:SI 22 xmm2 [297] 1412 
{*vec_dupv4hi}
 (expr_list:REG_DEAD (reg:SI 22 xmm2 [297])
(nil)))
are examples when it uses only the low 16 bits from that, and
(insn 487 72 73 12 (set (reg:SI 1 dx [148])
(reg:SI 68 k0 [orig:82 var_6.0_1 ] [82])) 75 {*movsi_internal}
 (nil))

(insn 85 84 491 13 (set (reg:SI 37 r9 [orig:86 _11 ] [86])
(reg:SI 68 k0 [orig:82 var_6.0_1 ] [82])) "pr98694.C":28:14 75 
{*movsi_internal}
 (nil))

(insn 491 85 88 13 (set (reg:SI 3 bx [299])
(reg:SI 68 k0 [orig:82 var_6.0_1 ] [82])) 75 {*movsi_internal}
 (nil))
(insn 88 491 89 13 (set (reg:CCNO 17 flags)
(compare:CCNO (reg:SI 3 bx [299])
(const_int 0 [0]))) 7 {*cmpsi_ccno_1}
 (expr_list:REG_DEAD (reg:SI 3 bx [299])
(nil)))

(insn 457 499 460 33 (set (reg:SI 39 r11 [orig:86 _11 ] [86])
(reg:SI 37 r9 [orig:86 _11 ] [86])) "pr98694.C":35:36 75 
{*movsi_internal}
 (expr_list:REG_DEAD (reg:SI 37 r9 [orig:86 _11 ] [86])
(nil)))
are examples where it uses low 32-bits from k0.
So the
 (insn 457 499 460 33 (set (reg:SI 39 r11 [orig:86 _11 ] [86])
-(reg:SI 37 r9 [orig:86 _11 ] [86])) "pr98694.C":35:36 75 
{*movsi_internal}
- (expr_list:REG_DEAD (reg:SI 37 r9 [orig:86 _11 ] [86])
+(reg:SI 22 xmm2 [orig:86 _11 ] [86])) "pr98694.C":35:36 75 
{*movsi_internal}
+ (expr_list:REG_DEAD (reg:SI 22 xmm2 [orig:86 _11 ] [86])
 (nil)))
cprop_hardreg change indeed looks bogus, while xmm2 has SImode, it holds
only the low 16-bits of the value and has the upper bits undefined, while r9
it is replacing had all of the low 32-bits well defined.
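
The reasoning above can be sketched with a small standalone C++ model.  This is only an illustration under stated assumptions: `RegValue`, `copy_in_mode` and `can_replace_use` are hypothetical names, not regcprop code, and the model zeroes the undefined bits whereas real hardware leaves them unspecified.

```cpp
#include <cstdint>

// Hypothetical model of a hard register: its contents plus the number
// of low bits that are known to hold the original value.
struct RegValue {
  std::uint64_t bits;
  unsigned defined_bits;
};

// Copy SRC in a mode that is MODE_BITS wide.  The whole mode is moved,
// but only the bits that were defined in the source stay defined.
inline RegValue copy_in_mode (const RegValue &src, unsigned mode_bits)
{
  std::uint64_t mask = mode_bits >= 64 ? ~0ULL : ((1ULL << mode_bits) - 1);
  unsigned defined = src.defined_bits < mode_bits ? src.defined_bits
                                                  : mode_bits;
  return { src.bits & mask, defined };
}

// A register may stand in for a MODE_BITS-wide read only if at least
// that many of its low bits are defined.
inline bool can_replace_use (const RegValue &candidate, unsigned mode_bits)
{
  return candidate.defined_bits >= mode_bits;
}
```

With k0 holding 64 defined bits, di = copy_in_mode (k0, 16) and xmm2 = copy_in_mode (di, 32) leave xmm2 with 16 defined bits, while r9 = copy_in_mode (k0, 32) has 32.  So can_replace_use (xmm2, 16) holds for the truncating use in insn 78, but can_replace_use (xmm2, 32) does not hold for the SImode use in insn 457.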

Jakub



c++: Remove unused fn

2021-01-19 Thread Nathan Sidwell

Martin pointed out this is found with some warning modes.

I had two overloads of a function, but only one was needed.  Let's keep
the constant one.

gcc/cp/
* module.cc (identifier): Merge overloads.

--
Nathan Sidwell
diff --git i/gcc/cp/module.cc w/gcc/cp/module.cc
index 1fd0bcfe3eb..e9a5eaeb4b4 100644
--- i/gcc/cp/module.cc
+++ w/gcc/cp/module.cc
@@ -276,13 +276,10 @@ static inline cpp_hashnode *cpp_node (tree id)
 {
   return CPP_HASHNODE (GCC_IDENT_TO_HT_IDENT (id));
 }
-static inline tree identifier (cpp_hashnode *node)
-{
-  return HT_IDENT_TO_GCC_IDENT (HT_NODE (node));
-}
-static inline const_tree identifier (const cpp_hashnode *node)
+
+static inline tree identifier (const cpp_hashnode *node)
 {
-  return identifier (const_cast <cpp_hashnode *> (node));
+  return HT_IDENT_TO_GCC_IDENT (HT_NODE (const_cast<cpp_hashnode *> (node)));
 }
 
 /* During duplicate detection we need to tell some comparators that


[committed] Fix dwarf-float.c test in testsuite

2021-01-19 Thread Jeff Law via Gcc-patches
The change to dwarf5 exposed multiple problems with
gcc.dg/debug/dwarf2/dwarf-float.c


AFAICT the test is supposed to check the dwarf2 records for float,
double and long double.  In particular it checks the sizes of those
types and ensures they are 4, 8 and 16 bytes long respectively.

Of course we have targets where that is not true.  Just to pick one,
visium-elf where a long double is just 8 bytes long.  But the test is
passing, why?

/* { dg-final { scan-assembler "0x10.*DW_AT_byte_size" } } */

The ".*" should be ringing alarm bells.  .* can match multiple lines. 
And I'm pretty sure that's precisely what's happening.  We're matching
0x10 from one line and DW_AT_byte_size from another later line.  So the
first thing this patch does is fix all the patterns to use [^\\r\\n]* to
ensure we don't match from multiple lines.  I went back and then
verified that the long double test fails with the old compiler on
visium-elf.  Good.
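
To illustrate the multi-line match, here is a small sketch using std::regex instead of the Tcl regexps that the testsuite actually uses.  One assumption to note: the ECMAScript "." in std::regex does not match line terminators, so the old pattern's ".*" is emulated with "(?:.|[\r\n])*".  The function names are hypothetical.

```cpp
#include <regex>
#include <string>

// Emulates the old directive: the wildcard is allowed to cross line
// boundaries, as "." does by default in Tcl regular expressions.
inline bool old_style_match (const std::string &asm_text)
{
  static const std::regex re ("0x10(?:.|[\\r\\n])*DW_AT_byte_size");
  return std::regex_search (asm_text, re);
}

// The fixed directive: [^\r\n]* stops at the end of the line, so the
// constant and the attribute name must appear on the same line.
inline bool new_style_match (const std::string &asm_text)
{
  static const std::regex re ("0x10[^\\r\\n]*DW_AT_byte_size");
  return std::regex_search (asm_text, re);
}
```

Given made-up assembler text such as "\t.long\t0x10\t# DW_AT_stmt_list\n\t.byte\t0x8\t# DW_AT_byte_size\n", old_style_match succeeds because 0x10 comes from the first line and DW_AT_byte_size from the second, whereas new_style_match does not.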


Next, the test makes assumptions about the sizes of float, double and
long double.  We have good target selectors for the size of double and
long double.  So we add the appropriate target selectors on those lines.

Finally with dwarf-5 by default the DW_AT_encoding test fails because of
changes in the default debug format, so we force the test to run with
dwarf-4.

With those three changes dwarf-float.c should be working again.  I've
tested it on a dozen or so targets that were previously failing in my
tester and x86-64 by hand which passes before/after this change.

Installing on the trunk,
Jeff




commit 8227106f5668c8fb1f0c5d2026e44cc0b84ee991
Author: Jeff Law 
Date:   Tue Jan 19 08:35:55 2021 -0700

[committed] Fix dwarf-float.c test in testsuite

gcc/testsuite
* gcc.dg/debug/dwarf2/dwarf-float.c: Force dwarf-4 generation
and update expected output.

diff --git a/gcc/testsuite/gcc.dg/debug/dwarf2/dwarf-float.c b/gcc/testsuite/gcc.dg/debug/dwarf2/dwarf-float.c
index f4883842b84..51f5977db93 100644
--- a/gcc/testsuite/gcc.dg/debug/dwarf2/dwarf-float.c
+++ b/gcc/testsuite/gcc.dg/debug/dwarf2/dwarf-float.c
@@ -1,11 +1,11 @@
 /* Verify the DWARF encoding of C99 floating point types.  */
 
 /* { dg-do compile } */
-/* { dg-options "-O0 -gdwarf -dA" } */
-/* { dg-final { scan-assembler "0x4.*DW_AT_encoding" } } */
-/* { dg-final { scan-assembler "0x4.*DW_AT_byte_size" } } */
-/* { dg-final { scan-assembler "0x8.*DW_AT_byte_size" } } */
-/* { dg-final { scan-assembler "0x10.*DW_AT_byte_size" } } */
+/* { dg-options "-O0 -gdwarf-4 -dA" } */
+/* { dg-final { scan-assembler "0x4\[^\\r\\n]*DW_AT_encoding" } } */
+/* { dg-final { scan-assembler "0x4\[^\\r\\n]*DW_AT_byte_size" } } */
+/* { dg-final { scan-assembler "0x8\[^\\r\\n]*DW_AT_byte_size" { target double64 } } } */
+/* { dg-final { scan-assembler "0x10\[^\\r\\n]*DW_AT_byte_size" { target longdouble128 }} } */
 
 void foo ()
 {


[GCC9 backport] AArch64: Fix symbol offset limit (PR 98618)

2021-01-19 Thread Wilco Dijkstra via Gcc-patches
In aarch64_classify_symbol symbols are allowed large offsets on relocations.
This means the offset can use all of the +/-4GB offset, leaving no offset
available for the symbol itself.  This results in relocation overflow and
link-time errors for simple expressions like &global_array + 0xff00.

To avoid this, unless the offset_within_block_p is true, limit the offset
to +/-1MB so that the symbol needs to be within a 3.9GB offset from its
references.  For the tiny code model use a 64KB offset, allowing most of
the 1MB range for code/data between the symbol and its references.
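
A rough sketch of the limits described above, as standalone C++ rather than the actual aarch64.c code.  `CodeModel`, `max_symbol_offset` and `offset_usable_directly` are hypothetical names, the `within_block` flag stands in for offset_within_block_p, and the separate weak-symbol check is left out.

```cpp
#include <cstdint>

enum class CodeModel { Tiny, Small };

// Offset caps from the description: +/-64KB for the tiny code model
// and +/-1MB for the small code model.
inline std::int64_t max_symbol_offset (CodeModel model)
{
  return model == CodeModel::Tiny ? 0x10000 : 0x100000;
}

// Returns true if symbol + offset may be referenced directly, false if
// the address would have to be forced to memory instead.
inline bool offset_usable_directly (CodeModel model, std::int64_t offset,
                                    bool within_block)
{
  if (within_block)
    return true;  // offsets that stay inside the object are always fine
  std::int64_t cap = max_symbol_offset (model);
  return offset >= -cap && offset <= cap;  // inclusive range check
}
```

For example, an offset of 0x100001 in the small model is rejected unless it stays within the symbol's block, which leaves roughly 3.9GB of the relocation range for the distance to the symbol itself.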

gcc/
PR target/98618
* config/aarch64/aarch64.c (aarch64_classify_symbol):
Apply reasonable limit to symbol offsets.

testsuite/
PR target/98618
* gcc.target/aarch64/symbol-range.c: Improve testcase.
* gcc.target/aarch64/symbol-range-tiny.c: Likewise.

From-SVN: r277068
(cherry picked from commit 7d3b27ff12610fde9d6c4b56abc70c6ee9b6b3db)

---

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 5a8acf8607a735d241c8849a449c000996d96931..6a42c06f04730d0bc55294f0eca59b9711952a2e 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,12 @@
+2021-01-12  Wilco Dijkstra  
+
+   Backport from mainline:
+   2019-10-16  Wilco Dijkstra  
+
+   PR target/98618
+   * config/aarch64/aarch64.c (aarch64_classify_symbol):
+   Apply reasonable limit to symbol offsets.
+
 2021-01-06  2019-07-10  Marc Glisse  
 
PR testsuite/90806
diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 9189777da7e120b9a0912d8501f2da8db144d334..bce50aea01e58f72ab59b30ce43969b48f5ca1b1 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -13297,26 +13297,31 @@ aarch64_classify_symbol (rtx x, HOST_WIDE_INT offset)
 the offset does not cause overflow of the final address.  But
 we have no way of knowing the address of symbol at compile time
 so we can't accurately say if the distance between the PC and
-symbol + offset is outside the addressible range of +/-1M in the
-TINY code model.  So we rely on images not being greater than
-1M and cap the offset at 1M and anything beyond 1M will have to
-be loaded using an alternative mechanism.  Furthermore if the
-symbol is a weak reference to something that isn't known to
-resolve to a symbol in this module, then force to memory.  */
- if ((SYMBOL_REF_WEAK (x)
-  && !aarch64_symbol_binds_local_p (x))
- || !IN_RANGE (offset, -1048575, 1048575))
+symbol + offset is outside the addressible range of +/-1MB in the
+TINY code model.  So we limit the maximum offset to +/-64KB and
+assume the offset to the symbol is not larger than +/-(1MB - 64KB).
+If offset_within_block_p is true we allow larger offsets.
+Furthermore force to memory if the symbol is a weak reference to
+something that doesn't resolve to a symbol in this module.  */
+
+ if (SYMBOL_REF_WEAK (x) && !aarch64_symbol_binds_local_p (x))
return SYMBOL_FORCE_TO_MEM;
+ if (!(IN_RANGE (offset, -0x10000, 0x10000)
+   || offset_within_block_p (x, offset)))
+   return SYMBOL_FORCE_TO_MEM;
+
  return SYMBOL_TINY_ABSOLUTE;
 
case AARCH64_CMODEL_SMALL:
  /* Same reasoning as the tiny code model, but the offset cap here is
-4G.  */
- if ((SYMBOL_REF_WEAK (x)
-  && !aarch64_symbol_binds_local_p (x))
- || !IN_RANGE (offset, HOST_WIDE_INT_C (-4294967263),
-   HOST_WIDE_INT_C (4294967264)))
+1MB, allowing +/-3.9GB for the offset to the symbol.  */
+
+ if (SYMBOL_REF_WEAK (x) && !aarch64_symbol_binds_local_p (x))
return SYMBOL_FORCE_TO_MEM;
+ if (!(IN_RANGE (offset, -0x100000, 0x100000)
+   || offset_within_block_p (x, offset)))
+   return SYMBOL_FORCE_TO_MEM;
+
  return SYMBOL_SMALL_ABSOLUTE;
 
case AARCH64_CMODEL_TINY_PIC:
diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog
index 835bf14107ffdef3b700dc00399a2d03e298f062..e25e5d85063e9e5bffa932dccfa19d9ec2550606 100644
--- a/gcc/testsuite/ChangeLog
+++ b/gcc/testsuite/ChangeLog
@@ -1,3 +1,12 @@
+2021-01-12  Wilco Dijkstra  
+
+   Backported from mainline:
+   2019-10-16  Wilco Dijkstra  
+
+   PR target/98618
+   * gcc.target/aarch64/symbol-range.c: Improve testcase.
+   * gcc.target/aarch64/symbol-range-tiny.c: Likewise.
+
 2021-01-07  Paul Thomas  
 
Backported from master:
diff --git a/gcc/testsuite/gcc.target/aarch64/symbol-range-tiny.c b/gcc/testsuite/gcc.target/aarch64/symbol-range-tiny.c
index d7e46b059e41f2672b3a1da5506fa8944e752e01..fc6a4f3ec780d9fa86de1c8e1a42a55992ee8b2d 100644
--- a/gcc/testsuite/gcc.target/aarch64/symbol-range-ti

Re: [committed] Skip asm goto test fails on hppa

2021-01-19 Thread Jeff Law via Gcc-patches



On 1/19/21 5:36 AM, Hans-Peter Nilsson wrote:
>
> On Tue, 19 Jan 2021, Jakub Jelinek wrote:
>
>> On Mon, Jan 18, 2021 at 11:50:56PM -0500, Hans-Peter Nilsson wrote:
>>> On Mon, 18 Jan 2021, John David Anglin wrote:
>>>> The hppa target is a reload target and asm goto is not supported on reload
>>>> targets.
>>>> Skip failing tests on hppa.
>>> IIUC the preferred term is "IRA target" or maybe "non-LRA
>>> target", as opposed to "LRA target".  The tests fail for
>>> cris-elf too, another IRA target, so I'd like to use that term
>>> when adjusting the dg-skip-if, hope you don't mind.
>>>
>>> But also, I'd like to xfail it instead for cris-elf, which adds
>>> a caveat: people might then think a "reload target" is not the
>>> same as an "IRA target", what with the different adjustments.
>> I think "IRA target" is not the right term, all targets are IRA targets,
>> but only some are using LRA and others use the old reload.
> But, IRA isn't used when...  Whatever; LRA and non-LRA then.
> I'll call the suggested testsuite target-supports predicate
> "lra" (thanks Andreas S.; of course!), to be used negated here.
Well we could do that.  But instead I propose we drop reload for gcc-12
now that cc0 is dead (yea, yea, we need to get the avr conversion
reviewed first...)

Jeff



Re: [PATCH] alias: Fix offset checks involving section anchors [PR92294]

2021-01-19 Thread Richard Biener
On Mon, 18 Jan 2021, Richard Sandiford wrote:

> Jan Hubicka  writes:
> >> >> 
> >> >> Well, in tree-ssa code we do assume these to be either disjoint objects
> >> >> or equal (in decl_refs_may_alias_p that continues in case
> >> >> compare_base_decls is -1).  I am not sure if we win much by treating
> >> >> them differently on RTL level. I would prefer staying consistent here.
> >> 
> >> Yeah, I see your point.  My concern here was that the fallback case
> >> applies to SYMBOL_REFs without decls, which might not have been visible
> >> at the tree-ssa level.  E.g. they might be ABI-defined symbols that have
> >> no known relation to source-level constructs.
> >> 
> >> E.g. the small-data base symbol _gp on MIPS points at a fixed offset
> >> from the start of the small-data area (0x7ff0 IIRC).  If the target
> >> generated rtl code that used _gp directly, we could wrongly assume
> >> that _gp+X can't alias BASE+Y when X != Y, even though the real test
> >> for small-data BASEs would be whether X + 0x7ff0 != Y.
> >> 
> >> I don't think that could occur in tree-ssa.  No valid C code would
> >> be able to refer directly to _gp in this way.
> >> 
> >> On the other hand, I don't have a specific example of where this does
> >> go wrong, it's just a feeling that it might.  I can drop it if you
> >> think that's better.
> >
> > I would lean towards not disabling optimization when we have no good
> > reason for that - we already did it bit too many times in aliasing code
> > and it is hard to figure out what optimizations are missed purposefully
> > and what are missed just as omission.
> >
> > We already committed to a very conservative assumption that every
> > external symbol can be alias of another. I think we should have
> > originally required units that refer to the same memory location via
> > different symbols to declare it explicitly (i.e. make external alias to
> > external symbol), but we do not even allow external aliases (symtab
> > supports that though) and also it may depend on use of the module what
> > symbols are aliased.
> >
> > We also decided to disable TBAA for direct accesses to decls to allow
> > type punning using unions.
> >
> > This keeps the offset+range check to be only means of disambiguation.
> > While for modern programs global arrays are not common, for Fortran
> > stuff they are, so I would prefer to not cripple them even more.
> > (I am not sure how often the arrays are external though)
> 
> OK, the version below drops the new -2 return value and tries to
> clarify the comments in compare_base_symbol_refs.
> 
> Lightly tested on aarch64-linux-gnu so far.  Does it look OK if
> full tests pass?

OK from my side.

Richard.

> Thanks,
> Richard
> 
> 
> 
> memrefs_conflict_p assumes that:
> 
>   [XB + XO, XB + XO + XS)
> 
> does not alias
> 
>   [YB + YO, YB + YO + YS)
> 
> whenever:
> 
>   [XO, XO + XS)
> 
> does not intersect
> 
>   [YO, YO + YS)
> 
> In other words, the accesses can alias only if XB == YB at runtime.
> 
> However, this doesn't cope correctly with section anchors.
> For example, if XB is an anchor symbol and YB is at offset
> XO from the anchor, then:
> 
>   [XB + XO, XB + XO + XS)
> 
> overlaps
> 
>   [YB, YB + YS)
> 
> whatever the value of XO is.  In other words, when doing the
> alias check for two symbols whose local definitions are in
> the same block, we should apply the known difference between
> their block offsets to the intersection test above.
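
The adjusted intersection test described above can be sketched as follows.  This is a standalone illustration, not the alias.c implementation: `ranges_overlap` and `may_conflict_same_block` are hypothetical names, and the sign convention for `distance` (taken here as YB - XB) is an assumption.

```cpp
#include <cstdint>

// Half-open ranges [xo, xo + xs) and [yo, yo + ys), with both offsets
// measured from the same base address.
inline bool ranges_overlap (std::int64_t xo, std::int64_t xs,
                            std::int64_t yo, std::int64_t ys)
{
  return xo < yo + ys && yo < xo + xs;
}

// DISTANCE is the known difference YB - XB between two symbols that
// are defined in the same block.  Rebasing Y's offset onto XB makes
// the plain offset comparison meaningful again.
inline bool may_conflict_same_block (std::int64_t xo, std::int64_t xs,
                                     std::int64_t yo, std::int64_t ys,
                                     std::int64_t distance)
{
  return ranges_overlap (xo, xs, yo + distance, ys);
}
```

Taking the anchor example with XO = 16, XS = YS = 8, YO = 0 and YB located 16 bytes after the anchor XB: the unadjusted ranges_overlap (16, 8, 0, 8) reports no overlap, while may_conflict_same_block (16, 8, 0, 8, 16) correctly reports that the two accesses touch the same bytes.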
> 
> gcc/
>   PR rtl-optimization/92294
>   * alias.c (compare_base_symbol_refs): Take an extra parameter
>   and add the distance between two symbols to it.  Enshrine in
>   comments that -1 means "either 0 or 1, but we can't tell
>   which at compile time".
>   (memrefs_conflict_p): Update call accordingly.
>   (rtx_equal_for_memref_p): Likewise.  Take the distance between symbols
>   into account.
> ---
>  gcc/alias.c | 47 +++
>  1 file changed, 31 insertions(+), 16 deletions(-)
> 
> diff --git a/gcc/alias.c b/gcc/alias.c
> index 8d3575e4e27..69e1eb89ac6 100644
> --- a/gcc/alias.c
> +++ b/gcc/alias.c
> @@ -159,7 +159,8 @@ static tree decl_for_component_ref (tree);
>  static int write_dependence_p (const_rtx,
>  const_rtx, machine_mode, rtx,
>  bool, bool, bool);
> -static int compare_base_symbol_refs (const_rtx, const_rtx);
> +static int compare_base_symbol_refs (const_rtx, const_rtx,
> +  HOST_WIDE_INT * = NULL);
>  
>  static void memory_modified_1 (rtx, const_rtx, void *);
>  
> @@ -1837,7 +1838,11 @@ rtx_equal_for_memref_p (const_rtx x, const_rtx y)
>return label_ref_label (x) == label_ref_label (y);
>  
>  case SYMBOL_REF:
> -  return compare_base_symbol_refs (x, y) == 1;
> +  {
> + HOST_WIDE_INT distance = 0;
> + return (compare_base_symbol_refs (x, y, &distance) == 1
> + && distance == 0);
> +  }
>

Re: [PATCH] [PR rtl/optimization/98694] Fix incorrect optimization by cprop_hardreg.

2021-01-19 Thread Richard Sandiford via Gcc-patches
Jakub Jelinek via Gcc-patches  writes:
> On Tue, Jan 19, 2021 at 12:38:47PM +, Richard Sandiford via Gcc-patches 
> wrote:
>> > actually only the lower 16bits are needed, the original insn is like
>> >
>> > .294.r.ira
>> > (insn 69 68 70 13 (set (reg:HI 96 [ _52 ])
>> > (subreg:HI (reg:DI 82 [ var_6.0_1 ]) 0)) "test.c":21:23 76
>> > {*movhi_internal}
>> >  (nil))
>> > (insn 78 75 82 13 (set (reg:V4HI 140 [ _283 ])
>> > (vec_duplicate:V4HI (truncate:HI (subreg:SI (reg:HI 96 [ _52
>> > ]) 0 1412 {*vec_dupv4hi}
>> >  (nil))
>> >
>> > .295r.reload
>> > (insn 69 68 70 13 (set (reg:HI 5 di [orig:96 _52 ] [96])
>> > (reg:HI 68 k0 [orig:82 var_6.0_1 ] [82])) "test.c":21:23 76
>> > {*movhi_internal}
>> >  (nil))
>> > (insn 489 75 78 13 (set (reg:SI 22 xmm2 [297])
>> > (reg:SI 5 di [orig:96 _52 ] [96])) 75 {*movsi_internal}
>> >  (nil))
>> > (insn 78 489 490 13 (set (reg:V4HI 20 xmm0 [orig:140 _283 ] [140])
>> > (vec_duplicate:V4HI (truncate:HI (reg:SI 22 xmm2 [297]
>> > 1412 {*vec_dupv4hi}
>> >  (nil))
>> >
>> > and insn 489 is created by lra/reload which seems ok for the sequence,
>> > but problematic when considering the logic of hardreg_cprop.
>> 
>> It looks OK even with the regcprop behaviour though:
>> 
>> - insn 69 defines only the low 16 bits of di,
>> - insn 489 defines only the low 16 bits of xmm2, but copies bits 16-31
>>   too (with unknown contents)
>> - insn 78 uses only the low 16 bits of xmm2 (the unknown contents
>>   introduced by insn 489 are truncated away)
>> 
>> So where do bits 16-31 become significant?  What goes wrong if they're
>> not zero?
>
> The k0 register is initialized I believe with
> (insn 20 2 21 2 (set (reg:DI 68 k0 [orig:82 var_6.0_1 ] [82])
> (mem/c:DI (symbol_ref:DI ("var_6") [flags 0x40] <var_decl 0x7f7babeaaf30 var_6>) [3 var_6+0 S8 A64])) "pr98694.C":21:10 74 {*movdi_internal}
>  (nil))
> and so it contains all 64-bits, and then the code sometimes uses all the
> bits, sometimes just the low 16-bits and sometimes low 32-bits of that
> value.
> (insn 69 68 70 12 (set (reg:HI 5 di [orig:96 _52 ] [96])
> (reg:HI 68 k0 [orig:82 var_6.0_1 ] [82])) "pr98694.C":27:23 76 
> {*movhi_internal}
>  (nil))
> (insn 74 73 75 12 (set (reg:SI 36 r8 [orig:149 _52 ] [149])
> (zero_extend:SI (reg:HI 68 k0 [orig:82 var_6.0_1 ] [82]))) 144 
> {*zero_extendhisi2}
>  (nil))
> (insn 489 75 78 12 (set (reg:SI 22 xmm2 [297])
> (reg:SI 5 di [orig:96 _52 ] [96])) 75 {*movsi_internal}
>  (nil))
> (insn 78 489 490 12 (set (reg:V4HI 20 xmm0 [orig:140 _283 ] [140])
> (vec_duplicate:V4HI (truncate:HI (reg:SI 22 xmm2 [297] 1412 
> {*vec_dupv4hi}
>  (expr_list:REG_DEAD (reg:SI 22 xmm2 [297])
> (nil)))
> are examples when it uses only the low 16 bits from that, and
> (insn 487 72 73 12 (set (reg:SI 1 dx [148])
> (reg:SI 68 k0 [orig:82 var_6.0_1 ] [82])) 75 {*movsi_internal}
>  (nil))
>
> (insn 85 84 491 13 (set (reg:SI 37 r9 [orig:86 _11 ] [86])
> (reg:SI 68 k0 [orig:82 var_6.0_1 ] [82])) "pr98694.C":28:14 75 
> {*movsi_internal}
>  (nil))
>
> (insn 491 85 88 13 (set (reg:SI 3 bx [299])
> (reg:SI 68 k0 [orig:82 var_6.0_1 ] [82])) 75 {*movsi_internal}
>  (nil))
> (insn 88 491 89 13 (set (reg:CCNO 17 flags)
> (compare:CCNO (reg:SI 3 bx [299])
> (const_int 0 [0]))) 7 {*cmpsi_ccno_1}
>  (expr_list:REG_DEAD (reg:SI 3 bx [299])
> (nil)))
>
> (insn 457 499 460 33 (set (reg:SI 39 r11 [orig:86 _11 ] [86])
> (reg:SI 37 r9 [orig:86 _11 ] [86])) "pr98694.C":35:36 75 
> {*movsi_internal}
>  (expr_list:REG_DEAD (reg:SI 37 r9 [orig:86 _11 ] [86])
> (nil)))
> are examples where it uses low 32-bits from k0.
> So the
>  (insn 457 499 460 33 (set (reg:SI 39 r11 [orig:86 _11 ] [86])
> -(reg:SI 37 r9 [orig:86 _11 ] [86])) "pr98694.C":35:36 75 
> {*movsi_internal}
> - (expr_list:REG_DEAD (reg:SI 37 r9 [orig:86 _11 ] [86])
> +(reg:SI 22 xmm2 [orig:86 _11 ] [86])) "pr98694.C":35:36 75 
> {*movsi_internal}
> + (expr_list:REG_DEAD (reg:SI 22 xmm2 [orig:86 _11 ] [86])
>  (nil)))
> cprop_hardreg change indeed looks bogus, while xmm2 has SImode, it holds
> only the low 16-bits of the value and has the upper bits undefined, while r9
> it is replacing had all of the low 32-bits well defined.

Ah, ok, thanks for the extra context.

So AIUI the problem when recording xmm2<-di isn't just:

 [A] partial_subreg_p (vd->e[sr].mode, GET_MODE (src))

but also that:

 [B] partial_subreg_p (vd->e[sr].mode, vd->e[vd->e[sr].oldest_regno].mode)

For example, all registers in this sequence can be part of the same chain:

(set (reg:HI R1) (reg:HI R0))
(set (reg:SI R2) (reg:SI R1)) // [A]
(set (reg:DI R3) (reg:DI R2)) // [A]
(set (reg:SI R4) (reg:SI R[0-3]))
(set (reg:HI R5) (reg:HI R[0-4]))

But:

(set (reg:SI R1) (reg:SI R0))
(set (reg:HI R2) (reg:HI R1))
(set (reg:SI R

Re: [PATCH] alias: Fix offset checks involving section anchors [PR92294]

2021-01-19 Thread Jan Hubicka
> On Mon, 18 Jan 2021, Richard Sandiford wrote:
> 
> > Jan Hubicka  writes:
> > >> >> 
> > >> >> Well, in tree-ssa code we do assume these to be either disjoint objects
> > >> >> or equal (in decl_refs_may_alias_p that continues in case
> > >> >> compare_base_decls is -1).  I am not sure if we win much by treating
> > >> >> them differently on RTL level. I would prefer staying consistent here.
> > >> 
> > >> Yeah, I see your point.  My concern here was that the fallback case
> > >> applies to SYMBOL_REFs without decls, which might not have been visible
> > >> at the tree-ssa level.  E.g. they might be ABI-defined symbols that have
> > >> no known relation to source-level constructs.
> > >> 
> > >> E.g. the small-data base symbol _gp on MIPS points at a fixed offset
> > >> from the start of the small-data area (0x7ff0 IIRC).  If the target
> > >> generated rtl code that used _gp directly, we could wrongly assume
> > >> that _gp+X can't alias BASE+Y when X != Y, even though the real test
> > >> for small-data BASEs would be whether X + 0x7ff0 != Y.
> > >> 
> > >> I don't think that could occur in tree-ssa.  No valid C code would
> > >> be able to refer directly to _gp in this way.
> > >> 
> > >> On the other hand, I don't have a specific example of where this does
> > >> go wrong, it's just a feeling that it might.  I can drop it if you
> > >> think that's better.
> > >
> > > I would lean towards not disabling optimization when we have no good
> > > reason for that - we already did it a bit too many times in aliasing code
> > > and it is hard to figure out what optimizations are missed purposefully
> > > and what are missed just as omission.
> > >
> > > We already committed to a very conservative assumption that every
> > > external symbol can be an alias of another. I think we should have
> > > originally required units that refer to the same memory location via
> > > different symbols to declare it explicitly (i.e. make external alias to
> > > external symbol), but we do not even allow external aliases (symtab
> > > supports that though) and also it may depend on use of the module what
> > > symbols are aliased.
> > >
> > > We also decided to disable TBAA for direct accesses to decls to allow
> > > type punning using unions.
> > >
> > > This leaves the offset+range check as the only means of disambiguation.
> > > While for modern programs global arrays are not common, for Fortran
> > > stuff they are, so I would prefer to not cripple them even more.
> > > (I am not sure how often the arrays are external though)
> > 
> > OK, the version below drops the new -2 return value and tries to
> > clarify the comments in compare_base_symbol_refs.
> > 
> > Lightly tested on aarch64-linux-gnu so far.  Does it look OK if
> > full tests pass?
> 
> OK from my side.

OK too, thanks!
Honza
> 
> Richard.
> 
> > Thanks,
> > Richard
> > 
> > 
> > 
> > memrefs_conflict_p assumes that:
> > 
> >   [XB + XO, XB + XO + XS)
> > 
> > does not alias
> > 
> >   [YB + YO, YB + YO + YS)
> > 
> > whenever:
> > 
> >   [XO, XO + XS)
> > 
> > does not intersect
> > 
> >   [YO, YO + YS)
> > 
> > In other words, the accesses can alias only if XB == YB at runtime.
> > 
> > However, this doesn't cope correctly with section anchors.
> > For example, if XB is an anchor symbol and YB is at offset
> > XO from the anchor, then:
> > 
> >   [XB + XO, XB + XO + XS)
> > 
> > overlaps
> > 
> >   [YB, YB + YS)
> > 
> > whatever the value of XO is.  In other words, when doing the
> > alias check for two symbols whose local definitions are in
> > the same block, we should apply the known difference between
> > their block offsets to the intersection test above.
> > 
> > gcc/
> > PR rtl-optimization/92294
> > * alias.c (compare_base_symbol_refs): Take an extra parameter
> > and add the distance between two symbols to it.  Enshrine in
> > comments that -1 means "either 0 or 1, but we can't tell
> > which at compile time".
> > (memrefs_conflict_p): Update call accordingly.
> > (rtx_equal_for_memref_p): Likewise.  Take the distance between symbols
> > into account.
> > ---
> >  gcc/alias.c | 47 +++
> >  1 file changed, 31 insertions(+), 16 deletions(-)
> > 
> > diff --git a/gcc/alias.c b/gcc/alias.c
> > index 8d3575e4e27..69e1eb89ac6 100644
> > --- a/gcc/alias.c
> > +++ b/gcc/alias.c
> > @@ -159,7 +159,8 @@ static tree decl_for_component_ref (tree);
> >  static int write_dependence_p (const_rtx,
> >const_rtx, machine_mode, rtx,
> >bool, bool, bool);
> > -static int compare_base_symbol_refs (const_rtx, const_rtx);
> > +static int compare_base_symbol_refs (const_rtx, const_rtx,
> > +HOST_WIDE_INT * = NULL);
> >  
> >  static void memory_modified_1 (rtx, const_rtx, void *);
> >  
> > @@ -1837,7 +1838,11 @@ rtx_equal_for_memref_p (const_rtx x, const_rtx y)
> >return label_ref_l

[PATCH] c++: Fix tsubsting CLASS_PLACEHOLDER_TEMPLATE [PR95434]

2021-01-19 Thread Patrick Palka via Gcc-patches
Here, during partial instantiation of the generic lambda, tsubst_copy on
the CLASS_PLACEHOLDER_TEMPLATE of the CTAD placeholder U{0} yields a
(level-lowered) TEMPLATE_TEMPLATE_PARM rather than the corresponding
TEMPLATE_DECL.  This later confuses do_class_deduction which expects
that the CLASS_PLACEHOLDER_TEMPLATE is always a TEMPLATE_DECL.

Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK for
trunk?

gcc/cp/ChangeLog:

PR c++/95434
* pt.c (tsubst) : If tsubsting
CLASS_PLACEHOLDER_TEMPLATE yields a TEMPLATE_TEMPLATE_PARM,
adjust to its TEMPLATE_TEMPLATE_PARM_TEMPLATE_DECL.

gcc/testsuite/ChangeLog:

PR c++/95434
* g++.dg/cpp2a/lambda-generic9.C: New test.
---
 gcc/cp/pt.c  | 2 ++
 gcc/testsuite/g++.dg/cpp2a/lambda-generic9.C | 9 +
 2 files changed, 11 insertions(+)
 create mode 100644 gcc/testsuite/g++.dg/cpp2a/lambda-generic9.C

diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c
index d5d3d2fd040..2fed81520e3 100644
--- a/gcc/cp/pt.c
+++ b/gcc/cp/pt.c
@@ -15688,6 +15688,8 @@ tsubst (tree t, tree args, tsubst_flags_t complain, tree in_decl)
else if (tree pl = CLASS_PLACEHOLDER_TEMPLATE (t))
  {
pl = tsubst_copy (pl, args, complain, in_decl);
+   if (TREE_CODE (pl) == TEMPLATE_TEMPLATE_PARM)
+ pl = TEMPLATE_TEMPLATE_PARM_TEMPLATE_DECL (pl);
CLASS_PLACEHOLDER_TEMPLATE (r) = pl;
  }
  }
diff --git a/gcc/testsuite/g++.dg/cpp2a/lambda-generic9.C b/gcc/testsuite/g++.dg/cpp2a/lambda-generic9.C
new file mode 100644
index 000..20ceb370c38
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp2a/lambda-generic9.C
@@ -0,0 +1,9 @@
+// PR c++/95434
+// { dg-do compile { target c++20 } }
+
+template 
+void f() {
+  []  class U> { U{0}; };
+}
+
+template void f();
-- 
2.30.0.155.g66e871b664



Re: [committed] Skip asm goto test fails on hppa

2021-01-19 Thread Hans-Peter Nilsson
On Tue, 19 Jan 2021, Jeff Law wrote:
> On 1/19/21 5:36 AM, Hans-Peter Nilsson wrote:
> > I'll call the suggested testsuite target-supports predicate
> > "lra" (thanks Andreas S.; of course!), to be used negated here.
> Well we could do that.  But instead I propose we drop reload for gcc-12
> now that cc0 is dead (yea, yea, we need to get the avr conversion
> reviewed first...)

How about a stage fixing the documentation for LRA first, to
enable a proper transition, rather than jumping directly to the
extinction threat?  There are still omissions to the point that
you have to check the code and LRA targets to get the correct
translation.  Like, what machine-description constructs are
dead, and what's their replacement.  Is reload_in_progress dead?

brgds, H-P


[ARM] PR98636 - ICE on passing incompatible options for fp16

2021-01-19 Thread Prathamesh Kulkarni via Gcc-patches
Hi,
The attached patch fixes the issue mentioned in PR, by adding
arm_fp16_format to checked_options in optc-save-gen.awk.
Is this OK to commit in stage-4 if testing passes or should we hold it
till next stage-1 ?

Thanks,
Prathamesh


pr98636-2.diff
Description: Binary data


Re: [committed] Skip asm goto test fails on hppa

2021-01-19 Thread Hans-Peter Nilsson
On Tue, 19 Jan 2021, Richard Sandiford wrote:

> Hans-Peter Nilsson  writes:
> > On Tue, 19 Jan 2021, Jakub Jelinek wrote:
> >> I think "IRA target" is not the right term, all targets are IRA targets,
> >> but only some are using LRA and others use the old reload.
> >
> > But, IRA isn't used when...
>
> Not sure what was elided here,

...'when the that has made a transition and no longer has
transitional artefacts like "-mlra"' or something.  But, you say
it's not IRA that's replaced, but reload?  ...ah, that's it.

brgds, H-P


[PATCH] arm: [testsuite] fix ivopts.c target test [PR96372]

2021-01-19 Thread Andrea Corallo via Gcc-patches
Hi all,

This patch is for PR96372.  The failure was introduced by [1], where the
failing check went from using target 'arm_thumb2' to
'arm_thumb2_ok_no_arm_v8_1_lob'.  Unfortunately that target relies on
'arm_thumb2_ok', which has different semantics from the original
'arm_thumb2'.

This patch therefore introduces 'arm_thumb2_no_arm_v8_1_lob', which
relies on 'arm_thumb2', to restore the intended behavior.

Okay for trunk?

Thanks

  Andrea

[1] 


From 6199695364808c59202dfcb1b29df1e84dab5aa9 Mon Sep 17 00:00:00 2001
From: Andrea Corallo 
Date: Fri, 15 Jan 2021 15:34:19 +0100
Subject: [PATCH] arm: [testsuite] fix ivopts.c target test [PR96372]

gcc/
2021-01-15  Andrea Corallo  
PR target/96372
* doc/sourcebuild.texi (arm_thumb2_no_arm_v8_1_lob): Document.

gcc/testsuite/
2021-01-15  Andrea Corallo  
PR target/96372
* lib/target-supports.exp
(check_effective_target_arm_thumb2_no_arm_v8_1_lob): Define proc.
* gcc.target/arm/ivopts.c: Use target
'arm_thumb2_no_arm_v8_1_lob'.
---
 gcc/doc/sourcebuild.texi  |  5 +
 gcc/testsuite/gcc.target/arm/ivopts.c |  2 +-
 gcc/testsuite/lib/target-supports.exp | 15 ++-
 3 files changed, 20 insertions(+), 2 deletions(-)

diff --git a/gcc/doc/sourcebuild.texi b/gcc/doc/sourcebuild.texi
index 3d0873dd074..d5cfe54304d 100644
--- a/gcc/doc/sourcebuild.texi
+++ b/gcc/doc/sourcebuild.texi
@@ -2058,6 +2058,11 @@ ARM Target supports executing the Armv8.1-M Mainline Low Overhead Loop
 instructions @code{DLS} and @code{LE}.
 Some multilibs may be incompatible with these options.
 
+@item arm_thumb2_no_arm_v8_1_lob
+ARM target where Thumb-2 is used without options but does not support
+executing the Armv8.1-M Mainline Low Overhead Loop instructions
+@code{DLS} and @code{LE}.
+
 @item arm_thumb2_ok_no_arm_v8_1_lob
 ARM target generates Thumb-2 code for @code{-mthumb} but does not
 support executing the Armv8.1-M Mainline Low Overhead Loop
diff --git a/gcc/testsuite/gcc.target/arm/ivopts.c b/gcc/testsuite/gcc.target/arm/ivopts.c
index 2733e66988e..d7d72a59d9c 100644
--- a/gcc/testsuite/gcc.target/arm/ivopts.c
+++ b/gcc/testsuite/gcc.target/arm/ivopts.c
@@ -11,6 +11,6 @@ tr5 (short array[], int n)
 }
 
 /* { dg-final { scan-tree-dump-times "PHI <" 1 "ivopts"} } */
-/* { dg-final { object-size text <= 20 { target { arm_thumb2_ok_no_arm_v8_1_lob } } } } */
+/* { dg-final { object-size text <= 20 { target { arm_thumb2_no_arm_v8_1_lob } } } } */
 /* { dg-final { object-size text <= 32 { target { arm_nothumb && { ! arm_iwmmxt_ok } } } } } */
 /* { dg-final { object-size text <= 36 { target { arm_nothumb && arm_iwmmxt_ok }  } } } */
diff --git a/gcc/testsuite/lib/target-supports.exp b/gcc/testsuite/lib/target-supports.exp
index 47d4c45e9eb..0d351c8fbad 100644
--- a/gcc/testsuite/lib/target-supports.exp
+++ b/gcc/testsuite/lib/target-supports.exp
@@ -10658,7 +10658,20 @@ proc check_effective_target_arm_v8_1_lob_ok { } {
 }
 }
 
-# Return 1 is this is an ARM target where -mthumb causes Thumb-2 to be
+# Return 1 if this is an ARM target where Thumb-2 is used without
+# options added by the test and the target does not support executing
+# the Armv8.1-M Mainline Low Overhead Loop, 0 otherwise.  The test is
+# valid for ARM.
+
+proc check_effective_target_arm_thumb2_no_arm_v8_1_lob { } {
+if { [check_effective_target_arm_thumb2]
+&& ![check_effective_target_arm_v8_1_lob_ok] } {
+   return 1
+}
+return 0
+}
+
+# Return 1 if this is an ARM target where -mthumb causes Thumb-2 to be
 # used and the target does not support executing the Armv8.1-M
 # Mainline Low Overhead Loop, 0 otherwise.  The test is valid for ARM.
 
-- 
2.20.1



Re: [PATCH] PowerPC: Map IEEE 128-bit long double built-ins.

2021-01-19 Thread Michael Meissner via Gcc-patches
On Fri, Jan 15, 2021 at 03:43:13PM -0600, Segher Boessenkool wrote:
> Hi!
> 
> On Thu, Jan 14, 2021 at 11:59:19AM -0500, Michael Meissner wrote:
> > >From 78435dee177447080434cdc08fc76b1029c7f576 Mon Sep 17 00:00:00 2001
> > From: Michael Meissner 
> > Date: Wed, 13 Jan 2021 21:47:03 -0500
> > Subject: [PATCH] PowerPC: Map IEEE 128-bit long double built-ins.
> > 
> > This patch replaces patches previously submitted:
> 
> What did you change after I approved it?

You grumbled about the way I converted the names from the current name to the
IEEE 128-bit name as being unclear.

1) I moved the table of known mappings from within a function to a separate
function, and I populated the switch statement with all of the current names.

2) I moved the code that looks at a built-in function's arguments and returns
whether it uses long double to a separate function rather than being buried
within a larger function.

3) I changed the code for the case where we didn't provide a name (i.e. new
built-ins) to hopefully be clearer on the conversion.

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meiss...@linux.ibm.com, phone: +1 (978) 899-4797


Re: [PATCH] PowerPC: Add float128/Decimal conversions.

2021-01-19 Thread Michael Meissner via Gcc-patches
On Fri, Jan 15, 2021 at 03:52:44PM -0600, Segher Boessenkool wrote:
> On Thu, Jan 14, 2021 at 12:09:36PM -0500, Michael Meissner wrote:
> > [PATCH] PowerPC: Add float128/Decimal conversions.
> 
> Same question here.

In your last message, you said that it was unacceptable that the conversion
fails if the user uses an old GLIBC.  So I rewrote the code using weak
references.  If the user has at least GLIBC 2.32, it will use the IEEE 128-bit
string support in the library.

If an older GLIBC is used, I then use the IBM 128-bit format as an intermediate
value.  Obviously there are cases where IEEE 128-bit can hold values that IBM
128-bit can't (mostly due to the increased exponent range in IEEE 128-bit), but
it at least does the conversion for the numbers in the common range.

In doing this transformation, I needed to do minor edits to the main decimal
to/from binary conversion functions to allow the KF functions to be declared.
Previously, I used preprocessor magic to rename the functions.

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meiss...@linux.ibm.com, phone: +1 (978) 899-4797
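
A minimal sketch of the weak-reference pattern described in the message
above, assuming a GNU-style toolchain on an ELF target.  The symbol name
hypothetical_newer_libc_convert is invented for illustration and is not
the routine the patch actually references; the point is only that an
undefined weak symbol resolves to a null address, so the caller can test
for it at run time and take the fallback path.

```cpp
#include <string>

// Hypothetical routine that only a newer C library would export.  Declaring
// it weak means the link succeeds even when no definition exists; in that
// case its address is null.
extern "C" int hypothetical_newer_libc_convert(const char *)
    __attribute__((weak));

// Report which conversion path would be taken.
std::string choose_conversion_path() {
  if (hypothetical_newer_libc_convert != nullptr)
    return "direct";    // library support present: convert directly
  return "fallback";    // older library: go through an intermediate format
}
```

Since nothing defines the hypothetical symbol, choose_conversion_path ()
takes the fallback branch here, which corresponds to the older-GLIBC
case in the message.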


[PATCH] aarch64: Relax flags of saturation builtins

2021-01-19 Thread Kyrylo Tkachov via Gcc-patches
Hi all,

This patch relaxes the flags for the saturating arithmetic builtins to NONE, 
allowing for more optimisation.
Bootstrapped and tested on aarch64-none-linux-gnu.

Pushing to trunk.
Thanks,
Kyrill

gcc/ChangeLog

* config/aarch64/aarch64-simd-builtins.def (sqshl, uqshl, sqrshl, 
uqrshl,
sqadd, uqadd, sqsub, uqsub, suqadd, usqadd, sqmovn, uqmovn, sqxtn2,
uqxtn2, sqabs, sqneg, sqdmlal, sqdmlsl, sqdmlal_lane, sqdmlsl_lane,
sqdmlal_laneq, sqdmlsl_laneq, sqdmlal_n, sqdmlsl_n, sqdmlal2,
sqdmlsl2, sqdmlal2_lane, sqdmlsl2_lane, sqdmlal2_laneq, sqdmlsl2_laneq,
sqdmlal2_n, sqdmlsl2_n, sqdmull, sqdmull_lane, sqdmull_laneq, sqdmull_n,
sqdmull2, sqdmull2_lane, sqdmull2_laneq, sqdmull2_n, sqdmulh, sqrdmulh,
sqdmulh_lane, sqdmulh_laneq, sqrdmulh_lane, sqrdmulh_laneq, sqshrun_n,
sqrshrun_n, sqshrn_n, uqshrn_n, sqrshrn_n, uqrshrn_n, sqshlu_n, sqshl_n,
uqshl_n, sqrdmlah, sqrdmlsh, sqrdmlah_lane, sqrdmlsh_lane, 
sqrdmlah_laneq,
sqrdmlsh_laneq, sqmovun): Use NONE flags.


sat-flags.patch
Description: sat-flags.patch


Re: [committed] Skip asm goto test fails on hppa

2021-01-19 Thread Hans-Peter Nilsson
On Tue, 19 Jan 2021, Hans-Peter Nilsson wrote:
> On Tue, 19 Jan 2021, Richard Sandiford wrote:
>
> > Hans-Peter Nilsson  writes:
> > > On Tue, 19 Jan 2021, Jakub Jelinek wrote:
> > >> I think "IRA target" is not the right term, all targets are IRA targets,
> > >> but only some are using LRA and others use the old reload.
> > >
> > > But, IRA isn't used when...
> >
> > Not sure what was elided here,
>
> ...'when the that has made a transition and no longer has
> transitional artefacts like "-mlra"'

FAOD: s/that/target/.  Bah.

brgds, H-P


Re: [PATCH] aarch64: Use GCC vector extensions for integer mls intrinsics

2021-01-19 Thread Richard Sandiford via Gcc-patches
Jonathan Wright  writes:
> Hi,
>
> As subject, this patch rewrites integer mls Neon intrinsics to use
> a - b * c rather than inline assembly code, allowing for better
> scheduling and optimization.
>
> Regression tested and bootstrapped on aarch64-none-linux-gnu - no
> issues.
>
> If ok, please commit to master (I don't have commit rights.)

Thanks for doing this.  The patch looks good from a functional
point of view.  I guess my only concern is that things like:

a = vmla_u8 (vmulq_u8 (b, c), d, e);

would become:

a = b * c + d * e;

and I don't think anything guarantees that the user's original
choice of instruction selection will be preserved.  We might end
up with the equivalent of:

a = vmla_u8 (vmulq_u8 (d, e), b, c);

giving different latencies.

If we added built-in functions instead, we could lower them to
IFN_FMA and IFN_FNMA, which support integers as well as floats,
and which stand a better chance of preserving the original grouping.

There again, the unfused floating-point MLAs already decompose
into separate multiplies and adds (although they can't of course
use IFN_FMA).

Any thoughts on doing it that way instead?

I'm not saying the patch shouldn't go in though, just thought it
was worth asking.

Thanks,
Richard
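
For reference, the a - b * c form used by the quoted patch can be
sketched with GCC's generic vector extensions, which also work on
non-Arm targets.  The type and function names below (i8x8, u8x8,
mls_s8) are made up for this illustration and are not the arm_neon.h
definitions.  The signed variant goes through unsigned element types, as
the patch does, presumably so that wraparound is well defined.

```cpp
#include <cstdint>

// 8-lane vectors of 8-bit elements (GCC/Clang vector extension).
typedef std::int8_t i8x8 __attribute__((vector_size(8)));
typedef std::uint8_t u8x8 __attribute__((vector_size(8)));

// Multiply-subtract: a - b * c in each lane.  The arithmetic is done on
// unsigned lanes so that overflow wraps modulo 256, then the bits are
// reinterpreted as signed lanes again.
i8x8 mls_s8(i8x8 a, i8x8 b, i8x8 c) {
  u8x8 r = (u8x8)a - (u8x8)b * (u8x8)c;
  return (i8x8)r;
}
```

Because this is ordinary arithmetic rather than an opaque asm statement,
the compiler is free to schedule it and to regroup it with neighbouring
operations, which is the reassociation concern raised above.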

>
> Thanks,
> Jonathan
>
> ---
>
> gcc/Changelog:
>
> 2021-01-14  Jonathan Wright  
>
> * config/aarch64/arm_neon.h (vmls_s8): Use C rather than asm.
> (vmls_s16): Likewise.
> (vmls_s32): Likewise.
> (vmls_u8): Likewise.
> (vmls_u16): Likewise.
> (vmls_u32): Likewise.
> (vmlsq_s8): Likewise.
> (vmlsq_s16): Likewise.
> (vmlsq_s32): Likewise.
> (vmlsq_u8): Likewise.
> (vmlsq_u16): Likewise.
> (vmlsq_u32): Likewise.
>
> diff --git a/gcc/config/aarch64/arm_neon.h b/gcc/config/aarch64/arm_neon.h
> index 608e582d25820062a409310e7f3fc872660f8041..ad04eab1e753aa86f20a8f6cc2717368b1840ef7 100644
> --- a/gcc/config/aarch64/arm_neon.h
> +++ b/gcc/config/aarch64/arm_neon.h
> @@ -7968,72 +7968,45 @@ __extension__ extern __inline int8x8_t
>  __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
>  vmls_s8 (int8x8_t __a, int8x8_t __b, int8x8_t __c)
>  {
> -  int8x8_t __result;
> -  __asm__ ("mls %0.8b,%2.8b,%3.8b"
> -   : "=w"(__result)
> -   : "0"(__a), "w"(__b), "w"(__c)
> -   : /* No clobbers */);
> -  return __result;
> +  uint8x8_t __result = (uint8x8_t) __a - (uint8x8_t) __b * (uint8x8_t) __c;
> +  return (int8x8_t) __result;
>  }
>  
>  __extension__ extern __inline int16x4_t
>  __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
>  vmls_s16 (int16x4_t __a, int16x4_t __b, int16x4_t __c)
>  {
> -  int16x4_t __result;
> -  __asm__ ("mls %0.4h,%2.4h,%3.4h"
> -   : "=w"(__result)
> -   : "0"(__a), "w"(__b), "w"(__c)
> -   : /* No clobbers */);
> -  return __result;
> +  uint16x4_t __result = (uint16x4_t) __a - (uint16x4_t) __b * (uint16x4_t) __c;
> +  return (int16x4_t) __result;
>  }
>  
>  __extension__ extern __inline int32x2_t
>  __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
>  vmls_s32 (int32x2_t __a, int32x2_t __b, int32x2_t __c)
>  {
> -  int32x2_t __result;
> -  __asm__ ("mls %0.2s,%2.2s,%3.2s"
> -   : "=w"(__result)
> -   : "0"(__a), "w"(__b), "w"(__c)
> -   : /* No clobbers */);
> -  return __result;
> +  uint32x2_t __result = (uint32x2_t) __a - (uint32x2_t) __b * (uint32x2_t) __c;
> +  return (int32x2_t) __result;
>  }
>  
>  __extension__ extern __inline uint8x8_t
>  __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
>  vmls_u8 (uint8x8_t __a, uint8x8_t __b, uint8x8_t __c)
>  {
> -  uint8x8_t __result;
> -  __asm__ ("mls %0.8b,%2.8b,%3.8b"
> -   : "=w"(__result)
> -   : "0"(__a), "w"(__b), "w"(__c)
> -   : /* No clobbers */);
> -  return __result;
> +  return __a - __b * __c;
>  }
>  
>  __extension__ extern __inline uint16x4_t
>  __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
>  vmls_u16 (uint16x4_t __a, uint16x4_t __b, uint16x4_t __c)
>  {
> -  uint16x4_t __result;
> -  __asm__ ("mls %0.4h,%2.4h,%3.4h"
> -   : "=w"(__result)
> -   : "0"(__a), "w"(__b), "w"(__c)
> -   : /* No clobbers */);
> -  return __result;
> +  return __a - __b * __c;
>  }
>  
>  __extension__ extern __inline uint32x2_t
>  __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
>  vmls_u32 (uint32x2_t __a, uint32x2_t __b, uint32x2_t __c)
>  {
> -  uint32x2_t __result;
> -  __asm__ ("mls %0.2s,%2.2s,%3.2s"
> -   : "=w"(__result)
> -   : "0"(__a), "w"(__b), "w"(__c)
> -   : /* No clobbers */);
> -  return __result;
> +  return __a - __b * __c;
>  }
>  
>  #define vmlsl_high_lane_s16(a, b, c, d) \
> @@ -8565,72 +8538,45 @@ __extension__ extern __inline int8x16_t
>  __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
>  vmlsq_s8 (int8x16_t __a, int8x16_t __b, int8x16_t __c)
>  {
> -  int8

Re: [r11-6759 Regression] FAIL: gcc.dg/debug/dwarf2/pr41445-7.c scan-assembler DW_TAG_variable[^\\r\\n]*[\\r\\n]+[^\\r\\n]*"varj[^\\r\\n]*DW_AT_name([^\\r\\n]*[\\r\\n]+[^\\r\\n]*DW_AT_)*[^\\r\\n]*[\\r

2021-01-19 Thread David Edelsohn via Gcc-patches
Thanks for fixing this, Jeff.

I didn't realize that the testcase was testing the explicit source
code line number hard-coded in the expected DWARF output.

It would have been nice if the testcase included a comment about the
purpose and to alert people that the expected output needs to change
if lines are added to the file.

Thanks, David

On Mon, Jan 18, 2021 at 4:40 PM sunil.k.pandey  wrote:
>
> On Linux/x86_64,
>
> b654d23a470af25442e496ba62b5558e7c3ff1e6 is the first bad commit
> commit b654d23a470af25442e496ba62b5558e7c3ff1e6
> Author: David Edelsohn 
> Date:   Sun Jan 17 18:18:56 2021 -0500
>
> testsuite: Skip DWARF 5 testcases on AIX.
>
> caused
>
> FAIL: gcc.dg/debug/dwarf2/pr41445-7.c scan-assembler 
> DW_TAG_variable[^\\r\\n]*[\\r\\n]+[^\\r\\n]*"vari[^\\r\\n]*DW_AT_name([^\\r\\n]*[\\r\\n]+[^\\r\\n]*DW_AT_)*[^\\r\\n]*[\\r\\n]+[^\\r\\n]*[^\\r\\n]*DW_AT_decl_line
>  \\((0xa|10)\\)
> FAIL: gcc.dg/debug/dwarf2/pr41445-7.c scan-assembler 
> DW_TAG_variable[^\\r\\n]*[\\r\\n]+[^\\r\\n]*"varj[^\\r\\n]*DW_AT_name([^\\r\\n]*[\\r\\n]+[^\\r\\n]*DW_AT_)*[^\\r\\n]*[\\r\\n]+[^\\r\\n]*[^\\r\\n]*DW_AT_decl_line
>  \\((0xa|10)\\)
>
> with GCC configured with
>
> ../../gcc/configure 
> --prefix=/local/skpandey/gccwork/toolwork/gcc-bisect-master/master/r11-6759/usr
>  --enable-clocale=gnu --with-system-zlib --with-demangler-in-ld 
> --with-fpmath=sse --enable-languages=c,c++,fortran --enable-cet --without-isl 
> --enable-libmpx x86_64-linux --disable-bootstrap
>
> To reproduce:
>
> $ cd {build_dir}/gcc && make check 
> RUNTESTFLAGS="dwarf2.exp=gcc.dg/debug/dwarf2/pr41445-7.c 
> --target_board='unix{-m32}'"
> $ cd {build_dir}/gcc && make check 
> RUNTESTFLAGS="dwarf2.exp=gcc.dg/debug/dwarf2/pr41445-7.c 
> --target_board='unix{-m32\ -march=cascadelake}'"
> $ cd {build_dir}/gcc && make check 
> RUNTESTFLAGS="dwarf2.exp=gcc.dg/debug/dwarf2/pr41445-7.c 
> --target_board='unix{-m64}'"
> $ cd {build_dir}/gcc && make check 
> RUNTESTFLAGS="dwarf2.exp=gcc.dg/debug/dwarf2/pr41445-7.c 
> --target_board='unix{-m64\ -march=cascadelake}'"
>
> (Please do not reply to this email, for question about this report, contact 
> me at skpgkp2 at gmail dot com)


Re: [Patch] OpenMP/Fortran: Fixes for {use,is}_device_ptr [PR98476]

2021-01-19 Thread Jakub Jelinek via Gcc-patches
On Mon, Jan 18, 2021 at 05:56:21PM +0100, Tobias Burnus wrote:
> gcc/testsuite/ChangeLog:
> 
>   PR fortran/98476
>   * gfortran.dg/gomp/map-3.f90: Update expected scan-dump-tree.
>   * gfortran.dg/gomp/is_device_ptr-1.f90: New test.
>   * gfortran.dg/gomp/use_device_ptr-2.f90: New test.

I'm getting
/usr/src/gcc/gcc/testsuite/gfortran.dg/gomp/is_device_ptr-2.f90:11:36: Error: 
Non-dummy object 'dd' in IS_DEVICE_PTR clause at (1)
compiler exited with status 1
FAIL: gfortran.dg/gomp/is_device_ptr-2.f90   -O   (test for errors, line 11)
FAIL: gfortran.dg/gomp/is_device_ptr-2.f90   -O  (test for excess errors)
Excess errors:
/usr/src/gcc/gcc/testsuite/gfortran.dg/gomp/is_device_ptr-2.f90:11:36: Error: 
Non-dummy object 'dd' in IS_DEVICE_PTR clause at (1)
failure everywhere, the test expects cc instead of dd to be printed.
Do we want it to diagnose both, or should the regexp accept any of them?

Jakub



Re: [r11-6759 Regression] FAIL: gcc.dg/debug/dwarf2/pr41445-7.c scan-assembler DW_TAG_variable[^\\r\\n]*[\\r\\n]+[^\\r\\n]*"varj[^\\r\\n]*DW_AT_name([^\\r\\n]*[\\r\\n]+[^\\r\\n]*DW_AT_)*[^\\r\\n]*[\\r

2021-01-19 Thread Jeff Law via Gcc-patches



On 1/19/21 10:46 AM, David Edelsohn wrote:
> Thanks for fixing this, Jeff.
No worries.

>
> I didn't realize that the testcase was testing the explicit source
> code line number hard-coded in the expected DWARF output.
Me neither.  It took a few minutes to see what was going on.  I kept
thinking something  must have changed in the compiler.

>
> It would have been nice if the testcase included a comment about the
> purpose and to alert people that the expected output needs to change
> if lines are added to the file.
Agreed.

jeff



Re: [PATCH] alias: Fix offset checks involving section anchors [PR92294]

2021-01-19 Thread Richard Sandiford via Gcc-patches
Jan Hubicka  writes:
>> On Mon, 18 Jan 2021, Richard Sandiford wrote:
>> 
>> > Jan Hubicka  writes:
>> > >> >> 
>> > >> >> Well, in tree-ssa code we do assume these to be either disjoint 
>> > >> >> objects
>> > >> >> or equal (in decl_refs_may_alias_p that continues in case
>> > >> >> compare_base_decls is -1).  I am not sure if we win much by treating
>> > >> >> them differently on RTL level. I would prefer staying consistent 
>> > >> >> here.
>> > >> 
>> > >> Yeah, I see your point.  My concern here was that the fallback case
>> > >> applies to SYMBOL_REFs without decls, which might not have been visible
>> > >> at the tree-ssa level.  E.g. they might be ABI-defined symbols that have
>> > >> no known relation to source-level constructs.
>> > >> 
>> > >> E.g. the small-data base symbol _gp on MIPS points at a fixed offset
>> > >> from the start of the small-data area (0x7ff0 IIRC).  If the target
>> > >> generated rtl code that used _gp directly, we could wrongly assume
>> > >> that _gp+X can't alias BASE+Y when X != Y, even though the real test
>> > >> for small-data BASEs would be whether X + 0x7ff0 != Y.
>> > >> 
>> > >> I don't think that could occur in tree-ssa.  No valid C code would
>> > >> be able to refer directly to _gp in this way.
>> > >> 
>> > >> On the other hand, I don't have a specific example of where this does
>> > >> go wrong, it's just a feeling that it might.  I can drop it if you
>> > >> think that's better.
>> > >
>> > > I would lean towards not disabling optimization when we have no good
>> > > reason for that - we already did it a bit too many times in aliasing code
>> > > and it is hard to figure out what optimizations are missed purposefully
>> > > and what are missed just as omission.
>> > >
>> > > We already committed to a very conservative assumption that every
>> > > external symbol can be an alias of another. I think we should have
>> > > originally required units that refer to the same memory location via
>> > > different symbols to declare it explicitly (i.e. make external alias to
>> > > external symbol), but we do not even allow external aliases (symtab
>> > > supports that though) and also it may depend on use of the module what
>> > > symbols are aliased.
>> > >
>> > > We also decided to disable TBAA for direct accesses to decls to allow
>> > > type punning using unions.
>> > >
>> > > This leaves the offset+range check as the only means of disambiguation.
>> > > While for modern programs global arrays are not common, for Fortran
>> > > stuff they are, so I would prefer to not cripple them even more.
>> > > (I am not sure how often the arrays are external though)
>> > 
>> > OK, the version below drops the new -2 return value and tries to
>> > clarify the comments in compare_base_symbol_refs.
>> > 
>> > Lightly tested on aarch64-linux-gnu so far.  Does it look OK if
>> > full tests pass?
>> 
>> OK from my side.
>
> OK too, thanks!
> Honza

Thanks, pushed to master after testing on aarch64-linux-gnu,
aarch64_be-elf and x86_64-linux-gnu.  I don't think it's suitable
for backports.

Richard
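
The section-anchor problem from the commit message earlier in this
thread (the [XB + XO, XB + XO + XS) ranges) can be sketched as plain
interval arithmetic.  This is only an illustration under assumed names
(Range, ranges_overlap, may_conflict_same_block); it is not the alias.c
code.  The idea shown is that when both bases live in the same block,
the known distance between them has to be added to one offset before
the two ranges are compared.

```cpp
#include <cassert>
#include <cstdint>

// Half-open byte range [start, start + size), relative to some base.
struct Range {
  std::int64_t start;
  std::int64_t size;
};

// True if the two half-open ranges share at least one byte.
bool ranges_overlap(Range a, Range b) {
  assert(a.size > 0 && b.size > 0);
  return a.start < b.start + b.size && b.start < a.start + a.size;
}

// x is relative to base XB, y is relative to base YB.  x_block_off and
// y_block_off are the (assumed known) offsets of XB and YB within the same
// block.  Rebase y onto XB before testing for overlap.
bool may_conflict_same_block(std::int64_t x_block_off, Range x,
                             std::int64_t y_block_off, Range y) {
  y.start += y_block_off - x_block_off;  // distance YB - XB
  return ranges_overlap(x, y);
}
```

With XB an anchor at block offset 0 and YB at block offset 16, the
access [XB + 16, XB + 20) and the access [YB, YB + 4) look disjoint if
only the raw offsets 16 and 0 are compared, but they conflict once the
distance of 16 is applied.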


[PATCH v2] fwprop: Allow (subreg (mem)) simplifications

2021-01-19 Thread Ilya Leoshkevich via Gcc-patches
v1: https://gcc.gnu.org/pipermail/gcc-patches/2021-January/563800.html

v1 -> v2: Allow (mem) -> (subreg) propagation only for single uses.

Bootstrapped and regtested on x86_64-redhat-linux, ppc64le-redhat-linux
and s390x-redhat-linux.  Ok for master?



Suppose we have:

(set (reg/v:TF 63) (mem/c:TF (reg/v:DI 62)))
(set (reg:FPRX2 66) (subreg:FPRX2 (reg/v:TF 63) 0))

It is clearly profitable to propagate the first insn into the second
one and get:

(set (reg:FPRX2 66) (mem/c:FPRX2 (reg/v:DI 62)))

fwprop actually manages to perform this, but doesn't think the result is
worth it, which results in unnecessary store/load sequences on s390.
Improve the situation by classifying SUBREG -> MEM changes as
profitable.

gcc/ChangeLog:

2021-01-15  Ilya Leoshkevich  

* fwprop.c (fwprop_propagation::classify_result): Allow
(subreg (mem)) simplifications.
---
 gcc/fwprop.c | 22 +-
 1 file changed, 17 insertions(+), 5 deletions(-)

diff --git a/gcc/fwprop.c b/gcc/fwprop.c
index eff8f7cc141..02d3d507cbc 100644
--- a/gcc/fwprop.c
+++ b/gcc/fwprop.c
@@ -176,7 +176,7 @@ namespace
 static const uint16_t CONSTANT = FIRST_SPARE_RESULT << 1;
 static const uint16_t PROFITABLE = FIRST_SPARE_RESULT << 2;
 
-fwprop_propagation (rtx_insn *, rtx, rtx);
+fwprop_propagation (rtx_insn *, insn_info *, rtx, rtx);
 
 bool changed_mem_p () const { return result_flags & CHANGED_MEM; }
 bool folded_to_constants_p () const;
@@ -185,13 +185,18 @@ namespace
 bool check_mem (int, rtx) final override;
 void note_simplification (int, uint16_t, rtx, rtx) final override;
 uint16_t classify_result (rtx, rtx);
+
+  private:
+const bool single_use_p;
   };
 }
 
 /* Prepare to replace FROM with TO in INSN.  */
 
-fwprop_propagation::fwprop_propagation (rtx_insn *insn, rtx from, rtx to)
-  : insn_propagation (insn, from, to)
+fwprop_propagation::fwprop_propagation (rtx_insn *insn, insn_info *def_insn,
+   rtx from, rtx to)
+: insn_propagation (insn, from, to),
+  single_use_p (def_insn->num_uses () == 1)
 {
   should_check_mems = true;
   should_note_simplifications = true;
@@ -262,6 +267,13 @@ fwprop_propagation::classify_result (rtx old_rtx, rtx new_rtx)
   && GET_MODE (new_rtx) == GET_MODE_INNER (GET_MODE (from)))
 return PROFITABLE;
 
+  /* Allow (subreg (mem)) -> (mem) simplifications.  Do not allow propagation
+ of (mem)s into multiple uses, since those are not profitable, as well as
+ creating new (mem/v)s, since DCE will not remove the old ones.  */
+  if (single_use_p && SUBREG_P (old_rtx) && MEM_P (new_rtx)
+  && !MEM_VOLATILE_P (new_rtx))
+return PROFITABLE;
+
   return 0;
 }
 
@@ -363,7 +375,7 @@ try_fwprop_subst_note (insn_info *use_insn, insn_info *def_insn,
   rtx_insn *use_rtl = use_insn->rtl ();
 
   insn_change_watermark watermark;
-  fwprop_propagation prop (use_rtl, dest, src);
+  fwprop_propagation prop (use_rtl, def_insn, dest, src);
   if (!prop.apply_to_rvalue (&XEXP (note, 0)))
 {
   if (dump_file && (dump_flags & TDF_DETAILS))
@@ -426,7 +438,7 @@ try_fwprop_subst_pattern (obstack_watermark &attempt, insn_change &use_change,
   rtx_insn *use_rtl = use_insn->rtl ();
 
   insn_change_watermark watermark;
-  fwprop_propagation prop (use_rtl, dest, src);
+  fwprop_propagation prop (use_rtl, def_insn, dest, src);
   if (!prop.apply_to_pattern (loc))
 {
   if (dump_file && (dump_flags & TDF_DETAILS))
-- 
2.26.2
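
The condition added to classify_result in the patch above can be
restated as a small standalone predicate.  This is a sketch with
invented types (RtxKind is a stand-in for the properties the patch
queries via SUBREG_P, MEM_P and MEM_VOLATILE_P), not GCC code: a
(subreg) to (mem) change counts as profitable only when the definition
has a single use and the new memory reference is not volatile.

```cpp
// Hypothetical summary of the properties of an rtx that the check needs.
struct RtxKind {
  bool is_subreg;
  bool is_mem;
  bool is_volatile_mem;
};

// Mirrors the shape of the new test: single use, old expression was a
// subreg, new expression is a non-volatile mem.
bool subreg_to_mem_profitable(bool single_use, const RtxKind &old_rtx,
                              const RtxKind &new_rtx) {
  return single_use && old_rtx.is_subreg && new_rtx.is_mem &&
         !new_rtx.is_volatile_mem;
}
```

The single-use requirement is the v1 -> v2 change: propagating a memory
load into several uses would duplicate the load rather than remove it.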



[PATCH] improve warning suppression for inlined functions (PR 98465, 98512)

2021-01-19 Thread Martin Sebor via Gcc-patches

std::string tends to trigger a class of false positive out of bounds
access warnings for code GCC cannot prove is unreachable because of
missing aliasing constraints, and that ends up expanded inline into
user code.  Simply inserting the contents of a constant char array
does that.  In GCC 10 these false positives are suppressed due to
-Wno-system-headers, but in GCC 11, to help detect calls rendered
invalid by user code passing in either incorrect or insufficiently
constrained arguments, -Wno-system-headers no longer has this effect
on invalid access warnings.

To solve the problem without at least partially reverting the change
and going back to the GCC 10 way of things for the affected subset
of calls (just memcpy and memmove), the attached patch enhances
the #pragma GCC diagnostic machinery to consider not just a single
location for inlined code but all locations at which an expression
and its callers are inlined all the way up the stack.  This gives
each author of a function involved in inlining the ability to
control a warning issued for the code, not just the user into whose
code all the calls end up inlined.  To resolve PR 98465, it lets us
suppress the false positives selectively in std::string rather
than across the board in GCC.
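
As a rough illustration of the use case described above (a sketch only, not
code from the patch), a library author can wrap an inline helper in
#pragma GCC diagnostic so that a warning is ignored at the helper's own
location.  The helper name copy_bytes and the choice of -Warray-bounds are
assumptions made for the example; whether the suppression also applies once
the helper is inlined into user code depends on the compiler honoring the
whole inlining stack, which is what the patch proposes.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical library helper: the author suppresses a warning for the
// code inside it, independently of what the caller does.
#pragma GCC diagnostic push
#pragma GCC diagnostic ignored "-Warray-bounds"
inline void copy_bytes (char *dst, const char *src, std::size_t n)
{
  std::memcpy (dst, src, n);
}
#pragma GCC diagnostic pop

// Hypothetical user code into which copy_bytes may be inlined.
int demo_copy ()
{
  char buf[8] = {};
  copy_bytes (buf, "abc", 4);   // copies 'a', 'b', 'c' and the terminator
  return buf[0] == 'a' && buf[3] == '\0';
}
```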

The solution is to provide a new pair of overloads for warning
functions that, instead of taking a single location argument, take
a tree node from which the location(s) are determined.  The tree
argument is indirect because the diagnostic machinery doesn't (and
cannot without more intrusive changes) at the moment depend on
the various tree definitions.  A nice feature of these overloads
is that they do away with the need for the %K directive (and in
the future also %G, with another enhancement to accept a gimple*
argument).

This patch depends on the fix for PR 98664 (already approved but
not yet checked in).  I've tested it on x86_64-linux.

To avoid fallout I tried to keep the changes to a minimum, and
so the design isn't as robust as I'd like it ultimately to be.
I plan to enhance it in stage 1.

Martin
PR middle-end/98465 - Bogus -Wstringop-overread in std::string
PR middle-end/98512 - “#pragma GCC diagnostic ignored” ineffective in conjunction with alias attribute

gcc/ChangeLog:

	PR middle-end/98465
	PR middle-end/98512
	* builtins.c (class diag_inlining_context): New class.
	(maybe_warn_for_bound): Adjust signature.  Use diag_inlining_context.
	(warn_for_access): Same.
	(check_access): Remove calls to tree_inlined_location.
	(expand_builtin_strncmp): Remove argument from calls to
	maybe_warn_for_bound.
	(warn_dealloc_offset): Adjust signature.  Use diag_inlining_context.
	(maybe_emit_free_warning): Remove calls to tree_inlined_location.
	* diagnostic-core.h (warning, warning_n): New overloads.
	* diagnostic-metadata.h (class diagnostic_metadata::location_context):
	New.
	(struct diagnostic_info): Declare.
	* diagnostic.c (location_context::locations): Define.
	(update_effective_level_from_pragmas): Use location_context to test
	inlined locations.
	(diagnostic_report_diagnostic): Set location context.
	(warning, warning_n): Define new overloads.
	* diagnostic.h (diagnostic_inhibit_notes):

gcc/cp/ChangeLog:

	* mapper-client.cc: Include headers needed by others.

libstdc++-v3/ChangeLog:

	PR middle-end/98465
	* include/bits/basic_string.tcc (_M_replace): Suppress false positive
	warnings.
	* testsuite/18_support/new_delete_placement.cc: Suppress valid warnings.
	* testsuite/20_util/monotonic_buffer_resource/allocate.cc: Same.
	* testsuite/20_util/unsynchronized_pool_resource/allocate.cc: Same.

gcc/testsuite/ChangeLog:

	PR middle-end/98512	
	* gcc.dg/pragma-diag-9.c: New test.

diff --git a/gcc/builtins.c b/gcc/builtins.c
index c1115a32d91..68f1ae042d8 100644
--- a/gcc/builtins.c
+++ b/gcc/builtins.c
@@ -39,7 +39,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "optabs.h"
 #include "emit-rtl.h"
 #include "recog.h"
-#include "diagnostic-core.h"
+#include "diagnostic.h"
 #include "alias.h"
 #include "fold-const.h"
 #include "fold-const-call.h"
@@ -79,6 +79,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "tree-outof-ssa.h"
 #include "attr-fnspec.h"
 #include "demangle.h"
+#include "tree-pretty-print.h"
 
 struct target_builtins default_target_builtins;
 #if SWITCHABLE_TARGET
@@ -749,6 +750,93 @@ is_builtin_name (const char *name)
   return false;
 }
 
+/* Class to override the base location context for an expression EXPR.  */
+
+class diag_inlining_context: public diagnostic_metadata::location_context
+{
+ public:
+  diag_inlining_context (tree expr): m_expr (expr), m_ao (), m_loc () { }
+
+  virtual void locations (vec &locs, diagnostic_info *di)
+  {
+set_locations (&locs, di);
+  }
+
+  virtual void set_location (diagnostic_info *);
+
+ private:
+  void set_locations (vec *, diagnostic_info *);
+
+  /* The expression for which a diagnostic is being issued.  */
+  tree m_expr;
+  /* The "abstract origin" of the diagnosed ex

c++: Fix null this pointer [PR 98624]

2021-01-19 Thread Nathan Sidwell


There's no need for this function to have an object, so make it
static and avoid UB.

gcc/cp/
* module.cc (trees_out::write_location): Make static.

--
Nathan Sidwell
diff --git i/gcc/cp/module.cc w/gcc/cp/module.cc
index 1fd0bcfe3eb..3b224b616c1 100644
--- i/gcc/cp/module.cc
+++ w/gcc/cp/module.cc
@@ -3727,7 +3727,7 @@ class GTY((chain_next ("%h.parent"), for_user)) module_state {
   static cpp_macro *deferred_macro (cpp_reader *, location_t, cpp_hashnode *);
 
  public:
-  void write_location (bytes_out &, location_t);
+  static void write_location (bytes_out &, location_t);
   location_t read_location (bytes_in &) const;
 
  public:


driver: do not check input file existence here [PR 98452]

2021-01-19 Thread Nathan Sidwell

Joseph,
I was relying on this patch on the modules branch, but didn't realize 
the implications when merging and thought it was just a cleanup.  I'm 
not sure why the driver wants to check here, rather than leave it to the 
compiler.  Seems to be optimizing for failure?  The only difference I can think
of is that the diagnostic might mention the driver name, rather than say
(cc1plus), but that's a different problem that I've also reported.


The existing code seems confused anyway, we look for NAME, but then 
report NAME + 1 if name's initial char indicates a response file '@'. 
IMHO if that's some subtle way of giving an error for an (already) 
failed missing response file, it seems odd.
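
For context, the check being removed also recognizes inputs of the form
FNAME@OFFSET (see the comment in the diff below).  The following is only a
standalone sketch of that split, written with the same strrchr/sscanf
approach; split_name and split_at_offset are hypothetical names, not driver
functions, and the file-existence tests of the real code are left out.

```cpp
#include <cstddef>
#include <cstdio>
#include <cstring>
#include <string>

// Hypothetical result type for the sketch.
struct split_name
{
  std::string fname;
  long offset;
  bool has_offset;
};

// Split "FNAME@OFFSET" at the last '@' when everything after it is a
// number; otherwise return the argument unchanged.
split_name split_at_offset (const char *arg)
{
  const char *p = std::strrchr (arg, '@');
  long offset = 0;
  int consumed = 0;
  if (p
      && p != arg   // a leading '@' denotes a response file, not an offset
      && std::sscanf (p, "@%li%n", &offset, &consumed) >= 1
      && std::strlen (p) == static_cast<std::size_t> (consumed))
    return { std::string (arg, p - arg), offset, true };
  return { std::string (arg), 0, false };
}
```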


The failure for modules concerns -x c++-system-header (or 
c++-user-header), where the compiler will search for the file on the 
appropriate include path.


booted & tested on x86_64-linux, ok?

gcc/
* gcc.c (process_command): Don't check OPT_SPECIAL_input_file
existence here.

--
Nathan Sidwell
diff --git i/gcc/gcc.c w/gcc/gcc.c
index 7dccfadfef2..aa5774af7e7 100644
--- i/gcc/gcc.c
+++ w/gcc/gcc.c
@@ -4811,44 +4811,12 @@ process_command (unsigned int decoded_options_count,
   if (decoded_options[j].opt_index == OPT_SPECIAL_input_file)
 	{
 	  const char *arg = decoded_options[j].arg;
-  const char *p = strrchr (arg, '@');
-  char *fname;
-	  long offset;
-	  int consumed;
+
 #ifdef HAVE_TARGET_OBJECT_SUFFIX
 	  arg = convert_filename (arg, 0, access (arg, F_OK));
 #endif
-	  /* For LTO static archive support we handle input file
-	 specifications that are composed of a filename and
-	 an offset like FNAME@OFFSET.  */
-	  if (p
-	  && p != arg
-	  && sscanf (p, "@%li%n", &offset, &consumed) >= 1
-	  && strlen (p) == (unsigned int)consumed)
-	{
-  fname = (char *)xmalloc (p - arg + 1);
-  memcpy (fname, arg, p - arg);
-  fname[p - arg] = '\0';
-	  /* Only accept non-stdin and existing FNAME parts, otherwise
-		 try with the full name.  */
-	  if (strcmp (fname, "-") == 0 || access (fname, F_OK) < 0)
-		{
-		  free (fname);
-		  fname = xstrdup (arg);
-		}
-	}
-	  else
-	fname = xstrdup (arg);
-
-  if (strcmp (fname, "-") != 0 && access (fname, F_OK) < 0)
-	{
-	  bool resp = fname[0] == '@' && access (fname + 1, F_OK) < 0;
-	  error ("%s: %m", fname + resp);
-	}
-  else
-	add_infile (arg, spec_lang);
+	  add_infile (arg, spec_lang);
 
-  free (fname);
 	  continue;
 	}
 


Re: [PATCH] c++: Always check access during late-parsing of members [PR58993]

2021-01-19 Thread Jason Merrill via Gcc-patches

On 1/12/21 5:26 PM, Patrick Palka wrote:

This patch removes a vestigial use of dk_no_check from
cp_parser_late_parsing_for_member, which ideally should have been
removed as part of the PR41437 patch that improved access checking
inside templates.



This allows us to correctly reject f1 and f2 in the testcase access34.C
below (whereas before we'd only reject f3).

Additional testing revealed an access issue when late-parsing a hidden
friend within a class template.  In the testcase friend68.C below, we're
tripping over the checking assert from friend_accessible_p(f, S::j, S, S)
during lookup of j in x.j (for which type_dependent_object_expression_p
returns false, which is why we're doing the lookup at parse time).  The
reason for the assert failure is that DECL_FRIENDLIST(S) contains f but
DECL_BEFRIENDING_CLASSES(f) is empty, and so friend_accessible_p (which
looks at DECL_BEFRIENDING_CLASSES) wants to return false, but is_friend
(which looks at DECL_FRIENDLIST) returns true.  For sake of symmetry one
would probably hope that DECL_BEFRIENDING_CLASSES(f) contains S, but
add_friend avoids updating DECL_BEFRIENDING_CLASSES when the class type
(S in this case) is dependent, for some reason.


Perhaps because it wasn't useful before the 41437 patch.  I wonder if 
this will cause other problems, but this patch seems like a good change 
to parallel the member case, so let's go ahead with it.  OK.



This patch works around this issue by making friend_accessible_p
consider the DECL_FRIEND_CONTEXT of SCOPE.  Thus we sidestep the
DECL_BEFRIENDING_CLASSES / DECL_FRIENDLIST asymmetry issue while
correctly validating the x.j access at parse time.
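
To make the situation concrete, here is a small valid example in the spirit
of the description above; it is a sketch, not the friend68.C testcase, and
the names box and peek are invented.  The hidden friend is defined inside a
class template and reads a private member of that class, which is the kind
of access that has to be validated while the template is parsed.

```cpp
template <typename T>
struct box
{
  // Hidden friend defined inside a (dependent) class template; it is
  // found only by argument-dependent lookup.
  friend int peek (box x) { return x.j; }

private:
  int j = 42;
};

int peek_int_box ()
{
  return peek (box<int> {});
}
```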

An earlier version of this patch called friend_accessible_p instead of
protected_accessible_p in the DECL_FRIEND_CONTEXT hunk below, but this
had the side effect of making us accept the ill-formed testcase
friend69.C below (ill-formed because a hidden friend is not actually a
class member, so g doesn't have access to B's members even though A is a
friend of B).

Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look like
the right approach?

gcc/cp/ChangeLog:

PR c++/41437
PR c++/58993
* search.c (friend_accessible_p): If scope is a hidden friend
defined inside a dependent class, consider access from the
defining class.
* parser.c (cp_parser_late_parsing_for_member): Don't push a
dk_no_check access state.

gcc/testsuite/ChangeLog:

PR c++/41437
PR c++/58993
* g++.dg/opt/pr87974.C: Adjust.
* g++.dg/template/access34.C: New test.
* g++.dg/template/friend68a.C: New test.
* g++.dg/template/friend68b.C: New test.
* g++.dg/template/friend69.C: New test.
---
  gcc/cp/parser.c  |  7 --
  gcc/cp/search.c  |  8 +++
  gcc/testsuite/g++.dg/opt/pr87974.C   |  1 +
  gcc/testsuite/g++.dg/template/access34.C | 29 
  gcc/testsuite/g++.dg/template/friend68.C | 13 +++
  gcc/testsuite/g++.dg/template/friend69.C | 18 +++
  6 files changed, 69 insertions(+), 7 deletions(-)
  create mode 100644 gcc/testsuite/g++.dg/template/access34.C
  create mode 100644 gcc/testsuite/g++.dg/template/friend68.C
  create mode 100644 gcc/testsuite/g++.dg/template/friend69.C

diff --git a/gcc/cp/parser.c b/gcc/cp/parser.c
index b95151bf90d..c26e16beb70 100644
--- a/gcc/cp/parser.c
+++ b/gcc/cp/parser.c
@@ -30825,10 +30825,6 @@ cp_parser_late_parsing_for_member (cp_parser* parser, tree member_function)
start_preparsed_function (member_function, NULL_TREE,
SF_PRE_PARSED | SF_INCLASS_INLINE);
  
-  /* Don't do access checking if it is a templated function.  */
-  if (processing_template_decl)
-   push_deferring_access_checks (dk_no_check);
-
/* #pragma omp declare reduction needs special parsing.  */
if (DECL_OMP_DECLARE_REDUCTION_P (member_function))
{
@@ -30842,9 +30838,6 @@ cp_parser_late_parsing_for_member (cp_parser* parser, tree member_function)
cp_parser_function_definition_after_declarator (parser,
/*inline_p=*/true);
  
-  if (processing_template_decl)
-   pop_deferring_access_checks ();
-
/* Leave the scope of the containing function.  */
if (function_scope)
pop_function_context ();
diff --git a/gcc/cp/search.c b/gcc/cp/search.c
index 1a9dba451c7..dd3773da4f7 100644
--- a/gcc/cp/search.c
+++ b/gcc/cp/search.c
@@ -698,6 +698,14 @@ friend_accessible_p (tree scope, tree decl, tree type, tree otype)
if (DECL_CLASS_SCOPE_P (scope)
  && friend_accessible_p (DECL_CONTEXT (scope), decl, type, otype))
return 1;
+  /* Perhaps SCOPE is a friend function defined inside a class from which
+DECL is accessible.  Checking this is necessary only when the class
+is dependent, for 

Re: [PATCH] c++: ICE when late parsing noexcept/NSDMI [PR98333]

2021-01-19 Thread Jason Merrill via Gcc-patches

On 1/12/21 9:13 PM, Marek Polacek wrote:

Since certain members of a class are a complete-class context
[class.mem.general]p7, we delay their parsing until the whole class has
been parsed.  For instance, NSDMIs and noexcept-specifiers.  The order
in which we perform this delayed parsing matters; we were first parsing
NSDMIs and only then did we parse noexcept-specifiers.  That turns out
to be wrong: since NSDMIs may use noexcept-specifiers, we must process
noexcept-specifiers first.  Otherwise we'll ICE in code that doesn't
expect to see DEFERRED_PARSE.
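
The dependency can be seen in a small example.  This is a sketch along the
lines of the testcase, but it uses the standard noexcept operator rather
than __has_nothrow_constructor, and the names outer and inner are invented:
the default member initializers of outer can only be evaluated once the
noexcept-specifier of inner's constructor has been parsed.

```cpp
struct outer
{
  template <bool N>
  struct inner
  {
    inner () noexcept (N) {}
  };

  // NSDMIs that depend on the noexcept-specifier above.
  bool a = noexcept (inner<true> ());
  bool b = noexcept (inner<false> ());
};

bool nsdmi_sees_noexcept ()
{
  outer t;
  return t.a && !t.b;
}
```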

This doesn't just shift the problem, noexcept-specifiers can use members
with a NSDMI just fine, and I've also tested a similar test with this
member function:

   bool f() { return __has_nothrow_constructor (S); }

and that compiled fine too.

Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk/10?


OK.


gcc/cp/ChangeLog:

PR c++/98333
* parser.c (cp_parser_class_specifier_1): Perform late-parsing
of noexcept-specifiers before late-parsing of NSDMIs.

gcc/testsuite/ChangeLog:

PR c++/98333
* g++.dg/cpp0x/noexcept62.C: New test.
---
  gcc/cp/parser.c | 44 -
  gcc/testsuite/g++.dg/cpp0x/noexcept62.C | 10 ++
  2 files changed, 31 insertions(+), 23 deletions(-)
  create mode 100644 gcc/testsuite/g++.dg/cpp0x/noexcept62.C

diff --git a/gcc/cp/parser.c b/gcc/cp/parser.c
index c713852fe93..92ea4a23d17 100644
--- a/gcc/cp/parser.c
+++ b/gcc/cp/parser.c
@@ -25008,31 +25008,10 @@ cp_parser_class_specifier_1 (cp_parser* parser)
  maybe_end_member_template_processing ();
}
vec_safe_truncate (unparsed_funs_with_default_args, 0);
-  /* Now parse any NSDMIs.  */
-  save_ccp = current_class_ptr;
-  save_ccr = current_class_ref;
-  FOR_EACH_VEC_SAFE_ELT (unparsed_nsdmis, ix, decl)
-   {
- if (class_type != DECL_CONTEXT (decl))
-   {
- if (pushed_scope)
-   pop_scope (pushed_scope);
- class_type = DECL_CONTEXT (decl);
- pushed_scope = push_scope (class_type);
-   }
- inject_this_parameter (class_type, TYPE_UNQUALIFIED);
- cp_parser_late_parsing_nsdmi (parser, decl);
-   }
-  vec_safe_truncate (unparsed_nsdmis, 0);
-  current_class_ptr = save_ccp;
-  current_class_ref = save_ccr;
-  if (pushed_scope)
-   pop_scope (pushed_scope);
  
/* If there are noexcept-specifiers that have not yet been processed,
-take care of them now.  */
-  class_type = NULL_TREE;
-  pushed_scope = NULL_TREE;
+take care of them now.  Do this before processing NSDMIs as they
+may depend on noexcept-specifiers already having been processed.  */
FOR_EACH_VEC_SAFE_ELT (unparsed_noexcepts, ix, decl)
{
  tree ctx = DECL_CONTEXT (decl);
@@ -25084,6 +25063,25 @@ cp_parser_class_specifier_1 (cp_parser* parser)
  maybe_end_member_template_processing ();
}
vec_safe_truncate (unparsed_noexcepts, 0);
+
+  /* Now parse any NSDMIs.  */
+  save_ccp = current_class_ptr;
+  save_ccr = current_class_ref;
+  FOR_EACH_VEC_SAFE_ELT (unparsed_nsdmis, ix, decl)
+   {
+ if (class_type != DECL_CONTEXT (decl))
+   {
+ if (pushed_scope)
+   pop_scope (pushed_scope);
+ class_type = DECL_CONTEXT (decl);
+ pushed_scope = push_scope (class_type);
+   }
+ inject_this_parameter (class_type, TYPE_UNQUALIFIED);
+ cp_parser_late_parsing_nsdmi (parser, decl);
+   }
+  vec_safe_truncate (unparsed_nsdmis, 0);
+  current_class_ptr = save_ccp;
+  current_class_ref = save_ccr;
if (pushed_scope)
pop_scope (pushed_scope);
  
diff --git a/gcc/testsuite/g++.dg/cpp0x/noexcept62.C b/gcc/testsuite/g++.dg/cpp0x/noexcept62.C

new file mode 100644
index 000..53606c79142
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp0x/noexcept62.C
@@ -0,0 +1,10 @@
+// PR c++/98333
+// { dg-do compile { target c++11 } }
+
+struct T {
+  template 
+  struct S {
+S () noexcept (N) {}
+  };
+  int a = __has_nothrow_constructor (S);
+};

base-commit: cfaaa6a1ca744c1a93fa08a3e7ab2a821383cac1





Re: [PATCH] c++: ICE with delayed noexcept and attribute used [PR97966]

2021-01-19 Thread Jason Merrill via Gcc-patches

On 1/12/21 9:13 PM, Marek Polacek wrote:

Another ICE with delayed noexcept parsing, but a bit gnarlier.

A function definition marked with __attribute__((used)) ought to be
emitted even when it is not referenced in a TU.  For a member function
template marked with __attribute__((used)) this means that it will
be instantiated: in instantiate_class_template_1 we have

11971   /* Instantiate members marked with attribute used.  */
11972   if (r != error_mark_node && DECL_PRESERVE_P (r))
11973 mark_used (r);

It is not so surprising that this doesn't work well with delayed
noexcept parsing: when we're processing the function template we delay
the parsing, so the member "foo" is found, but then when we're
instantiating it, "foo" hasn't yet been seen, which creates a
discrepancy and a crash ensues.  "foo" hasn't yet been seen because
instantiate_class_template_1 just loops over the class members and
instantiates right away.
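
As a simplified illustration of the attribute's effect (deliberately without
the delayed noexcept-specifier that triggers the ICE, and with invented
names), a member of a class template marked "used" is instantiated as soon
as the class itself is instantiated, even if nothing calls it:

```cpp
template <typename T>
struct holder
{
  // Marked used: expected to be emitted for every instantiation of
  // holder, whether or not it is referenced.
  __attribute__ ((used)) static int size_of_t ()
  {
    return static_cast<int> (sizeof (T));
  }
};

// Instantiating the class is enough to instantiate size_of_t as well.
holder<long> long_holder;
```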


That seems like the bug; we shouldn't instantiate any members until 
we're done instantiating the class.


Jason



Re: [PATCH] c++: Fix ICE with non-constant satisfaction [PR98644]

2021-01-19 Thread Jason Merrill via Gcc-patches

On 1/13/21 12:05 PM, Patrick Palka wrote:

In the below testcase, the expression of the atomic constraint after
substitution is (int *) NON_LVALUE_EXPR <1> != 0B which is not a C++
constant expression, but its TREE_CONSTANT flag is set (from build2),
so satisfy_atom fails to notice that it's non-constant (and we end
up tripping over the assert in satisfaction_value).

Since TREE_CONSTANT doesn't necessarily correspond to C++ constantness,
this patch makes satisfy_atom instead check is_rvalue_constant_expression.

Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK for
trunk/10?

gcc/cp/ChangeLog:

PR c++/98644
* constraint.cc (satisfy_atom): Check is_rvalue_constant_expression
instead of TREE_CONSTANT.

gcc/testsuite/ChangeLog:

PR c++/98644
* g++.dg/cpp2a/concepts-pr98644.C: New test.
---
  gcc/cp/constraint.cc  | 2 +-
  gcc/testsuite/g++.dg/cpp2a/concepts-pr98644.C | 7 +++
  2 files changed, 8 insertions(+), 1 deletion(-)
  create mode 100644 gcc/testsuite/g++.dg/cpp2a/concepts-pr98644.C

diff --git a/gcc/cp/constraint.cc b/gcc/cp/constraint.cc
index 9049d087859..f99a25dc8a4 100644
--- a/gcc/cp/constraint.cc
+++ b/gcc/cp/constraint.cc
@@ -2969,7 +2969,7 @@ satisfy_atom (tree t, tree args, sat_info info)
  {
result = maybe_constant_value (result, NULL_TREE,
 /*manifestly_const_eval=*/true);
-  if (!TREE_CONSTANT (result))


This should be sufficient.  If the result isn't constant, 
maybe_constant_value shouldn't return it with TREE_CONSTANT set.  See


  /* This isn't actually constant, so unset TREE_CONSTANT. 


in cxx_eval_outermost_constant_expr.

Jason



Re: [PATCH] c++: Crash when deducing template arguments [PR98659]

2021-01-19 Thread Jason Merrill via Gcc-patches

On 1/13/21 1:38 PM, Marek Polacek wrote:

maybe_instantiate_noexcept doesn't expect to see error_mark_node, so
the new callsite I introduced in r11-6476 needs to be properly guarded.


I'd rather fix maybe_instantiate_noexcept to deal with error_mark_node.


Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?

gcc/cp/ChangeLog:

PR c++/98659
* pt.c (resolve_overloaded_unification): Don't call
maybe_instantiate_noexcept with error_mark_node.

gcc/testsuite/ChangeLog:

PR c++/98659
* g++.dg/template/deduce8.C: New test.
---
  gcc/cp/pt.c |  2 +-
  gcc/testsuite/g++.dg/template/deduce8.C | 21 +
  2 files changed, 22 insertions(+), 1 deletion(-)
  create mode 100644 gcc/testsuite/g++.dg/template/deduce8.C

diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c
index 100c35f053c..83ecb0a2c3a 100644
--- a/gcc/cp/pt.c
+++ b/gcc/cp/pt.c
@@ -22382,7 +22382,7 @@ resolve_overloaded_unification (tree tparms,
  --function_depth;
}
  
-	  if (flag_noexcept_type)
+ if (flag_noexcept_type && fn != error_mark_node)
maybe_instantiate_noexcept (fn, tf_none);
  
  	  elem = TREE_TYPE (fn);

diff --git a/gcc/testsuite/g++.dg/template/deduce8.C b/gcc/testsuite/g++.dg/template/deduce8.C
new file mode 100644
index 000..430be426689
--- /dev/null
+++ b/gcc/testsuite/g++.dg/template/deduce8.C
@@ -0,0 +1,21 @@
+// PR c++/98659
+// { dg-do compile }
+
+template  struct enable_if;
+struct function {
+  template  void operator=(_F);
+};
+struct map {
+  function operator[](int);
+};
+enum { E };
+template  void foo ();
+template 
+typename enable_if::type foo ();
+
+void
+bar ()
+{
+  map m;
+  m[E] = foo;
+}

base-commit: 7d7ef413ef1b696dec2710ae0acc058bdc832686





Re: [PATCH v2] c++: ICE with USING_DECL redeclaration [PR98687]

2021-01-19 Thread Marek Polacek via Gcc-patches
On Mon, Jan 18, 2021 at 05:18:46PM -0500, Jason Merrill wrote:
> On 1/15/21 12:26 AM, Marek Polacek wrote:
> > My recent patch that introduced push_using_decl_bindings didn't
> > handle USING_DECL redeclaration, therefore things broke.  This
> > patch amends that.  Note that I don't know if the other parts of
> > finish_nonmember_using_decl are needed (e.g. the binding->type
> > setting) -- I couldn't trigger it by any of my hand-made testcases.
> 
> I'd expect it to be exercised by something along the lines of
> 
> struct A { };
> 
> void f()
> {
>   int A;
>   using ::A;
>   struct A a;
> }

Hmm, I already had a test for the struct stat hack, but I've added this one
to using64.C, thanks.
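
For readers unfamiliar with the "struct stat hack" mentioned here, the
following sketch (not one of the new tests; the names are made up) shows a
scope in which one name denotes both a variable and a class, with the
elaborated-type-specifier selecting the class:

```cpp
struct A { int tag = 1; };

int stat_hack_demo ()
{
  int A = 2;      // hides the class name for ordinary lookup
  struct A a;     // "struct A" still names the class
  return A + a.tag;
}
```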
 
> Let's factor the code out of finish_nonmember_using_decl rather than copy
> it.

Done here.  A small complication was that name_lookup is local to
name-lookup.c so I had to add an overload to handle this.

Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?

-- >8 --
My recent patch that introduced push_using_decl_bindings didn't
handle USING_DECL redeclaration, therefore things broke.  This patch
amends that by breaking a part of finish_nonmember_using_decl
out to a separate function, push_using_decl_bindings, and calling it.
It needs an overload, because name_lookup is only available inside
of name-lookup.c.
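
The shape of that arrangement, reduced to a standalone sketch with invented
names (lookup_state stands in for name_lookup, and the arithmetic is a
placeholder for the real work): the file-local function takes the internal
type, and the externally visible overload forwards a null pointer.

```cpp
// Stand-in for a type that is only visible inside one source file.
struct lookup_state { int extra; };

// File-local worker: LOOKUP may be null.
static int push_bindings (lookup_state *lookup, int value)
{
  if (lookup)
    value += lookup->extra;
  return value;
}

// Public overload for callers that cannot name lookup_state.
int push_bindings (int value)
{
  return push_bindings (nullptr, value);
}

// What a caller inside the same file would do.
int push_with_lookup (int value)
{
  lookup_state state = { 10 };
  return push_bindings (&state, value);
}
```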

gcc/cp/ChangeLog:

PR c++/98687
* name-lookup.c (push_using_decl_bindings): New, broken out of...
(finish_nonmember_using_decl): ...here.
* name-lookup.h (push_using_decl_bindings): Update declaration.
* pt.c (tsubst_expr): Update the call to push_using_decl_bindings.

gcc/testsuite/ChangeLog:

PR c++/98687
* g++.dg/lookup/using64.C: New test.
* g++.dg/lookup/using65.C: New test.
---
 gcc/cp/name-lookup.c  | 103 ++
 gcc/cp/name-lookup.h  |   2 +-
 gcc/cp/pt.c   |   3 +-
 gcc/testsuite/g++.dg/lookup/using64.C |  69 +
 gcc/testsuite/g++.dg/lookup/using65.C |  17 +
 5 files changed, 145 insertions(+), 49 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/lookup/using64.C
 create mode 100644 gcc/testsuite/g++.dg/lookup/using65.C

diff --git a/gcc/cp/name-lookup.c b/gcc/cp/name-lookup.c
index b4b6c0b81b5..843e5f305c0 100644
--- a/gcc/cp/name-lookup.c
+++ b/gcc/cp/name-lookup.c
@@ -6279,6 +6279,61 @@ pushdecl_namespace_level (tree x, bool hiding)
   return t;
 }
 
+/* Wrapper around push_local_binding to push the bindings for
+   a non-member USING_DECL with NAME and VALUE.  LOOKUP, if non-null,
+   is the result of name lookup during template parsing.  */
+
+static void
+push_using_decl_bindings (name_lookup *lookup, tree name, tree value)
+{
+  tree type = NULL_TREE;
+
+  cxx_binding *binding = find_local_binding (current_binding_level, name);
+  if (binding)
+{
+  value = binding->value;
+  type = binding->type;
+}
+
+  /* DR 36 questions why using-decls at function scope may not be
+ duplicates.  Disallow it, as C++11 claimed and PR 20420
+ implemented.  */
+  if (lookup)
+do_nonmember_using_decl (*lookup, true, true, &value, &type);
+
+  if (!value)
+;
+  else if (binding && value == binding->value)
+/* Redeclaration of this USING_DECL.  */;
+  else if (binding && binding->value && TREE_CODE (value) == OVERLOAD)
+{
+  /* We already have this binding, so replace it.  */
+  update_local_overload (IDENTIFIER_BINDING (name), value);
+  IDENTIFIER_BINDING (name)->value = value;
+}
+  else
+/* Install the new binding.  */
+push_local_binding (name, value, /*using=*/true);
+
+  if (!type)
+;
+  else if (binding && type == binding->type)
+;
+  else
+{
+  push_local_binding (name, type, /*using=*/true);
+  set_identifier_type_value (name, type);
+}
+}
+
+/* Overload for push_using_decl_bindings that doesn't take a name_lookup.  */
+
+void
+push_using_decl_bindings (tree name, tree value)
+{
+  push_using_decl_bindings (nullptr, name, value);
+}
+
 /* Process a using declaration in non-class scope.  */
 
 void
@@ -6395,43 +6450,7 @@ finish_nonmember_using_decl (tree scope, tree name)
   else
 {
   add_decl_expr (using_decl);
-
-  cxx_binding *binding = find_local_binding (current_binding_level, name);
-  tree value = NULL;
-  tree type = NULL;
-  if (binding)
-   {
- value = binding->value;
- type = binding->type;
-   }
-
-  /* DR 36 questions why using-decls at function scope may not be
-duplicates.  Disallow it, as C++11 claimed and PR 20420
-implemented.  */
-  do_nonmember_using_decl (lookup, true, true, &value, &type);
-
-  if (!value)
-   ;
-  else if (binding && value == binding->value)
-   ;
-  else if (binding && binding->value && TREE_CODE (value) == OVERLOAD)
-   {
- update_local_overload (IDENTIFIER_BINDING (name), value);
- IDENTIFIER_

Re: [PATCH] c++: ICE when mangling operator name [PR98545]

2021-01-19 Thread Jason Merrill via Gcc-patches

On 1/13/21 6:39 PM, Marek Polacek wrote:

r11-6301 added some asserts in mangle.c, and now we trip over one of
them.  In particular, it's the one asserting that we didn't get
IDENTIFIER_ANY_OP_P when mangling an expression with a dependent name.

As this testcase shows, it's possible to get that, so turn the assert
into an if and write "on".  That changes the mangling in the following
way:

With this patch:

$ c++filt _ZN1i1hIJ1adS1_EEEDTcldtdefpTonclspcvT__EEEDpS2_
decltype (((*this).(operator()))((a)(), (double)(), (a)())) i::h(a, double, a)

G++10:
$ c++filt _ZN1i1hIJ1adS1_EEEDTcldtdefpTclspcvT__EEEDpS2_
decltype (((*this).(operator()))((a)(), (double)(), (a)())) i::h(a, double, a)

clang++/icc:
$ c++filt _ZN1i1hIJ1adS1_EEEDTclonclspcvT__EEEDpS2_
decltype ((operator())((a)(), (double)(), (a)())) i::h(a, double, 
a)
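
The same comparison can be scripted instead of running c++filt by hand.
This is a sketch using abi::__cxa_demangle from <cxxabi.h>; the wrapper
name demangle is an assumption of the example, and no output is claimed
here for the three manglings above.

```cpp
#include <cstdlib>
#include <cxxabi.h>
#include <string>

// Return the demangled form of MANGLED, or MANGLED itself when it is
// not a valid mangled name.
std::string demangle (const char *mangled)
{
  int status = 0;
  char *out = abi::__cxa_demangle (mangled, nullptr, nullptr, &status);
  std::string result = (status == 0 && out) ? out : mangled;
  std::free (out);   // the buffer is malloc'ed by __cxa_demangle
  return result;
}
```

Calling demangle on each of the three strings and comparing the results
gives the same information as the c++filt runs.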

I'm not sure why we differ in the "(*this)." part


Is there a PR for that?


but at least the
suffix "onclspcvT__EEEDpS2_" is the same for all three compilers.  So
I hope the following fix makes sense.

Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?

gcc/cp/ChangeLog:

PR c++/98545
* mangle.c (write_expression): When the expression is a dependent name
and an operator name, write "on" before writing its name.

gcc/testsuite/ChangeLog:

PR c++/98545
* g++.dg/abi/mangle76.C: New test.
---
  gcc/cp/mangle.c |  3 ++-
  gcc/testsuite/g++.dg/abi/mangle76.C | 39 +
  2 files changed, 41 insertions(+), 1 deletion(-)
  create mode 100644 gcc/testsuite/g++.dg/abi/mangle76.C

diff --git a/gcc/cp/mangle.c b/gcc/cp/mangle.c
index 11eb8962d28..bb3c4b76d33 100644
--- a/gcc/cp/mangle.c
+++ b/gcc/cp/mangle.c
@@ -3349,7 +3349,8 @@ write_expression (tree expr)
else if (dependent_name (expr))
  {
tree name = dependent_name (expr);
-  gcc_assert (!IDENTIFIER_ANY_OP_P (name));
+  if (IDENTIFIER_ANY_OP_P (name))
+   write_string ("on");


Any mangling change needs to handle different -fabi-versions; see the 
similar code in write_member_name.


And why doesn't this go through write_member_name?


write_unqualified_id (name);
  }
else
diff --git a/gcc/testsuite/g++.dg/abi/mangle76.C b/gcc/testsuite/g++.dg/abi/mangle76.C
new file mode 100644
index 000..0c2964cbecb
--- /dev/null
+++ b/gcc/testsuite/g++.dg/abi/mangle76.C
@@ -0,0 +1,39 @@
+// PR c++/98545
+// { dg-do compile { target c++11 } }
+
+class a {
+public:
+  a();
+  template  a(b);
+};
+template  using c = a;
+class f {
+protected:
+  template  void operator()(d, double, e);
+};
+class i : f {
+public:
+  template 
+  [[gnu::used]] auto h(g...) -> decltype(operator()(g()...)) {}
+// { dg-final { scan-assembler "_ZN1i1hIJ1adS1_EEEDTcldtdefpTonclspcvT__EEEDpS2_" } }
+};
+template  class C {
+public:
+  template  C(j);
+  i k() const;
+  int operator()() {
+int l = 10;
+c<> m, n;
+operator()(m, l, n);
+return 0;
+  }
+  int operator()(c<> &, c<> const &, c<> const &) const;
+  template  void k(d m, double gamma, e o) const {
+k().h(m, gamma, o);
+  }
+};
+template  int C::operator()(c<> &, c<> const &, c<> const &) const 
{
+  [&](c<> m, double gamma, c<> o) { k(m, gamma, o); };
+  return 0;
+}
+c<> p = C(p)();

base-commit: 796ead19f85372e59217c9888db688a2fe11b54f





Re: [PATCH] c++: Fix up potential_constant_expression_1 FOR/WHILE_STMT handling [PR98672]

2021-01-19 Thread Jason Merrill via Gcc-patches

On 1/15/21 11:26 AM, Jakub Jelinek wrote:

The following testcase is rejected even when it is valid.
The problem is that potential_constant_expression_1 doesn't have the
accurate *jump_target tracking cxx_eval_* has, and when the loop has
a condition that isn't guaranteed to be always true, the body isn't walked
at all.  That is mostly a correct conservative behavior, except that it
doesn't detect if there are any return statements in the body, which means
the loop might return instead of falling through to the next statement.
We already have code for return stmt discovery in code snippets we don't
try to evaluate for switches, so this patch reuses that for FOR_STMT
and WHILE_STMT bodies.
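
A short example of the kind of loop at issue (a sketch with an invented
function name, simpler than the testcase in the patch): the loop condition
is not a known constant, and the body can leave the function through a
return instead of falling through to the statement after the loop.

```cpp
// Requires C++14 constexpr.
constexpr int first_multiple_of_three (int start)
{
  for (int i = start; i < start + 3; ++i)
    if (i % 3 == 0)
      return i;     // the loop may return rather than fall through
  return -1;
}

static_assert (first_multiple_of_three (4) == 6, "");
```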


Hmm, IF_STMT probably also needs to check the else clause, if the 
condition isn't a known constant.



Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

Note, I haven't touched FOR_EXPR, with statement expressions it could
have return stmts in it too, or it could have break or continue statements
that wouldn't bind to the current loop but to something outer.  That
case is clearly mishandled by potential_constant_expression_1 even
when the condition is missing or is always true, and it wouldn't surprise me
if cxx_eval_* didn't handle it right either, so I'm deferring that to
separate PR for later.  We'd need proper test coverage for all of that.


Agreed.


2021-01-15  Jakub Jelinek  

PR c++/98672
* constexpr.c (potential_constant_expression_1) <case FOR_STMT>,
<case WHILE_STMT>: If the condition isn't constant true, check if
the loop body can contain a return stmt.

* g++.dg/cpp1y/constexpr-98672.C: New test.

--- gcc/cp/constexpr.c.jj   2021-01-13 19:19:44.368469462 +0100
+++ gcc/cp/constexpr.c  2021-01-14 12:02:27.347042704 +0100
@@ -8190,7 +8190,17 @@ potential_constant_expression_1 (tree t,
  /* If we couldn't evaluate the condition, it might not ever be
 true.  */
  if (!integer_onep (tmp))
-   return true;
+   {
+ /* Before returning true, check if the for body can contain
+a return.  */
+ hash_set<tree> pset;
+ check_for_return_continue_data data = { &pset, NULL_TREE };
+ if (tree ret_expr
+ = cp_walk_tree (&FOR_BODY (t), check_for_return_continue,
+ &data, &pset))
+   *jump_target = ret_expr;
+ return true;
+   }
}
if (!RECUR (FOR_EXPR (t), any))
return false;
@@ -8219,7 +8229,17 @@ potential_constant_expression_1 (tree t,
tmp = cxx_eval_outermost_constant_expr (tmp, true);
/* If we couldn't evaluate the condition, it might not ever be true.  */
if (!integer_onep (tmp))
-   return true;
+   {
+ /* Before returning true, check if the while body can contain
+a return.  */
+ hash_set<tree> pset;
+ check_for_return_continue_data data = { &pset, NULL_TREE };
+ if (tree ret_expr
+ = cp_walk_tree (&WHILE_BODY (t), check_for_return_continue,
+ &data, &pset))
+   *jump_target = ret_expr;
+ return true;
+   }
if (!RECUR (WHILE_BODY (t), any))
return false;
if (breaks (jump_target) || continues (jump_target))
--- gcc/testsuite/g++.dg/cpp1y/constexpr-98672.C.jj 2021-01-14 12:19:24.842438847 +0100
+++ gcc/testsuite/g++.dg/cpp1y/constexpr-98672.C 2021-01-14 12:07:33.935551155 +0100
@@ -0,0 +1,35 @@
+// PR c++/98672
+// { dg-do compile { target c++14 } }
+
+void
+foo ()
+{
+}
+
+constexpr int
+bar ()
+{
+  for (int i = 0; i < 5; ++i)
+return i;
+  foo ();
+  return 0;
+}
+
+constexpr int
+baz ()
+{
+  int i = 0;
+  while (i < 5)
+{
+  if (i == 3)
+   return i;
+  else
+   ++i;
+}
+  foo ();
+  return 0;
+}
+
+constexpr int i = bar ();
+constexpr int j = baz ();
+static_assert (i == 0 && j == 3, "");

Jakub





Re: [PATCH] c++: Fix excessive instantiation inside decltype [PR71879]

2021-01-19 Thread Jason Merrill via Gcc-patches

On 1/18/21 12:31 AM, Patrick Palka wrote:

Here after resolving the address of a template-id inside decltype, we
end up instantiating the chosen specialization from the call to
mark_used in resolve_nondeduced_context, even though only its type is
needed.

Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK for
trunk?

gcc/cp/ChangeLog:

PR c++/71879
* semantics.c (finish_decltype_type): Temporarily increment
cp_unevaluated_operand during call to resolve_nondeduced_context.

gcc/testsuite/ChangeLog:

PR c++/71879
* g++.dg/cpp0x/decltype-71879.C: New test.
---
  gcc/cp/semantics.c  | 2 ++
  gcc/testsuite/g++.dg/cpp0x/decltype-71879.C | 5 +
  2 files changed, 7 insertions(+)
  create mode 100644 gcc/testsuite/g++.dg/cpp0x/decltype-71879.C

diff --git a/gcc/cp/semantics.c b/gcc/cp/semantics.c
index c8a6283b120..cad55665ce8 100644
--- a/gcc/cp/semantics.c
+++ b/gcc/cp/semantics.c
@@ -10098,7 +10098,9 @@ finish_decltype_type (tree expr, bool id_expression_or_member_access_p,
  
/* The type denoted by decltype(e) is defined as follows:  */
  
+  ++cp_unevaluated_operand;

expr = resolve_nondeduced_context (expr, complain);
+  --cp_unevaluated_operand;


Hmm, is there a reason not to have cp_unevaluated_operand set through 
the whole function?  We might stick 'cp_unevaluated u;' at the top of 
the function and remove the existing messing with 
cp_unevaluated_operand.  Or assert that it's set and fix the callers to 
leave it set longer.
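
The suggestion above relies on an RAII sentinel: the counter is bumped in a constructor and restored in the destructor, so every exit path of the function restores it.  A minimal standalone sketch of that idea follows.  The names unevaluated_depth, unevaluated_guard and finish_type are hypothetical stand-ins for illustration, not the front end's actual declarations.

```cpp
#include <cassert>

// Hypothetical stand-in for a global "unevaluated operand" counter.
static int unevaluated_depth = 0;

// RAII guard: increments the counter on construction and restores it on
// destruction, so early returns cannot leave it unbalanced.
struct unevaluated_guard {
  unevaluated_guard() { ++unevaluated_depth; }
  ~unevaluated_guard() {
    assert(unevaluated_depth > 0);
    --unevaluated_depth;
  }
  unevaluated_guard(const unevaluated_guard&) = delete;
  unevaluated_guard& operator=(const unevaluated_guard&) = delete;
};

// Toy function with an early return.  The guard covers the whole body,
// which replaces manual ++/-- pairs around individual calls.
int finish_type(bool invalid) {
  unevaluated_guard u;
  if (invalid)
    return -1;               // the counter is still restored on this path
  return unevaluated_depth;  // read while the guard is alive
}
```

Compared with a manual increment and decrement around one call, the guard keeps the counter set for the whole function and cannot be skipped by a return added later.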



if (invalid_nonstatic_memfn_p (input_location, expr, complain))
  return error_mark_node;
diff --git a/gcc/testsuite/g++.dg/cpp0x/decltype-71879.C b/gcc/testsuite/g++.dg/cpp0x/decltype-71879.C
new file mode 100644
index 000..9da4d40ca70
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp0x/decltype-71879.C
@@ -0,0 +1,5 @@
+// PR c++/71879
+// { dg-do compile { target c++11 } }
+
+template <class T> void f(T x) { x.fail(); }
+using R = decltype(&f<int>);





Re: [PATCH] c++: Defer access checking when processing bases [PR82613]

2021-01-19 Thread Jason Merrill via Gcc-patches

On 1/18/21 12:31 AM, Patrick Palka wrote:

When parsing the base-clause of a class declaration, we need to defer
access checking until the entire base-clause has been seen, so that
access can be properly checked relative to the scope of the class with
all its bases attached.  This allows us to accept the declaration of
struct D from Example 2 of [class.access.general] (access12.C below).

Similarly when substituting into the base-clause of a class template,
which is the subject of PR82613.
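
The effect of deferring can be illustrated with a toy model.  This is only a sketch under stated assumptions: ToyClass, DeferredChecks and check_base_clause are hypothetical names, and the "access check" is reduced to asking whether base A is already attached.  It models `struct C : A::B, A {}` where A::B is a protected member of A.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Toy class scope that only knows which bases have been attached so far.
struct ToyClass {
  std::vector<std::string> bases;
  bool derives_from(const std::string& name) const {
    for (const std::string& b : bases)
      if (b == name)
        return true;
    return false;
  }
};

// Records checks instead of running them, then replays them on request.
class DeferredChecks {
 public:
  void defer(std::function<bool()> check) {
    assert(static_cast<bool>(check));
    pending_.push_back(std::move(check));
  }
  // Run every recorded check; true only if all of them pass.
  bool perform_all() {
    bool ok = true;
    for (auto& check : pending_)
      ok = check() && ok;
    pending_.clear();
    return ok;
  }

 private:
  std::vector<std::function<bool()>> pending_;
};

// Process the base-clause "A::B, A".  Naming A::B is only allowed from a
// class derived from A, which is not yet true while A::B is being parsed.
bool check_base_clause(bool defer_checks) {
  ToyClass c;
  DeferredChecks checks;
  bool ok = true;
  auto b_accessible = [&c] { return c.derives_from("A"); };

  // First base-specifier: A::B.
  if (defer_checks)
    checks.defer(b_accessible);
  else
    ok = b_accessible() && ok;  // too early: A is not attached yet
  c.bases.push_back("A::B");

  // Second base-specifier: A itself, always nameable here.
  c.bases.push_back("A");

  // All bases are attached now, so deferred checks see the full class.
  return checks.perform_all() && ok;
}
```

In this model check_base_clause(false) rejects the declaration and check_base_clause(true) accepts it, which mirrors why the check has to wait until the whole base-clause has been seen.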

Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK for
trunk?


OK.


gcc/cp/ChangeLog:

PR c++/82613
* parser.c (cp_parser_class_head): Defer access checking when
parsing the base-clause until all bases are seen and attached
to the class type.
* pt.c (instantiate_class_template): Likewise when substituting
into dependent bases.

gcc/testsuite/ChangeLog:

PR c++/82613
* g++.dg/parse/access12.C: New test.
* g++.dg/template/access35.C: New test.
---
  gcc/cp/parser.c  | 30 ++--
  gcc/cp/pt.c  | 16 ++---
  gcc/testsuite/g++.dg/parse/access12.C| 24 +++
  gcc/testsuite/g++.dg/template/access35.C | 26 
  4 files changed, 75 insertions(+), 21 deletions(-)
  create mode 100644 gcc/testsuite/g++.dg/parse/access12.C
  create mode 100644 gcc/testsuite/g++.dg/template/access35.C

diff --git a/gcc/cp/parser.c b/gcc/cp/parser.c
index 88c6e2648cb..57843cd65c6 100644
--- a/gcc/cp/parser.c
+++ b/gcc/cp/parser.c
@@ -25578,19 +25578,11 @@ cp_parser_class_head (cp_parser* parser,
  
   is valid.  */
  
-  /* Get the list of base-classes, if there is one.  */

+  /* Get the list of base-classes, if there is one.  Defer access checking
+ until the entire list has been seen, as per [class.access.general].  */
+  push_deferring_access_checks (dk_deferred);
if (cp_lexer_next_token_is (parser->lexer, CPP_COLON))
-{
-  /* PR59482: enter the class scope so that base-specifiers are looked
-up correctly.  */
-  if (type)
-   pushclass (type);
-  bases = cp_parser_base_clause (parser);
-  /* PR59482: get out of the previously pushed class scope so that the
-subsequent pops pop the right thing.  */
-  if (type)
-   popclass ();
-}
+bases = cp_parser_base_clause (parser);
else
  bases = NULL_TREE;
  
@@ -25599,6 +25591,20 @@ cp_parser_class_head (cp_parser* parser,

if (type && cp_lexer_next_token_is (parser->lexer, CPP_OPEN_BRACE))
  xref_basetypes (type, bases);
  
+  /* Now that all bases have been seen and attached to the class, check

+ accessibility of the types named in the base-clause.  This must be
+ done relative to the class scope, so that we accept e.g.
+
+   class A { protected: struct B {}; };
+   struct C : A::B, A {}; // OK: A::B is accessible from C
+
+ as per [class.access.general].  */
+  if (type)
+pushclass (type);
+  pop_to_parent_deferring_access_checks ();
+  if (type)
+popclass ();
+
   done:
/* Leave the scope given by the nested-name-specifier.  We will
   enter the class scope itself while processing the members.  */
diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c
index a82324d23be..d5d3d2fd040 100644
--- a/gcc/cp/pt.c
+++ b/gcc/cp/pt.c
@@ -11825,17 +11825,14 @@ instantiate_class_template_1 (tree type)
  || COMPLETE_OR_OPEN_TYPE_P (TYPE_CONTEXT (type)));
  
base_list = NULL_TREE;

+  /* Defer access checking while we substitute into the types named in
+ the base-clause.  */
+  push_deferring_access_checks (dk_deferred);
if (BINFO_N_BASE_BINFOS (pbinfo))
  {
tree pbase_binfo;
-  tree pushed_scope;
int i;
  
-  /* We must enter the scope containing the type, as that is where

-the accessibility of types named in dependent bases are
-looked up from.  */
-  pushed_scope = push_scope (CP_TYPE_CONTEXT (type));
-
/* Substitute into each of the bases to determine the actual
 basetypes.  */
for (i = 0; BINFO_BASE_ITERATE (pbinfo, i, pbase_binfo); i++)
@@ -11877,9 +11874,6 @@ instantiate_class_template_1 (tree type)
  
/* The list is now in reverse order; correct that.  */

base_list = nreverse (base_list);
-
-  if (pushed_scope)
-   pop_scope (pushed_scope);
  }
/* Now call xref_basetypes to set up all the base-class
   information.  */
@@ -11897,6 +11891,10 @@ instantiate_class_template_1 (tree type)
   class, except we also need to push the enclosing classes.  */
push_nested_class (type);
  
+  /* Now check accessibility of the types named in its base-clause,

+ relative to the scope of the class.  */
+  pop_to_parent_deferring_access_checks ();
+
/* Now members are processed in the order of declaration.  */
for (member = CLASSTYPE_DECL_LIST (pattern);
 membe

Re: [PATCH] c++: Fix tsubsting CLASS_PLACEHOLDER_TEMPLATE [PR95434]

2021-01-19 Thread Jason Merrill via Gcc-patches

On 1/19/21 11:29 AM, Patrick Palka wrote:

Here, during partial instantiation of the generic lambda, tsubst_copy on
the CLASS_PLACEHOLDER_TEMPLATE of the CTAD placeholder U{0} yields a
(level-lowered) TEMPLATE_TEMPLATE_PARM rather than the corresponding
TEMPLATE_DECL.  This later confuses do_class_deduction which expects
that the CLASS_PLACEHOLDER_TEMPLATE is always a TEMPLATE_DECL.

Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK for
trunk?


OK.


gcc/cp/ChangeLog:

PR c++/95434
* pt.c (tsubst) : If tsubsting
CLASS_PLACEHOLDER_TEMPLATE yields a TEMPLATE_TEMPLATE_PARM,
adjust to its TEMPLATE_TEMPLATE_PARM_TEMPLATE_DECL.

gcc/testsuite/ChangeLog:

PR c++/95434
* g++.dg/cpp2a/lambda-generic9.C: New test.
---
  gcc/cp/pt.c  | 2 ++
  gcc/testsuite/g++.dg/cpp2a/lambda-generic9.C | 9 +
  2 files changed, 11 insertions(+)
  create mode 100644 gcc/testsuite/g++.dg/cpp2a/lambda-generic9.C

diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c
index d5d3d2fd040..2fed81520e3 100644
--- a/gcc/cp/pt.c
+++ b/gcc/cp/pt.c
@@ -15688,6 +15688,8 @@ tsubst (tree t, tree args, tsubst_flags_t complain, tree in_decl)
else if (tree pl = CLASS_PLACEHOLDER_TEMPLATE (t))
  {
pl = tsubst_copy (pl, args, complain, in_decl);
+   if (TREE_CODE (pl) == TEMPLATE_TEMPLATE_PARM)
+ pl = TEMPLATE_TEMPLATE_PARM_TEMPLATE_DECL (pl);
CLASS_PLACEHOLDER_TEMPLATE (r) = pl;
  }
  }
diff --git a/gcc/testsuite/g++.dg/cpp2a/lambda-generic9.C b/gcc/testsuite/g++.dg/cpp2a/lambda-generic9.C
new file mode 100644
index 000..20ceb370c38
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp2a/lambda-generic9.C
@@ -0,0 +1,9 @@
+// PR c++/95434
+// { dg-do compile { target c++20 } }
+
+template 
+void f() {
+  []  class U> { U{0}; };
+}
+
+template void f();





Re: [PATCH v2] c++: Crash when deducing template arguments [PR98659]

2021-01-19 Thread Marek Polacek via Gcc-patches
On Tue, Jan 19, 2021 at 03:41:57PM -0500, Jason Merrill wrote:
> On 1/13/21 1:38 PM, Marek Polacek wrote:
> > maybe_instantiate_noexcept doesn't expect to see error_mark_node, so
> > the new callsite I introduced in r11-6476 needs to be properly guarded.
> 
> I'd rather fix maybe_instantiate_noexcept to deal with error_mark_node.

Ok, here's v2.  I've checked all maybe_instantiate_noexcept calls to see
if they don't need to be guarded by != error_mark_node anymore but found
none.

Ok for trunk if the usual testing passes?

-- >8 --
maybe_instantiate_noexcept doesn't expect to see error_mark_node, but
the new callsite I introduced in r11-6476 can pass error_mark_node to
it.  So cope.
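
The design choice here, making the callee tolerate the error sentinel instead of guarding each call site, can be sketched in isolation.  FnNode, error_node and maybe_instantiate below are hypothetical names for illustration only and do not reproduce the real function.

```cpp
#include <cassert>

// Toy node standing in for a function declaration.
struct FnNode {
  bool deferred_noexcept;  // specification still needs instantiating
  bool instantiated;       // set once the work has been done
};

// A single shared sentinel object plays the role of an error marker.
FnNode error_node_storage = {false, false};
FnNode* const error_node = &error_node_storage;

// Returns false for the sentinel, so callers need no separate
// "!= error_node" guard before calling.
bool maybe_instantiate(FnNode* fn) {
  assert(fn != nullptr);
  if (fn == error_node)
    return false;  // cope with the sentinel up front
  if (!fn->deferred_noexcept)
    return true;   // nothing to do
  fn->instantiated = true;
  return true;
}
```

Handling the sentinel once in the callee covers call sites added in the future as well, which is the motivation given in the review.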

gcc/cp/ChangeLog:

PR c++/98659
* pt.c (maybe_instantiate_noexcept): Return false if FN is
error_mark_node.

gcc/testsuite/ChangeLog:

PR c++/98659
* g++.dg/template/deduce8.C: New test.
---
 gcc/cp/pt.c |  9 +
 gcc/testsuite/g++.dg/template/deduce8.C | 21 +
 2 files changed, 26 insertions(+), 4 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/template/deduce8.C

diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c
index 957140115e4..aa7a155815a 100644
--- a/gcc/cp/pt.c
+++ b/gcc/cp/pt.c
@@ -25456,7 +25456,8 @@ always_instantiate_p (tree decl)
 bool
 maybe_instantiate_noexcept (tree fn, tsubst_flags_t complain)
 {
-  tree fntype, spec, noex;
+  if (fn == error_mark_node)
+return false;
 
   /* Don't instantiate a noexcept-specification from template context.  */
   if (processing_template_decl
@@ -25475,13 +25476,13 @@ maybe_instantiate_noexcept (tree fn, tsubst_flags_t complain)
   return !DECL_MAYBE_DELETED (fn);
 }
 
-  fntype = TREE_TYPE (fn);
-  spec = TYPE_RAISES_EXCEPTIONS (fntype);
+  tree fntype = TREE_TYPE (fn);
+  tree spec = TYPE_RAISES_EXCEPTIONS (fntype);
 
   if (!spec || !TREE_PURPOSE (spec))
 return true;
 
-  noex = TREE_PURPOSE (spec);
+  tree noex = TREE_PURPOSE (spec);
   if (TREE_CODE (noex) != DEFERRED_NOEXCEPT
   && TREE_CODE (noex) != DEFERRED_PARSE)
 return true;
diff --git a/gcc/testsuite/g++.dg/template/deduce8.C b/gcc/testsuite/g++.dg/template/deduce8.C
new file mode 100644
index 000..430be426689
--- /dev/null
+++ b/gcc/testsuite/g++.dg/template/deduce8.C
@@ -0,0 +1,21 @@
+// PR c++/98659
+// { dg-do compile }
+
+template  struct enable_if;
+struct function {
+  template  void operator=(_F);
+};
+struct map {
+  function operator[](int);
+};
+enum { E };
+template  void foo ();
+template 
+typename enable_if::type foo ();
+
+void
+bar ()
+{
+  map m;
+  m[E] = foo;
+}

base-commit: c37f1d4081f5a19e39192d13e2a3acea13662e5a
-- 
2.29.2



Re: [PATCH v2] c++: ICE with USING_DECL redeclaration [PR98687]

2021-01-19 Thread Jason Merrill via Gcc-patches

On 1/19/21 3:47 PM, Marek Polacek wrote:

On Mon, Jan 18, 2021 at 05:18:46PM -0500, Jason Merrill wrote:

On 1/15/21 12:26 AM, Marek Polacek wrote:

My recent patch that introduced push_using_decl_bindings didn't
handle USING_DECL redeclaration, therefore things broke.  This
patch amends that.  Note that I don't know if the other parts of
finish_nonmember_using_decl are needed (e.g. the binding->type
setting) -- I couldn't trigger it by any of my hand-made testcases.


I'd expect it to be exercised by something along the lines of

struct A { };

void f()
{
   int A;
   using ::A;
   struct A a;
}


Hmm, I already had a test for the struct stat hack, but I've added this one
to using64.C, thanks.
  

Let's factor the code out of finish_nonmember_using_decl rather than copy
it.


Done here.  A small complication was that name_lookup is local to
name-lookup.c so I had to add an overload to handle this.

Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?

-- >8 --
My recent patch that introduced push_using_decl_bindings didn't
handle USING_DECL redeclaration, therefore things broke.  This patch
amends that by breaking a part of finish_nonmember_using_decl out
into a separate function, push_using_decl_bindings, and calling it.
It needs an overload, because name_lookup is only available inside
of name-lookup.c.
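
The overload arrangement described above is a common pattern when a parameter type is private to one source file.  A minimal sketch, assuming hypothetical names (local_lookup, push_bindings) and a toy notion of "binding" as a string pushed into a vector:

```cpp
#include <cassert>
#include <string>
#include <vector>

namespace {
// Stand-in for a type that is visible only inside one source file.
struct local_lookup {
  std::vector<std::string> results;
};

// Internal worker.  LOOKUP may be null when the caller has no lookup
// result to merge in.  Returns how many bindings were recorded.
int push_bindings(local_lookup* lookup, const std::string& name,
                  std::vector<std::string>& scope) {
  assert(!name.empty());
  int pushed = 0;
  if (lookup) {
    for (const std::string& r : lookup->results) {
      scope.push_back(name + "=" + r);
      ++pushed;
    }
  } else {
    scope.push_back(name);
    ++pushed;
  }
  return pushed;
}
}  // namespace

// Public overload: callers outside this file cannot name local_lookup,
// so they get an entry point that forwards a null lookup.
int push_bindings(const std::string& name, std::vector<std::string>& scope) {
  return push_bindings(nullptr, name, scope);
}
```

Only the second overload would be declared in a header; the first stays file-local together with the type it depends on.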

gcc/cp/ChangeLog:

PR c++/98687
* name-lookup.c (push_using_decl_bindings): New, broken out of...
(finish_nonmember_using_decl): ...here.
* name-lookup.h (push_using_decl_bindings): Update declaration.
* pt.c (tsubst_expr): Update the call to push_using_decl_bindings.

gcc/testsuite/ChangeLog:

PR c++/98687
* g++.dg/lookup/using64.C: New test.
* g++.dg/lookup/using65.C: New test.
---
  gcc/cp/name-lookup.c  | 103 ++
  gcc/cp/name-lookup.h  |   2 +-
  gcc/cp/pt.c   |   3 +-
  gcc/testsuite/g++.dg/lookup/using64.C |  69 +
  gcc/testsuite/g++.dg/lookup/using65.C |  17 +
  5 files changed, 145 insertions(+), 49 deletions(-)
  create mode 100644 gcc/testsuite/g++.dg/lookup/using64.C
  create mode 100644 gcc/testsuite/g++.dg/lookup/using65.C

diff --git a/gcc/cp/name-lookup.c b/gcc/cp/name-lookup.c
index b4b6c0b81b5..843e5f305c0 100644
--- a/gcc/cp/name-lookup.c
+++ b/gcc/cp/name-lookup.c
@@ -6279,6 +6279,61 @@ pushdecl_namespace_level (tree x, bool hiding)
return t;
  }
  
+/* Wrapper around push_local_binding to push the bindings for

+   a non-member USING_DECL with NAME and VALUE.  LOOKUP, if non-null,
+   is the result of name lookup during template parsing.  */
+
+static void
+push_using_decl_bindings (name_lookup *lookup, tree name, tree value)
+{
+  tree type = NULL_TREE;
+
+  cxx_binding *binding = find_local_binding (current_binding_level, name);
+  if (binding)
+{
+  value = binding->value;
+  type = binding->type;
+}
+
+  /* DR 36 questions why using-decls at function scope may not be
+ duplicates.  Disallow it, as C++11 claimed and PR 20420
+ implemented.  */
+  if (lookup)
+do_nonmember_using_decl (*lookup, true, true, &value, &type);
+
+  if (!value)
+;
+  else if (binding && value == binding->value)
+/* Redeclaration of this USING_DECL.  */;
+  else if (binding && binding->value && TREE_CODE (value) == OVERLOAD)
+{
+  /* We already have this binding, so replace it.  */
+  update_local_overload (IDENTIFIER_BINDING (name), value);
+  IDENTIFIER_BINDING (name)->value = value;
+}
+  else
+/* Install the new binding.  */
+push_local_binding (name, value, /*using=*/true);
+
+  if (!type)
+;
+  else if (binding && type == binding->type)
+;
+  else
+{
+  push_local_binding (name, type, /*using=*/true);
+  set_identifier_type_value (name, type);
+}
+}
+
+/* Overload for push_using_decl_bindings that doesn't take a name_lookup.  */
+
+void
+push_using_decl_bindings (tree name, tree value)
+{
+  push_using_decl_bindings (nullptr, name, value);
+}
+
  /* Process a using declaration in non-class scope.  */
  
  void

@@ -6395,43 +6450,7 @@ finish_nonmember_using_decl (tree scope, tree name)
else
  {
add_decl_expr (using_decl);
-
-  cxx_binding *binding = find_local_binding (current_binding_level, name);
-  tree value = NULL;
-  tree type = NULL;
-  if (binding)
-   {
- value = binding->value;
- type = binding->type;
-   }
-
-  /* DR 36 questions why using-decls at function scope may not be
-duplicates.  Disallow it, as C++11 claimed and PR 20420
-implemented.  */
-  do_nonmember_using_decl (lookup, true, true, &value, &type);
-
-  if (!value)
-   ;
-  else if (binding && value == binding->value)
-   ;
-  else if (binding && binding->value && TREE_CODE (value) == OVERLOAD)
-   {
- update_local_overload (IDENTIFIER_BINDING (name), value

Re: [PATCH v2] c++: Crash when deducing template arguments [PR98659]

2021-01-19 Thread Jason Merrill via Gcc-patches

On 1/19/21 4:09 PM, Marek Polacek wrote:

On Tue, Jan 19, 2021 at 03:41:57PM -0500, Jason Merrill wrote:

On 1/13/21 1:38 PM, Marek Polacek wrote:

maybe_instantiate_noexcept doesn't expect to see error_mark_node, so
the new callsite I introduced in r11-6476 needs to be properly guarded.


I'd rather fix maybe_instantiate_noexcept to deal with error_mark_node.


Ok, here's v2.  I've checked all maybe_instantiate_noexcept calls to see
if they don't need to be guarded by != error_mark_node anymore but found
none.

Ok for trunk if the usual testing passes?


OK.


-- >8 --
maybe_instantiate_noexcept doesn't expect to see error_mark_node, but
the new callsite I introduced in r11-6476 can pass error_mark_node to
it.  So cope.

gcc/cp/ChangeLog:

PR c++/98659
* pt.c (maybe_instantiate_noexcept): Return false if FN is
error_mark_node.

gcc/testsuite/ChangeLog:

PR c++/98659
* g++.dg/template/deduce8.C: New test.
---
  gcc/cp/pt.c |  9 +
  gcc/testsuite/g++.dg/template/deduce8.C | 21 +
  2 files changed, 26 insertions(+), 4 deletions(-)
  create mode 100644 gcc/testsuite/g++.dg/template/deduce8.C

diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c
index 957140115e4..aa7a155815a 100644
--- a/gcc/cp/pt.c
+++ b/gcc/cp/pt.c
@@ -25456,7 +25456,8 @@ always_instantiate_p (tree decl)
  bool
  maybe_instantiate_noexcept (tree fn, tsubst_flags_t complain)
  {
-  tree fntype, spec, noex;
+  if (fn == error_mark_node)
+return false;
  
/* Don't instantiate a noexcept-specification from template context.  */

if (processing_template_decl
@@ -25475,13 +25476,13 @@ maybe_instantiate_noexcept (tree fn, tsubst_flags_t complain)
return !DECL_MAYBE_DELETED (fn);
  }
  
-  fntype = TREE_TYPE (fn);

-  spec = TYPE_RAISES_EXCEPTIONS (fntype);
+  tree fntype = TREE_TYPE (fn);
+  tree spec = TYPE_RAISES_EXCEPTIONS (fntype);
  
if (!spec || !TREE_PURPOSE (spec))

  return true;
  
-  noex = TREE_PURPOSE (spec);

+  tree noex = TREE_PURPOSE (spec);
if (TREE_CODE (noex) != DEFERRED_NOEXCEPT
&& TREE_CODE (noex) != DEFERRED_PARSE)
  return true;
diff --git a/gcc/testsuite/g++.dg/template/deduce8.C b/gcc/testsuite/g++.dg/template/deduce8.C
new file mode 100644
index 000..430be426689
--- /dev/null
+++ b/gcc/testsuite/g++.dg/template/deduce8.C
@@ -0,0 +1,21 @@
+// PR c++/98659
+// { dg-do compile }
+
+template  struct enable_if;
+struct function {
+  template  void operator=(_F);
+};
+struct map {
+  function operator[](int);
+};
+enum { E };
+template  void foo ();
+template 
+typename enable_if::type foo ();
+
+void
+bar ()
+{
+  map m;
+  m[E] = foo;
+}

base-commit: c37f1d4081f5a19e39192d13e2a3acea13662e5a





Re: [PATCH] c++: ICE with constrained placeholder return type [PR98346]

2021-01-19 Thread Jason Merrill via Gcc-patches

On 1/15/21 11:37 AM, Patrick Palka wrote:

On Mon, 11 Jan 2021, Jason Merrill wrote:


On 1/7/21 4:06 PM, Patrick Palka wrote:

This is essentially a followup to r11-3714 -- we're ICEing from another
"unguarded" call to build_concept_check, this time in do_auto_deduction,
due to the presence of templated trees when !processing_template_decl.

Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK for
trunk and perhaps the 10 branch?

gcc/cp/ChangeLog:

PR c++/98346
* pt.c (do_auto_deduction): Temporarily increment
processing_template_decl before calling build_concept_check.

gcc/testsuite/ChangeLog:

PR c++/98346
* g++.dg/cpp2a/concepts-placeholder3.C: New test.
---
   gcc/cp/pt.c   |  2 ++
   .../g++.dg/cpp2a/concepts-placeholder3.C  | 15 +++
   2 files changed, 17 insertions(+)
   create mode 100644 gcc/testsuite/g++.dg/cpp2a/concepts-placeholder3.C

diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c
index beabcc4b027..111a694e0c5 100644
--- a/gcc/cp/pt.c
+++ b/gcc/cp/pt.c
@@ -29464,7 +29464,9 @@ do_auto_deduction (tree type, tree init, tree auto_node,
 cargs = targs;
/* Rebuild the check using the deduced arguments.  */
+   ++processing_template_decl;
check = build_concept_check (cdecl, cargs, tf_none);
+   --processing_template_decl;


This shouldn't be necessary; if processing_template_decl is 0, we should have
non-dependent args.

I think your patch only works for this testcase because the concept is trivial
and doesn't actually try to do anything with the arguments.

Handling of PLACEHOLDER_TYPE_CONSTRAINTS is overly complex, partly because the
'auto' is represented as an argument in its own constraints.

A constrained auto variable declaration has the same problem.


D'oh, good point..  We need to also substitute the template arguments of
the current instantiation into the constraint at some point.   This is
actually PR96443 / PR96444, which I reported and posted a patch for back
in August: https://gcc.gnu.org/pipermail/gcc-patches/2020-August/551375.html

The approach the August patch used was to substitute into the
PLACEHOLDER_TYPE_CONSTRAINTS during tsubst, which was ruled out.  We can
instead do the same substitution during do_auto_deduction, as in the
patch below.  Does this approach look better?  It seems consistent with
how type_deducible_p substitutes into the return-type-requirement of a
compound-requirement.

Alternatively we could not substitute into PLACEHOLDER_TYPE_CONSTRAINTS
at all and instead pass the targs of the enclosing function directly
into satisfaction, but that seems inconsistent with type_deducible_p.


That sounds better.

I think type_deducible_p is wrong; 7.5.7.3-4 make it clear that this 
should work the same as other satisfaction.



-- >8 --

Subject: [PATCH] c++: dependent constraint on placeholder return type
  [PR96443]

We're never substituting the template arguments of the enclosing
function into the constraint of a placeholder variable or return type,
which leads to errors during satisfaction when the constraint is
dependent.  This patch fixes this issue by doing the appropriate
substitution in do_auto_deduction before checking satisfaction.

Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK
for trunk?  Also tested on cmcstl2 and range-v3.

gcc/cp/ChangeLog:

PR c++/96443
* pt.c (do_auto_deduction): Try checking the placeholder
constraint at template parse time.  Substitute the template
arguments of the containing function into the placeholder
constraint.  If the constraint is still dependent, defer
deduction until instantiation time.

gcc/testsuite/ChangeLog:

PR c++/96443
* g++.dg/concepts/concepts-ts1.C: Add dg-bogus directive to the
call to f15 that we expect to accept.
* g++.dg/cpp2a/concepts-placeholder3.C: New test.
---
  gcc/cp/pt.c   | 19 ++-
  .../g++.dg/cpp2a/concepts-placeholder3.C  | 16 
  gcc/testsuite/g++.dg/cpp2a/concepts-ts1.C |  2 +-
  3 files changed, 35 insertions(+), 2 deletions(-)
  create mode 100644 gcc/testsuite/g++.dg/cpp2a/concepts-placeholder3.C

diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c
index c6b7318b378..b70a9a451e1 100644
--- a/gcc/cp/pt.c
+++ b/gcc/cp/pt.c
@@ -29455,7 +29455,7 @@ do_auto_deduction (tree type, tree init, tree auto_node,
  }
  
/* Check any placeholder constraints against the deduced type. */

-  if (flag_concepts && !processing_template_decl)
+  if (flag_concepts)
  if (tree check = NON_ERROR (PLACEHOLDER_TYPE_CONSTRAINTS (auto_node)))
{
  /* Use the deduced type to check the associated constraints. If we
@@ -29475,6 +29475,23 @@ do_auto_deduction (tree type, tree init, tree auto_node,
  else
cargs = targs;
  
+	if ((context == adc_return_type || context == adc_variable_ty

[PATCH] c++, v2: Fix up potential_constant_expression_1 FOR/WHILE_STMT handling [PR98672]

2021-01-19 Thread Jakub Jelinek via Gcc-patches
On Tue, Jan 19, 2021 at 04:01:49PM -0500, Jason Merrill wrote:
> Hmm, IF_STMT probably also needs to check the else clause, if the condition
> isn't a known constant.

You're right, I thought it was ok because it recurses with tf_none, but
if the then branch is potentially constant and only else returns, continues
or breaks, then as the enhanced testcase shows we were mishandling it too.
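
The scoping rule behind this can be sketched on a toy statement tree: a return anywhere in a body escapes, a break or continue inside a nested loop does not, and a break inside a switch does not while a continue does.  The types below (Kind, Stmt, JumpInfo) and the scan function are hypothetical and much simpler than the real walker, which returns the first RETURN_EXPR it finds rather than a set of flags.

```cpp
#include <cassert>
#include <vector>

enum class Kind { Block, If, Loop, Switch, Return, Break, Continue, Other };

// Toy statement tree (needs C++17 for a vector of an incomplete type).
struct Stmt {
  Kind kind;
  std::vector<Stmt> children;
};

// Which kinds of jump can escape from a statement to its parent.
struct JumpInfo {
  bool has_return = false;
  bool has_break = false;
  bool has_continue = false;
};

JumpInfo scan(const Stmt& s) {
  JumpInfo out;
  switch (s.kind) {
    case Kind::Return:
    case Kind::Break:
    case Kind::Continue:
      assert(s.children.empty());  // jump statements are leaves here
      out.has_return = s.kind == Kind::Return;
      out.has_break = s.kind == Kind::Break;
      out.has_continue = s.kind == Kind::Continue;
      return out;
    default:
      break;
  }
  // Merge what escapes from every child (both branches of an If included).
  for (const Stmt& c : s.children) {
    JumpInfo in = scan(c);
    out.has_return |= in.has_return;
    out.has_break |= in.has_break;
    out.has_continue |= in.has_continue;
  }
  if (s.kind == Kind::Loop) {
    // A nested loop consumes its own break and continue.
    out.has_break = false;
    out.has_continue = false;
  } else if (s.kind == Kind::Switch) {
    // A switch consumes break but lets continue through.
    out.has_break = false;
  }
  return out;
}
```

A caller that cannot evaluate a loop or if condition could use such a summary to decide that the code after the statement might not be reached, which is the situation the patch description covers.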

So like this then if it passes bootstrap/regtest?

2021-01-19  Jakub Jelinek  

PR c++/98672
* constexpr.c (check_for_return_continue_data): Add break_stmt member.
(check_for_return_continue): Also look for BREAK_STMT.  Handle
SWITCH_STMT by ignoring break_stmt from its body.
(potential_constant_expression_1) <case FOR_STMT>,
<case WHILE_STMT>: If the condition isn't constant true, check if
the loop body can contain a return stmt.
<case SWITCH_STMT>: Adjust check_for_return_continue_data initializer.
<case IF_STMT>: If recursion with tf_none is successful, merge
*jump_target from the branches - returns with highest priority, breaks
or continues lower.  If then branch is potentially constant and
doesn't return, check the else branch if it could return, break or
continue.

* g++.dg/cpp1y/constexpr-98672.C: New test.

--- gcc/cp/constexpr.c.jj   2021-01-14 12:49:50.500644142 +0100
+++ gcc/cp/constexpr.c  2021-01-19 22:44:17.845322567 +0100
@@ -7649,15 +7649,16 @@ check_automatic_or_tls (tree ref)
 struct check_for_return_continue_data {
   hash_set<tree> *pset;
   tree continue_stmt;
+  tree break_stmt;
 };
 
 /* Helper function for potential_constant_expression_1 SWITCH_STMT handling,
called through cp_walk_tree.  Return the first RETURN_EXPR found, or note
-   the first CONTINUE_STMT if RETURN_EXPR is not found.  */
+   the first CONTINUE_STMT and/or BREAK_STMT if RETURN_EXPR is not found.  */
 static tree
 check_for_return_continue (tree *tp, int *walk_subtrees, void *data)
 {
-  tree t = *tp, s;
+  tree t = *tp, s, b;
   check_for_return_continue_data *d = (check_for_return_continue_data *) data;
   switch (TREE_CODE (t))
 {
@@ -7669,6 +7670,11 @@ check_for_return_continue (tree *tp, int
d->continue_stmt = t;
   break;
 
+case BREAK_STMT:
+  if (d->break_stmt == NULL_TREE)
+   d->break_stmt = t;
+  break;
+
 #define RECUR(x) \
   if (tree r = cp_walk_tree (&x, check_for_return_continue, data,  \
 d->pset))  \
@@ -7680,16 +7686,20 @@ check_for_return_continue (tree *tp, int
   *walk_subtrees = 0;
   RECUR (DO_COND (t));
   s = d->continue_stmt;
+  b = d->break_stmt;
   RECUR (DO_BODY (t));
   d->continue_stmt = s;
+  d->break_stmt = b;
   break;
 
 case WHILE_STMT:
   *walk_subtrees = 0;
   RECUR (WHILE_COND (t));
   s = d->continue_stmt;
+  b = d->break_stmt;
   RECUR (WHILE_BODY (t));
   d->continue_stmt = s;
+  d->break_stmt = b;
   break;
 
 case FOR_STMT:
@@ -7698,16 +7708,28 @@ check_for_return_continue (tree *tp, int
   RECUR (FOR_COND (t));
   RECUR (FOR_EXPR (t));
   s = d->continue_stmt;
+  b = d->break_stmt;
   RECUR (FOR_BODY (t));
   d->continue_stmt = s;
+  d->break_stmt = b;
   break;
 
 case RANGE_FOR_STMT:
   *walk_subtrees = 0;
   RECUR (RANGE_FOR_EXPR (t));
   s = d->continue_stmt;
+  b = d->break_stmt;
   RECUR (RANGE_FOR_BODY (t));
   d->continue_stmt = s;
+  d->break_stmt = b;
+  break;
+
+case SWITCH_STMT:
+  *walk_subtrees = 0;
+  RECUR (SWITCH_STMT_COND (t));
+  b = d->break_stmt;
+  RECUR (SWITCH_STMT_BODY (t));
+  d->break_stmt = b;
   break;
 #undef RECUR
 
@@ -8190,7 +8212,18 @@ potential_constant_expression_1 (tree t,
  /* If we couldn't evaluate the condition, it might not ever be
 true.  */
  if (!integer_onep (tmp))
-   return true;
+   {
+ /* Before returning true, check if the for body can contain
+a return.  */
+ hash_set<tree> pset;
+ check_for_return_continue_data data = { &pset, NULL_TREE,
+ NULL_TREE };
+ if (tree ret_expr
+ = cp_walk_tree (&FOR_BODY (t), check_for_return_continue,
+ &data, &pset))
+   *jump_target = ret_expr;
+ return true;
+   }
}
   if (!RECUR (FOR_EXPR (t), any))
return false;
@@ -8219,7 +8252,18 @@ potential_constant_expression_1 (tree t,
tmp = cxx_eval_outermost_constant_expr (tmp, true);
   /* If we couldn't evaluate the condition, it might not ever be true.  */
   if (!integer_onep (tmp))
-   return true;
+   {
+ /* Before returning true, check if the while body can contain
+a return.  */
+ hash_set<tree> pset;
+ check_for_return_continue_data dat

Go patch committed: Initialize variables with go:embed directives

2021-01-19 Thread Ian Lance Taylor via Gcc-patches
This Go frontend patch initializes variables with go:embed directives.
This completes the compiler work for go:embed.  Bootstrapped and ran
Go testsuite on x86_64-pc-linux-gnu.  Committed to mainline.
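
One piece of the patch is a comparator that orders embedded file names by directory first and by final path element second, treating a name without a directory as living in ".".  A standalone sketch of that ordering follows.  The helper names (split_path, embed_less, sorted_embed_names) are hypothetical, and the sketch copies substrings for clarity where the patch compares in place.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Split "dir/elem" or "dir/elem/" into directory and element parts.
// A name with no '/' is given the directory ".".
static void split_path(const std::string& s, std::string* dir,
                       std::string* elem) {
  assert(dir != nullptr && elem != nullptr);
  std::string t = s;
  if (!t.empty() && t.back() == '/')
    t.pop_back();  // ignore one trailing slash
  std::string::size_type slash = t.rfind('/');
  if (slash == std::string::npos) {
    *dir = ".";
    *elem = t;
  } else {
    *dir = t.substr(0, slash);
    *elem = t.substr(slash + 1);
  }
}

// Strict weak ordering: compare directories, then elements.
bool embed_less(const std::string& a, const std::string& b) {
  std::string d1, e1, d2, e2;
  split_path(a, &d1, &e1);
  split_path(b, &d2, &e2);
  if (d1 != d2)
    return d1 < d2;
  return e1 < e2;
}

std::vector<std::string> sorted_embed_names(std::vector<std::string> names) {
  std::sort(names.begin(), names.end(), embed_less);
  return names;
}
```

With this ordering the names "b/x", "a", "a/y", "b" sort as "a", "b", "a/y", "b/x": both top-level entries come first because their directory is ".", whereas a plain string sort would place "a/y" before "b".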

Ian
eed40bca6f2eb3af0c811cf6ec9e123c5bf4907d
diff --git a/gcc/go/gofrontend/MERGE b/gcc/go/gofrontend/MERGE
index fb4ec30913e..f67c30a5d3a 100644
--- a/gcc/go/gofrontend/MERGE
+++ b/gcc/go/gofrontend/MERGE
@@ -1,4 +1,4 @@
-9e78cef2b689aa586dbf677fb47ea3f08f197b91
+83eea1930671ce2bba863582a67f2609bc4f9f36
 
 The first line of this file holds the git revision number of the last
 merge done from the gofrontend repository.
diff --git a/gcc/go/gofrontend/embed.cc b/gcc/go/gofrontend/embed.cc
index 7ee86746212..bea1003bc08 100644
--- a/gcc/go/gofrontend/embed.cc
+++ b/gcc/go/gofrontend/embed.cc
@@ -9,6 +9,8 @@
 #include "operator.h"
 #include "go-diagnostics.h"
 #include "lex.h"
+#include "types.h"
+#include "expressions.h"
 #include "gogo.h"
 
 #ifndef O_BINARY
@@ -301,7 +303,41 @@ Gogo::read_embedcfg(const char *filename)
   return;
 }
 
-  // TODO: Actually do something with patterns and files.
+  for (Json_value::map_iterator p = patterns->map_begin();
+   p != patterns->map_end();
+   ++p)
+{
+  if (p->second->classification() != Json_value::JSON_VALUE_ARRAY)
+   {
+ r.error("invalid embedcfg: Patterns entry is not an array");
+ return;
+   }
+  std::vector files;
+  p->second->get_and_clear_array(&files);
+
+  std::pair > val;
+  val.first = p->first;
+  std::pair ins =
+   this->embed_patterns_.insert(val);
+  if (!ins.second)
+   {
+ r.error("invalid embedcfg: duplicate Patterns entry");
+ return;
+   }
+  std::swap(ins.first->second, files);
+}
+
+  for (Json_value::map_iterator p = files->map_begin();
+   p != files->map_end();
+   ++p)
+{
+  if (p->second->classification() != Json_value::JSON_VALUE_STRING)
+   {
+ r.error("invalid embedcfg: Files entry is not a string");
+ return;
+   }
+  this->embed_files_[p->first] = p->second->to_string();
+}
 }
 
 // Read the contents of FILENAME into this->data_.  Returns whether it
@@ -641,3 +677,287 @@ Gogo::is_embed_imported() const
   // the package has been imported if there is at least one alias.
   return !p->second->aliases().empty();
 }
+
+// Implement the sort order for a list of embedded files, as discussed
+// at the docs for embed.FS.
+
+class Embedfs_sort
+{
+ public:
+  bool
+  operator()(const std::string& p1, const std::string& p2) const;
+
+ private:
+  void
+  split(const std::string&, size_t*, size_t*, size_t*) const;
+};
+
+bool
+Embedfs_sort::operator()(const std::string& p1, const std::string& p2) const
+{
+  size_t dirlen1, elem1, elemlen1;
+  this->split(p1, &dirlen1, &elem1, &elemlen1);
+  size_t dirlen2, elem2, elemlen2;
+  this->split(p2, &dirlen2, &elem2, &elemlen2);
+
+  if (dirlen1 == 0)
+{
+  if (dirlen2 > 0)
+   {
+ int i = p2.compare(0, dirlen2, ".");
+ if (i != 0)
+   return i > 0;
+   }
+}
+  else if (dirlen2 == 0)
+{
+  int i = p1.compare(0, dirlen1, ".");
+  if (i != 0)
+   return i < 0;
+}
+  else
+{
+  int i = p1.compare(0, dirlen1, p2, 0, dirlen2);
+  if (i != 0)
+   return i < 0;
+}
+
+  int i = p1.compare(elem1, elemlen1, p2, elem2, elemlen2);
+  return i < 0;
+}
+
+// Pick out the directory and file name components for comparison.
+
+void
+Embedfs_sort::split(const std::string& s, size_t* dirlen, size_t* elem,
+   size_t* elemlen) const
+{
+  size_t len = s.size();
+  if (len > 0 && s[len - 1] == '/')
+--len;
+  size_t slash = s.rfind('/', len - 1);
+  if (slash == std::string::npos)
+{
+  *dirlen = 0;
+  *elem = 0;
+  *elemlen = len;
+}
+  else
+{
+  *dirlen = slash;
+  *elem = slash + 1;
+  *elemlen = len - (slash + 1);
+}
+}
+
+// Convert the go:embed directives for a variable into an initializer
+// for that variable.
+
+Expression*
+Gogo::initializer_for_embeds(Type* type,
+const std::vector* embeds,
+Location loc)
+{
+  if (this->embed_patterns_.empty())
+{
+  go_error_at(loc,
+ ("invalid go:embed: build system did not "
+  "supply embed configuration"));
+  return Expression::make_error(loc);
+}
+
+  type = type->unalias();
+
+  enum {
+EMBED_STRING = 0,
+EMBED_BYTES = 1,
+EMBED_FS = 2
+  } embed_kind;
+
+  const Named_type* nt = type->named_type();
+  if (nt != NULL
+  && nt->named_object()->package() != NULL
+  && nt->named_object()->package()->pkgpath() == "embed"
+  && nt->name() == "FS")
+embed_kind = EMBED_FS;
+  else if (type->is_string_type())
+embed_kind = EMBED_STRING;
+  else if (type->is_slice_type()
+  && type->array_type()->element_type()->integer_type() != NULL
+ 

[PATCH 1/6 ver 3] rs6000, Fix arguments in altivec_vrlwmi and altivec_rlwdi builtins

2021-01-19 Thread Carl Love via Gcc-patches
Will, Segher:

This patch fixes the order of the arguments in the vec_rlmi and
vec_rlnm builtins.  The patch also adds a new test case to verify
the fix.
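
As a rough scalar model of what the new test expects from one 32-bit
element of vec_rlmi, the sketch below rotates the first argument and
inserts it into the second under a mask.  The helper names (mask32,
rotl32, rlmi32) are made up for illustration, and the mask is assumed
to be numbered from the most significant bit, as the test's mask loop
suggests.

```c
#include <stdint.h>

/* Build a 32-bit mask covering bits MB..ME, where bit 0 is the most
   significant bit.  Assumes MB <= ME <= 31.  */
static uint32_t
mask32 (unsigned mb, unsigned me)
{
  uint32_t mask = 0;
  for (unsigned i = mb; i <= me; i++)
    mask |= UINT32_C (0x80000000) >> i;
  return mask;
}

/* Rotate SRC left by SHIFT bits; only the low five bits of SHIFT are
   used, and a shift of zero is handled without an out-of-range shift.  */
static uint32_t
rotl32 (uint32_t src, unsigned shift)
{
  shift &= 31;
  return shift ? (src << shift) | (src >> (32 - shift)) : src;
}

/* Hypothetical scalar model of one element: the rotated SRC replaces
   the bits of INSERT_INTO wherever the mask is set.  */
static uint32_t
rlmi32 (uint32_t src, uint32_t insert_into,
        unsigned mb, unsigned me, unsigned shift)
{
  uint32_t mask = mask32 (mb, me);
  return (insert_into & ~mask) | (rotl32 (src, shift) & mask);
}
```

With the operands swapped, as before the fix, the mask would select
bits from the wrong source, which is what the runnable test is meant
to catch.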

The patch has been tested on
powerpc64-linux instead (Power 8 BE)
powerpc64-linux instead (Power 9 LE)
powerpc64-linux instead (Power 10 LE)

Please let me know if the patch is acceptable for mainline.

   Carl Love

--

gcc/ChangeLog

2021-01-12  Carl Love  

gcc/
* config/rs6000/altivec.md (altivec_vrlmi): Fix
bug in argument generation.

gcc/testsuite/
gcc.target/powerpc/check-builtin-vec_rlnm-runnable.c:
New runnable test case.
gcc.target/powerpc/vec-rlmi-rlnm.c: Update scan assembler times
for xxlor instruction.
---
 gcc/config/rs6000/altivec.md  |   6 +-
 .../powerpc/check-builtin-vec_rlnm-runnable.c | 233 ++
 .../gcc.target/powerpc/vec-rlmi-rlnm.c|   2 +-
 3 files changed, 237 insertions(+), 4 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/powerpc/check-builtin-vec_rlnm-runnable.c

diff --git a/gcc/config/rs6000/altivec.md b/gcc/config/rs6000/altivec.md
index fc19a8fc807..4d08cca2228 100644
--- a/gcc/config/rs6000/altivec.md
+++ b/gcc/config/rs6000/altivec.md
@@ -1982,12 +1982,12 @@
 
 (define_insn "altivec_vrlmi"
   [(set (match_operand:VIlong 0 "register_operand" "=v")
-(unspec:VIlong [(match_operand:VIlong 1 "register_operand" "0")
-   (match_operand:VIlong 2 "register_operand" "v")
+(unspec:VIlong [(match_operand:VIlong 1 "register_operand" "v")
+   (match_operand:VIlong 2 "register_operand" "0")
(match_operand:VIlong 3 "register_operand" "v")]
   UNSPEC_VRLMI))]
   "TARGET_P9_VECTOR"
-  "vrlmi %0,%2,%3"
+  "vrlmi %0,%1,%3"
   [(set_attr "type" "veclogical")])
 
 (define_insn "altivec_vrlnm"
diff --git a/gcc/testsuite/gcc.target/powerpc/check-builtin-vec_rlnm-runnable.c 
b/gcc/testsuite/gcc.target/powerpc/check-builtin-vec_rlnm-runnable.c
new file mode 100644
index 000..b97bc519c87
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/check-builtin-vec_rlnm-runnable.c
@@ -0,0 +1,233 @@
+/* { dg-do run } */
+/* { dg-require-effective-target powerpc_p9vector_ok } */
+/* { dg-options "-O2 -mdejagnu-cpu=power9 -save-temps" } */
+
+/* Verify the vec_rlm and vec_rlmi builtins works correctly.  */
+/* { dg-final { scan-assembler-times {\mvrldmi\M} 1 } } */
+
+#include 
+
+#define DEBUG 1
+
+#if DEBUG
+#include 
+#include 
+#endif
+
+void abort (void);
+
+int main ()
+{
+  int i;
+
+  vector unsigned int vec_arg1_int, vec_arg2_int, vec_arg3_int;
+  vector unsigned int vec_result_int, vec_expected_result_int;
+  
+  vector unsigned long long int vec_arg1_di, vec_arg2_di, vec_arg3_di;
+  vector unsigned long long int vec_result_di, vec_expected_result_di;
+
+  unsigned int mask_begin, mask_end, shift;
+  unsigned long long int mask;
+
+/* Check vec int version of vec_rlmi builtin */
+  mask = 0;
+  mask_begin = 0;
+  mask_end   = 4;
+  shift = 16;
+
+  for (i = 0; i < 31; i++)
+if ((i >= mask_begin) && (i <= mask_end))
+  mask |= 0x8000ULL >> i;
+
+  for (i = 0; i < 4; i++) {
+vec_arg1_int[i] = 0x12345678 + i*0x;
+vec_arg2_int[i] = 0xA1B1CDEF;
+vec_arg3_int[i] = mask_begin << 16 | mask_end << 8 | shift;
+
+/* do rotate */
+vec_expected_result_int[i] =  ( vec_arg2_int[i] & ~mask) 
+  | ((vec_arg1_int[i] << shift) | (vec_arg1_int[i] >> (32-shift))) & mask;
+  
+  }
+
+  /* vec_rlmi(arg1, arg2, arg3)
+ result - rotate each element of arg1 left and inserting it into arg2 
+   element of arg2 based on the mask specified in arg3.  The shift, mask
+   start and end is specified in arg3.  */
+  vec_result_int = vec_rlmi (vec_arg1_int, vec_arg2_int, vec_arg3_int);
+
+  for (i = 0; i < 4; i++) {
+if (vec_result_int[i] != vec_expected_result_int[i])
+#if DEBUG
+  printf("ERROR: i = %d, vec_rlmi int result 0x%x, does not match "
+"expected result 0x%x\n", i, vec_result_int[i],
+vec_expected_result_int[i]);
+#else
+  abort();
+#endif
+}
+
+/* Check vec long long int version of vec_rlmi builtin */
+  mask = 0;
+  mask_begin = 0;
+  mask_end   = 4;
+  shift = 16;
+
+  for (i = 0; i < 31; i++)
+if ((i >= mask_begin) && (i <= mask_end))
+  mask |= 0x8000ULL >> i;
+
+  for (i = 0; i < 2; i++) {
+vec_arg1_di[i] = 0x12345678 + i*0x;
+vec_arg2_di[i] = 0xA1B1C1D1E1F12345;
+vec_arg3_di[i] = mask_begin << 16 | mask_end << 8 | shift;
+
+/* do rotate */
+vec_expected_result_di[i] =  ( vec_arg2_di[i] & ~mask) 
+  | ((vec_arg1_di[i] << shift) | (vec_arg1_di[i] >> (64-shift))) & mask;
+  }
+
+  /* vec_rlmi(arg1, arg2, arg3)
+ result - rotate each element of arg1 left and inserting it into arg2 
+   element of arg2 based on the 

[PATCH 5/6 ver 3] rs6000, Add test 128-bit shifts for just the int128 type.

2021-01-19 Thread Carl Love via Gcc-patches
Will, Segher:

Patch 4 adds the vector 128-bit integer shift instruction support for
the V1TI type.  This patch also renames and moves the VSX_TI iterator
from vsx.md to VEC_TI in vector.md.  The uses of VEC_TI are also
updated.
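
For reference, a 128-bit logical shift can be modelled in portable C
with two 64-bit halves.  This is only a sketch of the arithmetic, not
what the patterns emit; the struct and function names are invented
here, and reducing the shift count modulo 128 is an assumption based
on the 7-bit shift field (bits[57:63]) mentioned in the patterns.

```c
#include <stdint.h>

/* A 128-bit value held as two 64-bit halves (illustrative only).  */
struct u128
{
  uint64_t hi;
  uint64_t lo;
};

/* Logical left shift of X by N bits, N taken modulo 128.  */
static struct u128
shl128 (struct u128 x, unsigned n)
{
  struct u128 r;
  n &= 127;
  if (n == 0)
    r = x;
  else if (n < 64)
    {
      r.hi = (x.hi << n) | (x.lo >> (64 - n));
      r.lo = x.lo << n;
    }
  else
    {
      r.hi = x.lo << (n - 64);
      r.lo = 0;
    }
  return r;
}

/* Logical right shift of X by N bits, N taken modulo 128.  */
static struct u128
shr128 (struct u128 x, unsigned n)
{
  struct u128 r;
  n &= 127;
  if (n == 0)
    r = x;
  else if (n < 64)
    {
      r.lo = (x.lo >> n) | (x.hi << (64 - n));
      r.hi = x.hi >> n;
    }
  else
    {
      r.lo = x.hi >> (n - 64);
      r.hi = 0;
    }
  return r;
}
```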

version 3:
  No additional functional changes.
  Tested on Power 8BE, Power 9, Power 10.
  
version 2:
  Re-tested the patch on Power 9 with no regression errors.

Carl Love



gcc/ChangeLog

2021-01-12  Carl Love  
* config/rs6000/altivec.md (altivec_vslq, altivec_vsrq):
Rename to altivec_vslq_, altivec_vsrq_, mode VEC_TI.
* config/rs6000/vector.md (VEC_TI): Was named VSX_TI in vsx.md.
(vashlv1ti3): Change to vashl3, mode VEC_TI.
(vlshrv1ti3): Change to vlshr3, mode VEC_TI.
* config/rs6000/vsx.md (VSX_TI): Remove define_mode_iterator. Update
uses of VSX_TI to VEC_TI.

gcc/testsuite/ChangeLog

2021-01-12  Carl Love  
gcc.target/powerpc/int_128bit-runnable.c: Add shift_right, shift_left
tests.
---
 gcc/config/rs6000/altivec.md  | 16 -
 gcc/config/rs6000/vector.md   | 27 ---
 gcc/config/rs6000/vsx.md  | 33 +--
 .../gcc.target/powerpc/int_128bit-runnable.c  | 16 +++--
 4 files changed, 52 insertions(+), 40 deletions(-)

diff --git a/gcc/config/rs6000/altivec.md b/gcc/config/rs6000/altivec.md
index cb83c5ce012..61ab5c9afb6 100644
--- a/gcc/config/rs6000/altivec.md
+++ b/gcc/config/rs6000/altivec.md
@@ -2221,10 +2221,10 @@
   "vsl %0,%1,%2"
   [(set_attr "type" "vecsimple")])
 
-(define_insn "altivec_vslq"
-  [(set (match_operand:V1TI 0 "vsx_register_operand" "=v")
-   (ashift:V1TI (match_operand:V1TI 1 "vsx_register_operand" "v")
-(match_operand:V1TI 2 "vsx_register_operand" "v")))]
+(define_insn "altivec_vslq_"
+  [(set (match_operand:VEC_TI 0 "vsx_register_operand" "=v")
+   (ashift:VEC_TI (match_operand:VEC_TI 1 "vsx_register_operand" "v")
+(match_operand:VEC_TI 2 "vsx_register_operand" "v")))]
   "TARGET_POWER10"
   /* Shift amount in needs to be in bits[57:63] of 128-bit operand. */
   "vslq %0,%1,%2"
@@ -2238,10 +2238,10 @@
   "vsr %0,%1,%2"
   [(set_attr "type" "vecsimple")])
 
-(define_insn "altivec_vsrq"
-  [(set (match_operand:V1TI 0 "vsx_register_operand" "=v")
-   (lshiftrt:V1TI (match_operand:V1TI 1 "vsx_register_operand" "v")
-  (match_operand:V1TI 2 "vsx_register_operand" "v")))]
+(define_insn "altivec_vsrq_"
+  [(set (match_operand:VEC_TI 0 "vsx_register_operand" "=v")
+   (lshiftrt:VEC_TI (match_operand:VEC_TI 1 "vsx_register_operand" "v")
+  (match_operand:VEC_TI 2 "vsx_register_operand" 
"v")))]
   "TARGET_POWER10"
   /* Shift amount in needs to be in bits[57:63] of 128-bit operand. */
   "vsrq %0,%1,%2"
diff --git a/gcc/config/rs6000/vector.md b/gcc/config/rs6000/vector.md
index 0f252c915b0..6a4cd69d866 100644
--- a/gcc/config/rs6000/vector.md
+++ b/gcc/config/rs6000/vector.md
@@ -26,6 +26,9 @@
 ;; Vector int modes
 (define_mode_iterator VEC_I [V16QI V8HI V4SI V2DI])
 
+;; 128-bit int modes
+(define_mode_iterator VEC_TI [V1TI TI])
+
 ;; Vector int modes for parity
 (define_mode_iterator VEC_IP [V8HI
  V4SI
@@ -1627,17 +1630,17 @@
   "")
 
 ;; No immediate version of this 128-bit instruction
-(define_expand "vashlv1ti3"
-  [(set (match_operand:V1TI 0 "vsx_register_operand" "=v")
-   (ashift:V1TI (match_operand:V1TI 1 "vsx_register_operand" "v")
-(match_operand:V1TI 2 "vsx_register_operand" "v")))]
+(define_expand "vashl3"
+  [(set (match_operand:VEC_TI 0 "vsx_register_operand" "=v")
+   (ashift:VEC_TI (match_operand:VEC_TI 1 "vsx_register_operand")
+(match_operand:VEC_TI 2 "vsx_register_operand")))]
   "TARGET_POWER10"
 {
   /* Shift amount in needs to be put in bits[57:63] of 128-bit operand2. */
-  rtx tmp = gen_reg_rtx (V1TImode);
+  rtx tmp = gen_reg_rtx (mode);
 
   emit_insn (gen_xxswapd_v1ti (tmp, operands[2]));
-  emit_insn (gen_altivec_vslq (operands[0], operands[1], tmp));
+  emit_insn(gen_altivec_vslq_ (operands[0], operands[1], tmp));
   DONE;
 })
 
@@ -1650,17 +1653,17 @@
   "")
 
 ;; No immediate version of this 128-bit instruction
-(define_expand "vlshrv1ti3"
-  [(set (match_operand:V1TI 0 "vsx_register_operand" "=v")
-   (lshiftrt:V1TI (match_operand:V1TI 1 "vsx_register_operand" "v")
-  (match_operand:V1TI 2 "vsx_register_operand" "v")))]
+(define_expand "vlshr3"
+  [(set (match_operand:VEC_TI 0 "vsx_register_operand" "=v")
+   (lshiftrt:VEC_TI (match_operand:VEC_TI 1 "vsx_register_operand")
+  (match_operand:VEC_TI 2 "vsx_register_operand")))]
   "TARGET_POWE

[PATCH 2/6 ver 3] RS6000 Add 128-bit Binary Integer sign extend operations

2021-01-19 Thread Carl Love via Gcc-patches
Will, Segher:

Patch 1 adds the 128-bit sign extension instruction support and the
corresponding builtin support.
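
As a scalar illustration of what the sign extend builtins compute per
element, the sketch below assumes they follow the Power ISA 3.0
vextsb2w/vextsh2w/vextsw2d behaviour suggested by the builtin names:
the low-order byte, halfword or word of each element is sign-extended
to the full element width.  The function names are hypothetical.

```c
#include <stdint.h>

/* Sign-extend the low byte of a 32-bit word element.  */
static int32_t
signext_b2w (uint32_t w)
{
  return (int32_t) ((w & 0xffu) ^ 0x80u) - 0x80;
}

/* Sign-extend the low halfword of a 32-bit word element.  */
static int32_t
signext_h2w (uint32_t w)
{
  return (int32_t) ((w & 0xffffu) ^ 0x8000u) - 0x8000;
}

/* Sign-extend the low word of a 64-bit doubleword element.  */
static int64_t
signext_w2d (uint64_t d)
{
  return (int64_t) ((d & 0xffffffffu) ^ 0x80000000u) - INT64_C (0x80000000);
}
```

The xor-and-subtract form avoids relying on implementation-defined
narrowing conversions.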

version 3:

  doc/extend.texi:  Fixed the "uThe" typo and added the colon at the
end of the line.

  p9-sign_extend-runnable.c: Changed the dg-do run to  *-*-linux 
 instead of powerpc*-*-linux.

  Tested on Power 8BE, Power9, Power10.

version 2:

  Removed the blank line per Will's latest feedback.

  Retested the patch on Power 9 with no regression errors.

Carl Love

--

gcc/ChangeLog

2021-01-12  Carl Love  
* config/rs6000/altivec.h (vec_signextll, vec_signexti): Add define
for new builtins.
* config/rs6000/rs6000-builtin.def (VSIGNEXTI, VSIGNEXTLL):  Add
overloaded builtin definitions.
(VSIGNEXTSB2W, VSIGNEXTSH2W, VSIGNEXTSB2D, VSIGNEXTSH2D,VSIGNEXTSW2D):
Add builtin expansions.
* config/rs6000-call.c (P9V_BUILTIN_VEC_VSIGNEXTI,
P9V_BUILTIN_VEC_VSIGNEXTLL): Add overloaded argument definitions.
* config/rs6000/vsx.md: Make define_insn vsx_sign_extend_si_v2di
visible.
* doc/extend.texi:  Add documentation for the vec_signexti and
vec_signextll builtins.

gcc/testsuite/ChangeLog

2021-01-12  Carl Love  
* gcc.target/powerpc/p9-sign_extend-runnable.c:  New test case.
---
 gcc/config/rs6000/altivec.h   |   2 +
 gcc/config/rs6000/rs6000-builtin.def  |   9 ++
 gcc/config/rs6000/rs6000-call.c   |  13 ++
 gcc/config/rs6000/vsx.md  |   2 +-
 gcc/doc/extend.texi   |  15 ++
 .../powerpc/p9-sign_extend-runnable.c | 128 ++
 6 files changed, 168 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.target/powerpc/p9-sign_extend-runnable.c

diff --git a/gcc/config/rs6000/altivec.h b/gcc/config/rs6000/altivec.h
index 06f0d4d9f14..460310a5132 100644
--- a/gcc/config/rs6000/altivec.h
+++ b/gcc/config/rs6000/altivec.h
@@ -497,6 +497,8 @@
 
 #define vec_xlx __builtin_vec_vextulx
 #define vec_xrx __builtin_vec_vexturx
+#define vec_signexti  __builtin_vec_vsignexti
+#define vec_signextll __builtin_vec_vsignextll
 
 #endif
 
diff --git a/gcc/config/rs6000/rs6000-builtin.def 
b/gcc/config/rs6000/rs6000-builtin.def
index 8aa31ad0a06..842f07196de 100644
--- a/gcc/config/rs6000/rs6000-builtin.def
+++ b/gcc/config/rs6000/rs6000-builtin.def
@@ -2800,6 +2800,8 @@ BU_P9V_OVERLOAD_1 (VPRTYBD,   "vprtybd")
 BU_P9V_OVERLOAD_1 (VPRTYBQ,"vprtybq")
 BU_P9V_OVERLOAD_1 (VPRTYBW,"vprtybw")
 BU_P9V_OVERLOAD_1 (VPARITY_LSBB,   "vparity_lsbb")
+BU_P9V_OVERLOAD_1 (VSIGNEXTI,  "vsignexti")
+BU_P9V_OVERLOAD_1 (VSIGNEXTLL, "vsignextll")
 
 /* 2 argument functions added in ISA 3.0 (power9).  */
 BU_P9_2 (CMPRB,"byte_in_range",CONST,  cmprb)
@@ -2811,6 +2813,13 @@ BU_P9_OVERLOAD_2 (CMPRB, "byte_in_range")
 BU_P9_OVERLOAD_2 (CMPRB2,  "byte_in_either_range")
 BU_P9_OVERLOAD_2 (CMPEQB,  "byte_in_set")
 
+
+BU_P9V_AV_1 (VSIGNEXTSB2W, "vsignextsb2w", CONST,  
vsx_sign_extend_qi_v4si)
+BU_P9V_AV_1 (VSIGNEXTSH2W, "vsignextsh2w", CONST,  
vsx_sign_extend_hi_v4si)
+BU_P9V_AV_1 (VSIGNEXTSB2D, "vsignextsb2d", CONST,  
vsx_sign_extend_qi_v2di)
+BU_P9V_AV_1 (VSIGNEXTSH2D, "vsignextsh2d", CONST,  
vsx_sign_extend_hi_v2di)
+BU_P9V_AV_1 (VSIGNEXTSW2D, "vsignextsw2d", CONST,  
vsx_sign_extend_si_v2di)
+
 /* Builtins for scalar instructions added in ISA 3.1 (power10).  */
 BU_P10_POWERPC64_MISC_2 (CFUGED, "cfuged", CONST, cfuged)
 BU_P10_POWERPC64_MISC_2 (CNTLZDM, "cntlzdm", CONST, cntlzdm)
diff --git a/gcc/config/rs6000/rs6000-call.c b/gcc/config/rs6000/rs6000-call.c
index 2308cc8b4a2..3af325317a1 100644
--- a/gcc/config/rs6000/rs6000-call.c
+++ b/gcc/config/rs6000/rs6000-call.c
@@ -5660,6 +5660,19 @@ const struct altivec_builtin_types 
altivec_overloaded_builtins[] = {
 RS6000_BTI_unsigned_V2DI, RS6000_BTI_unsigned_V2DI,
 RS6000_BTI_INTSI, RS6000_BTI_INTSI },
 
+  /* Sign extend builtins that work work on ISA 3.0, not added until ISA 3.1 */
+  { P9V_BUILTIN_VEC_VSIGNEXTI, P9V_BUILTIN_VSIGNEXTSB2W,
+RS6000_BTI_V4SI, RS6000_BTI_V16QI, 0, 0 },
+  { P9V_BUILTIN_VEC_VSIGNEXTI, P9V_BUILTIN_VSIGNEXTSH2W,
+RS6000_BTI_V4SI, RS6000_BTI_V8HI, 0, 0 },
+
+  { P9V_BUILTIN_VEC_VSIGNEXTLL, P9V_BUILTIN_VSIGNEXTSB2D,
+RS6000_BTI_V2DI, RS6000_BTI_V16QI, 0, 0 },
+  { P9V_BUILTIN_VEC_VSIGNEXTLL, P9V_BUILTIN_VSIGNEXTSH2D,
+RS6000_BTI_V2DI, RS6000_BTI_V8HI, 0, 0 },
+  { P9V_BUILTIN_VEC_VSIGNEXTLL, P9V_BUILTIN_VSIGNEXTSW2D,
+RS6000_BTI_V2DI, RS6000_BTI_V4SI, 0, 0 },
+
   /* Overloaded built-in functions for ISA3.1 (power10). */
   { P10_BUILTIN_VEC_CLRL, P10V_BUILTIN_VCLRLB,
 RS6000_BTI_V16QI, RS6000_BTI_V16QI, RS6000_BTI_UINTSI, 0 },
diff --git a/gcc/config/rs6000/vsx.md b/gcc/config/rs6000/vsx.md
index 0c1b

[PATCH 4/6 ver 3] Add TI to TD (128-bit DFP) and TD to TI support

2021-01-19 Thread Carl Love via Gcc-patches
Will, Segher:
 
This patch adds support for converting to/from 128-bit integers and
128-bit decimal floating point formats.

Version 3:

  No functional changes.
  Tested on Power 8BE, Power9, Power10.

Version 2:
  Updated ChangeLog comments.  Fixed up comments in the test program.

  Re-tested the patch on Power 9 with no regression errors.
   
Carl

---

gcc/ChangeLog

2021-01-12  Carl Love  
* config/rs6000/dfp.md (floattitd2, fixtdti2): New define_insns.
* config/rs6000/rs6000-call.c (P10V_BUILTIN_VCMPNET_P,
P10V_BUILTIN_VCMPAET_P): New overloaded definitions.

gcc/testsuite/ChangeLog

2021-01-12  Carl Love  
* gcc.target/powerpc/int_128bit-runnable.c: Add 128-bit DFP
conversion tests.
---
 gcc/config/rs6000/dfp.md  | 14 +
 .../gcc.target/powerpc/int_128bit-runnable.c  | 61 +++
 2 files changed, 75 insertions(+)

diff --git a/gcc/config/rs6000/dfp.md b/gcc/config/rs6000/dfp.md
index c8cdb645865..876ab2ed682 100644
--- a/gcc/config/rs6000/dfp.md
+++ b/gcc/config/rs6000/dfp.md
@@ -222,6 +222,13 @@
   "dcffixq %0,%1"
   [(set_attr "type" "dfp")])
 
+(define_insn "floattitd2"
+  [(set (match_operand:TD 0 "gpc_reg_operand" "=d")
+   (float:TD (match_operand:TI 1 "gpc_reg_operand" "v")))]
+  "TARGET_POWER10"
+  "dcffixqq %0,%1"
+  [(set_attr "type" "dfp")])
+
 ;; Convert a decimal64/128 to a decimal64/128 whose value is an integer.
 ;; This is the first stage of converting it to an integer type.
 
@@ -241,6 +248,13 @@
   "TARGET_DFP"
   "dctfix %0,%1"
   [(set_attr "type" "dfp")])
+
+(define_insn "fixtdti2"
+  [(set (match_operand:TI 0 "gpc_reg_operand" "=v")
+   (fix:TI (match_operand:TD 1 "gpc_reg_operand" "d")))]
+  "TARGET_POWER10"
+  "dctfixqq %0,%1"
+  [(set_attr "type" "dfp")])
 
 ;; Decimal builtin support
 
diff --git a/gcc/testsuite/gcc.target/powerpc/int_128bit-runnable.c 
b/gcc/testsuite/gcc.target/powerpc/int_128bit-runnable.c
index 3f8892b39d6..42cb91c7ba9 100644
--- a/gcc/testsuite/gcc.target/powerpc/int_128bit-runnable.c
+++ b/gcc/testsuite/gcc.target/powerpc/int_128bit-runnable.c
@@ -38,6 +38,7 @@
 #if DEBUG
 #include 
 #include 
+#include 
 
 
 void print_i128(__int128_t val)
@@ -59,6 +60,13 @@ int main ()
   __int128_t arg1, result;
   __uint128_t uarg2;
 
+  _Decimal128 arg1_dfp128, result_dfp128, expected_result_dfp128;
+
+  struct conv_t {
+__uint128_t u128;
+_Decimal128 d128;
+  } conv, conv2;
+
   vector signed long long int vec_arg1_di, vec_arg2_di;
   vector signed long long int vec_result_di, vec_expected_result_di;
   vector unsigned long long int vec_uarg1_di, vec_uarg2_di, vec_uarg3_di;
@@ -2296,6 +2304,59 @@ int main ()
 abort();
 #endif
   }
+  
+  /* DFP to __int128 and __int128 to DFP conversions */
+  /* Print the DFP value as an unsigned int so we can see the bit patterns.  */
+  conv.u128 = 0x2208ULL;
+  conv.u128 = (conv.u128 << 64) | 0x4ULL;   //DFP bit pattern for integer 4
+  expected_result_dfp128 = conv.d128;
 
+  arg1 = 4;
+
+  conv.d128 = (_Decimal128) arg1;
+
+  result_dfp128 = (_Decimal128) arg1;
+  if (((conv.u128 >>64) != 0x2208ULL) &&
+  ((conv.u128 & 0x) != 0x4ULL)) {
+#if DEBUG
+printf("ERROR:  convert int128 value ");
+print_i128 (arg1);
+conv.d128 = result_dfp128;
+printf("\nto DFP value 0x%llx %llx (printed as hex bit string) ",
+  (unsigned long long)((conv.u128) >>64),
+  (unsigned long long)((conv.u128) & 0x));
+
+conv.d128 = expected_result_dfp128;
+printf("\ndoes not match expected_result = 0x%llx %llx\n\n",
+  (unsigned long long) (conv.u128>>64),
+  (unsigned long long) (conv.u128 & 0x));
+#else
+abort();
+#endif
+  }
+
+  expected_result = 4;
+
+  conv.u128 = 0x2208ULL;
+  conv.u128 = (conv.u128 << 64) | 0x4ULL;  // 4 as DFP
+  arg1_dfp128 = conv.d128;
+
+  result = (__int128_t) arg1_dfp128;
+
+  if (result != expected_result) {
+#if DEBUG
+printf("ERROR:  convert DFP value ");
+printf("0x%llx %llx (printed as hex bit string) ",
+  (unsigned long long)(conv.u128>>64),
+  (unsigned long long)(conv.u128 & 0x));
+printf("to __int128 value = ");
+print_i128 (result);
+printf("\ndoes not match expected_result = ");
+print_i128 (expected_result);
+printf("\n");
+#else
+abort();
+#endif
+  }
   return 0;
 }
-- 
2.27.0





[PATCH 3/6 ver 3] RS6000 add 128-bit Integer Operations part 1

2021-01-19 Thread Carl Love via Gcc-patches
Will, Segher:

This patch adds instruction and builtin support for 128-bit integer
divide, modulo, shift, and compare operations.

version 3:

  int_128bit-runnable.c: Removed ppc_native_128bit from
 dg-require-effective-target.  Was missed from 
 an earlier cleanup.
  Tested on Power 8BE, Power9, Power10.
   
version 2:

  Fixed the references to 128-bit in ChangeLog that got missed in the
  last go round.

  Fixed missing spaces in emit_insn calls.

  Re-tested the patch on Power 9 with no regression errors.

Carl Love

--

gcc/ChangeLog

2021-01-12  Carl Love  
* config/rs6000/altivec.h (vec_signextq, vec_dive, vec_mod): Add define
for new builtins.
* config/rs6000/altivec.md (UNSPEC_VMULEUD, UNSPEC_VMULESD,
UNSPEC_VMULOUD, UNSPEC_VMULOSD): New unspecs.
(altivec_eqv1ti, altivec_gtv1ti, altivec_gtuv1ti, altivec_vmuleud,
altivec_vmuloud, altivec_vmulesd, altivec_vmulosd, altivec_vrlq,
altivec_vrlqmi, altivec_vrlqmi_inst, altivec_vrlqnm,
altivec_vrlqnm_inst, altivec_vslq, altivec_vsrq, altivec_vsraq,
altivec_vcmpequt_p, altivec_vcmpgtst_p, altivec_vcmpgtut_p): New
define_insn.
(vec_widen_umult_even_v2di, vec_widen_smult_even_v2di,
vec_widen_umult_odd_v2di, vec_widen_smult_odd_v2di, altivec_vrlqmi,
altivec_vrlqnm): New define_expands.
* config/rs6000/rs6000-builtin.def (VCMPEQUT_P, VCMPGTST_P,
VCMPGTUT_P): Add macro expansions.
(BU_P10V_AV_P): Add builtin predicate definition.
(VCMPGTUT, VCMPGTST, VCMPEQUT, CMPNET, CMPGE_1TI,
CMPGE_U1TI, CMPLE_1TI, CMPLE_U1TI, VNOR_V1TI_UNS, VNOR_V1TI, VCMPNET_P,
VCMPAET_P, VSIGNEXTSD2Q, VMULEUD, VMULESD, VMULOUD, VMULOSD, VRLQ,
VSLQ, VSRQ, VSRAQ, VRLQNM, DIV_V1TI, UDIV_V1TI, DIVES_V1TI, DIVEU_V1TI,
MODS_V1TI, MODU_V1TI, VRLQMI): New macro expansions.
(VRLQ, VSLQ, VSRQ, VSRAQ, DIVE, MOD, SIGNEXT): New overload expansions.
* config/rs6000/rs6000-call.c (P10_BUILTIN_VCMPEQUT,
P10V_BUILTIN_CMPGE_1TI, P10V_BUILTIN_CMPGE_U1TI,
P10V_BUILTIN_VCMPGTUT, P10V_BUILTIN_VCMPGTST,
P10V_BUILTIN_CMPLE_1TI, P10V_BUILTIN_VCMPLE_U1TI,
P10V_BUILTIN_DIV_V1TI, P10V_BUILTIN_UDIV_V1TI,
P10V_BUILTIN_VMULESD, P10V_BUILTIN_VMULEUD,
P10V_BUILTIN_VMULOSD, P10V_BUILTIN_VMULOUD,
P10V_BUILTIN_VNOR_V1TI, P10V_BUILTIN_VNOR_V1TI_UNS,
P10V_BUILTIN_VRLQ, P10V_BUILTIN_VRLQMI,
P10V_BUILTIN_VRLQNM, P10V_BUILTIN_VSLQ,
P10V_BUILTIN_VSRQ, P10V_BUILTIN_VSRAQ,
P10V_BUILTIN_VCMPGTUT_P, P10V_BUILTIN_VCMPGTST_P,
P10V_BUILTIN_VCMPEQUT_P, P10V_BUILTIN_VCMPGTUT_P,
P10V_BUILTIN_VCMPGTST_P, P10V_BUILTIN_CMPNET,
P10V_BUILTIN_VCMPNET_P, P10V_BUILTIN_VCMPAET_P,
P10V_BUILTIN_VSIGNEXTSD2Q, P10V_BUILTIN_DIVES_V1TI,
P10V_BUILTIN_MODS_V1TI, P10V_BUILTIN_MODU_V1TI):
New overloaded definitions.
(rs6000_gimple_fold_builtin) [P10V_BUILTIN_VCMPEQUT,
P10_BUILTIN_CMPNET, P10_BUILTIN_CMPGE_1TI,
P10_BUILTIN_CMPGE_U1TI, P10_BUILTIN_VCMPGTUT,
P10_BUILTIN_VCMPGTST, P10_BUILTIN_CMPLE_1TI,
P10_BUILTIN_CMPLE_U1TI]: New case statements.
(rs6000_init_builtins) [bool_V1TI_type_node, int_ftype_int_v1ti_v1ti]:
New assignments.
(altivec_init_builtins): New E_V1TImode case statement.
(builtin_function_type)[P10_BUILTIN_128BIT_VMULEUD,
P10_BUILTIN_128BIT_VMULOUD, P10_BUILTIN_128BIT_DIVEU_V1TI,
P10_BUILTIN_128BIT_MODU_V1TI, P10_BUILTIN_CMPGE_U1TI,
P10_BUILTIN_VCMPGTUT, P10_BUILTIN_VCMPEQUT]: New case statements.
* config/rs6000/rs6000.c (rs6000_handle_altivec_attribute)[E_TImode,
E_V1TImode]: New case statements.
* config/rs6000/rs6000.h (rs6000_builtin_type_index): New enum
value RS6000_BTI_bool_V1TI.
* config/rs6000/vector.md (vector_gtv1ti,vector_nltv1ti,
vector_gtuv1ti, vector_nltuv1ti, vector_ngtv1ti, vector_ngtuv1ti,
vector_eq_v1ti_p, vector_ne_v1ti_p, vector_ae_v1ti_p,
vector_gt_v1ti_p, vector_gtu_v1ti_p, vrotlv1ti3, vashlv1ti3,
vlshrv1ti3, vashrv1ti3): New define_expands.
* config/rs6000/vsx.md (UNSPEC_VSX_DIVSQ, UNSPEC_VSX_DIVUQ,
UNSPEC_VSX_DIVESQ, UNSPEC_VSX_DIVEUQ, UNSPEC_VSX_MODSQ,
UNSPEC_VSX_MODUQ): New unspecs.
(mulv2di3, vsx_div_v1ti, vsx_udiv_v1ti, vsx_dives_v1ti,
vsx_diveu_v1ti, vsx_mods_v1ti, vsx_modu_v1ti, xxswapd_v1ti,
vsx_sign_extend_v2di_v1ti): New define_insns.
(vcmpnet): New define_expand.
* gcc/doc/extend.texi: Add documentation for the new builtins vec_rl,
vec_rlmi, vec_rlnm, vec_sl, vec_sr, vec_sra, vec_mule, vec_mulo,
vec_div, vec_dive, vec_mod, vec_cmpeq, vec_cmpne, vec_cmpgt, vec_cmplt,
vec_cmp

[PATCH 6/6 ver 3] Conversions between 128-bit integer and floating point values.

2021-01-19 Thread Carl Love via Gcc-patches
Will, Segher:
 
This patch adds support for converting to/from 128-bit integers and
128-bit decimal floating point formats using the new P10 instructions
dcffixqq and dctfixqq.  The new instructions are only used on P10 HW,
otherwise the conversions continue to use the existing SW routines.
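
The hardware/software selection can be pictured with the simplified
sketch below.  It uses a plain function pointer rather than the ifunc
resolvers in float128-ifunc.c, and every name and type in it (fix_sw,
fix_hw, fix_resolve, the double/long stand-ins) is made up for
illustration; the real routines operate on 128-bit types.

```c
#include <stdbool.h>

/* Records which implementation ran, so the choice is observable.  */
enum impl { IMPL_NONE, IMPL_SW, IMPL_HW };
static enum impl last_impl = IMPL_NONE;

typedef long (*fix_fn) (double);

/* Stand-in for the existing software conversion routine.  */
static long
fix_sw (double x)
{
  last_impl = IMPL_SW;
  return (long) x;
}

/* Stand-in for the routine that would use the new P10 instruction.  */
static long
fix_hw (double x)
{
  last_impl = IMPL_HW;
  return (long) x;
}

/* Hypothetical resolver: pick the hardware version only when the
   caller reports that the instructions are available.  */
static fix_fn
fix_resolve (bool have_hw)
{
  return have_hw ? fix_hw : fix_sw;
}
```

Both paths must return the same result; only the implementation
chosen at resolve time differs.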

The files fixkfti-sw.c and fixunskfti-sw.c are renamed versions of
fixkfti.c and fixunskfti.c respectively.  The function names in the
files were updated to match the rename, and some whitespace was fixed.

version 3:  Numerous changes with help/input from Michael Meissner

  Add assembler checks for the 128-bit conversion instructions, see
  configure and configure.ac.

  Add the libgcc resolvers to select sw or hw support for the
  conversions.

  Rename, rewrite the existing conversion files (fixkfti.c,
  fixunskfti.c, floattikf.c, floatuntikf.c) to create the sw
  conversion files.

  Tested on Power 8BE, Power9, Power10.

version 2:

  Fixed a typo in the ChangeLog noted by Will.

  Removed the target ppc_native_128bit from the test case as we no
  longer have the 128-bit flag.

  Re-tested the patch on Power 9 with no regression errors.

Carl Love



gcc/ChangeLog

2021-01-15  Carl Love  
* config/rs6000/rs6000.md (floatti2, floatunsti2,
fix_truncti2, fixuns_truncti2): Add
define_insn for mode IEEE 128.

gcc/testsuite/ChangeLog

2021-01-15  Carl Love  
* gcc.target/powerpc/fp128_conversions.c: New file.
* gcc.target/powerpc/int_128bit-runnable.c(vextsd2ppc_native_128bitq,
vcmpuq, vcmpsq, vcmpequq, vcmpequq., vcmpgtsq, vcmpgtsq.
vcmpgtuq, vcmpgtuq.): Update scan-assembler-times.
(ppc_native_128bit): Remove dg-require-effective-target.

libgcc/ChangeLog
2021-01-15  Carl Love  
* config.host: Add if test and set for
libgcc_cv_powerpc_3_1_float128_hw.
* libgcc/config/rs6000/fixkfti.c: Renamed to fixkfti-sw.c.
Change calls of __fixkfti to __fixkfti_sw.
* libgcc/config/rs6000/fixunskfti.c: Renamed to fixunskfti-sw.c.
Change calls of __fixunskfti to __fixunskfti_sw.
* libgcc/config/rs6000/float128-p10.c (__floattikf_hw,
__floatuntikf_hw, __fixkfti_hw, __fixunskfti_hw): New file.
* libgcc/config/rs6000/float128-ifunc.c (SW_OR_HW_ISA3_1): New macro.
(__floattikf_resolve, __floatuntikf_resolve, __fixkfti_resolve,
__fixunskfti_resolve): Add resolve functions.
(__floattikf, __floatuntikf, __fixkfti, __fixunskfti): New functions.
* libgcc/config/rs6000/float128-sed (floattitf, __floatuntitf,
__fixtfti, __fixunstfti): Add editor commands to change names.
* libgcc/config/rs6000/float128-sed-hw (__floattitf,
__floatuntitf, __fixtfti, __fixunstfti): Add editor commands to
change names.
* libgcc/config/rs6000/floattikf.c: Renamed to floattikf-sw.c.
* libgcc/config/rs6000/floatuntikf.c: Renamed to floatuntikf-sw.c.
* libgcc/config/rs6000/quad-float128.h (__floattikf_sw,
__floatuntikf_sw, __fixkfti_sw, __fixunskfti_sw, __floattikf_hw,
__floatuntikf_hw, __fixkfti_hw, __fixunskfti_hw, __floattikf,
__floatuntikf, __fixkfti, __fixunskfti): New extern declarations.
* libgcc/config/rs6000/t-float128 (floattikf, floatuntikf,
fixkfti, fixunskfti): Remove file names from fp128_ppc_funcs.
(floattikf-sw, floatuntikf-sw, fixkfti-sw, fixunskfti-sw): Add
file names to fp128_ppc_funcs.
* libgcc/config/rs6000/t-float128-hw(fp128_3_1_hw_funcs,
fp128_3_1_hw_src, fp128_3_1_hw_static_obj, fp128_3_1_hw_shared_obj,
fp128_3_1_hw_obj): Add variables for ISA 3.1 support.
* libgcc/config/rs6000/t-float128-p10-hw: New file.
* configure: Update script for isa 3.1 128-bit float support.
* configure.ac: Add check for 128-bit float hardware support.
---
 gcc/config/rs6000/rs6000.md   |  36 +++
 .../gcc.target/powerpc/fp128_conversions.c| 294 ++
 .../gcc.target/powerpc/int_128bit-runnable.c  |  14 +-
 libgcc/config.host|   4 +
 .../config/rs6000/{fixkfti.c => fixkfti-sw.c} |   4 +-
 .../rs6000/{fixunskfti.c => fixunskfti-sw.c}  |   4 +-
 libgcc/config/rs6000/float128-ifunc.c |  44 ++-
 libgcc/config/rs6000/float128-p10.c   |  71 +
 libgcc/config/rs6000/float128-sed |   4 +
 libgcc/config/rs6000/float128-sed-hw  |   4 +
 .../rs6000/{floattikf.c => floattikf-sw.c}|   4 +-
 .../{floatuntikf.c => floatuntikf-sw.c}   |   4 +-
 libgcc/config/rs6000/quad-float128.h  |  17 +-
 libgcc/config/rs6000/t-float128   |  12 +-
 libgcc/config/rs6000/t-float128-hw|  16 +
 libgcc/config/rs6000/t-float128-p10-hw|  24 ++
 libgcc/configure  |  39 ++-
 libgcc/configure.ac   |  25 ++
 18 files changed

[PATCH 0/6 ver3] RS6000 add 128-bit Integer Operations

2021-01-19 Thread Carl Love via Gcc-patches
Segher, Will:

The following patch set adds the 128-bit integer operation support
and fixes a bug found in the existing support.  This is the third
version of the patch set.  The first five patches have minor updates
based on previous reviews.  The last patch has a number of functional
changes to get the 128-bit conversion support to use the new hardware
instructions on Power 10.  The existing software support is used for
Power 9 and earlier platforms.

  Carl Love



Re: [PATCH] c++: ICE when mangling operator name [PR98545]

2021-01-19 Thread Marek Polacek via Gcc-patches
On Tue, Jan 19, 2021 at 03:47:47PM -0500, Jason Merrill via Gcc-patches wrote:
> On 1/13/21 6:39 PM, Marek Polacek wrote:
> > r11-6301 added some asserts in mangle.c, and now we trip over one of
> > them.  In particular, it's the one asserting that we didn't get
> > IDENTIFIER_ANY_OP_P when mangling an expression with a dependent name.
> > 
> > As this testcase shows, it's possible to get that, so turn the assert
> > into an if and write "on".  That changes the mangling in the following
> > way:
> > 
> > With this patch:
> > 
> > $ c++filt _ZN1i1hIJ1adS1_EEEDTcldtdefpTonclspcvT__EEEDpS2_
> > decltype (((*this).(operator()))((a)(), (double)(), (a)())) i::h > a>(a, double, a)
> > 
> > G++10:
> > $ c++filt _ZN1i1hIJ1adS1_EEEDTcldtdefpTclspcvT__EEEDpS2_
> > decltype (((*this).(operator()))((a)(), (double)(), (a)())) i::h > a>(a, double, a)
> > 
> > clang++/icc:
> > $ c++filt _ZN1i1hIJ1adS1_EEEDTclonclspcvT__EEEDpS2_
> > decltype ((operator())((a)(), (double)(), (a)())) i::h(a, double, a)
> > 
> > I'm not sure why we differ in the "(*this)." part
> 
> Is there a PR for that?

I just opened 98756, because I didn't find any.  I can investigate where that
(*this) comes from, though it's not readily clear to me if this is a bug or not.

> > but at least the
> > suffix "onclspcvT__EEEDpS2_" is the same for all three compilers.  So
> > I hope the following fix makes sense.
> > 
> > Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?
> > 
> > gcc/cp/ChangeLog:
> > 
> > PR c++/98545
> > * mangle.c (write_expression): When the expression is a dependent name
> > and an operator name, write "on" before writing its name.
> > 
> > gcc/testsuite/ChangeLog:
> > 
> > PR c++/98545
> > * g++.dg/abi/mangle76.C: New test.
> > ---
> >   gcc/cp/mangle.c |  3 ++-
> >   gcc/testsuite/g++.dg/abi/mangle76.C | 39 +
> >   2 files changed, 41 insertions(+), 1 deletion(-)
> >   create mode 100644 gcc/testsuite/g++.dg/abi/mangle76.C
> > 
> > diff --git a/gcc/cp/mangle.c b/gcc/cp/mangle.c
> > index 11eb8962d28..bb3c4b76d33 100644
> > --- a/gcc/cp/mangle.c
> > +++ b/gcc/cp/mangle.c
> > @@ -3349,7 +3349,8 @@ write_expression (tree expr)
> > else if (dependent_name (expr))
> >   {
> > tree name = dependent_name (expr);
> > -  gcc_assert (!IDENTIFIER_ANY_OP_P (name));
> > +  if (IDENTIFIER_ANY_OP_P (name))
> > +   write_string ("on");
> 
> Any mangling change needs to handle different -fabi-versions; see the
> similar code in write_member_name.

Ah, I only looked at the unguarded IDENTIFIER_ANY_OP_P checks.  But now
I have a possibly stupid question: what version should I check?  We have
Version 11 for which the manual already says "corrects the mangling of
sizeof... expressions and *operator names*", so perhaps I could tag along
and check abi_version_at_least (11).  Or should I check Version 15 and
update the manual?
 
> And why doesn't this go through write_member_name?

We go through write_member_name:

#0  fancy_abort (file=0x2b98ef8 "/home/mpolacek/src/gcc/gcc/cp/mangle.c", line=3352,
    function=0x2b99751 "write_expression") at /home/mpolacek/src/gcc/gcc/diagnostic.c:1884
#1  0x00bee91b in write_expression (expr=)
at /home/mpolacek/src/gcc/gcc/cp/mangle.c:3352
#2  0x00beb3e2 in write_member_name (member=)
at /home/mpolacek/src/gcc/gcc/cp/mangle.c:2892
#3  0x00beee70 in write_expression (expr=)
at /home/mpolacek/src/gcc/gcc/cp/mangle.c:3405
#4  0x00bef1be in write_expression (expr=)
at /home/mpolacek/src/gcc/gcc/cp/mangle.c:3455
#5  0x00be858a in write_type (type=)
at /home/mpolacek/src/gcc/gcc/cp/mangle.c:2343

so in write_member_name MEMBER is a BASELINK so we don't enter the
identifier_p block.

Marek



New German PO file for 'gcc' (version 10.2.0)

2021-01-19 Thread Translation Project Robot
Hello, gentle maintainer.

This is a message from the Translation Project robot.

A revised PO file for textual domain 'gcc' has been submitted
by the German team of translators.  The file is available at:

https://translationproject.org/latest/gcc/de.po

(This file, 'gcc-10.2.0.de.po', has just now been sent to you in
a separate email.)

All other PO files for your package are available in:

https://translationproject.org/latest/gcc/

Please consider including all of these in your next release, whether
official or a pretest.

Whenever you have a new distribution with a new version number ready,
containing a newer POT file, please send the URL of that distribution
tarball to the address below.  The tarball may be just a pretest or a
snapshot, it does not even have to compile.  It is just used by the
translators when they need some extra translation context.

The following HTML page has been updated:

https://translationproject.org/domain/gcc.html

If any question arises, please contact the translation coordinator.

Thank you for all your work,

The Translation Project robot, in the
name of your translation coordinator.




Re: driver: do not check input file existence here [PR 98452]

2021-01-19 Thread Joseph Myers
On Tue, 19 Jan 2021, Nathan Sidwell wrote:

> Joseph,
> I was relying on this patch on the modules branch, but didn't realize the
> implications when merging and thought it was just a cleanup.  I'm not sure why
> the driver wants to check here, rather than leave it to the compiler.  Seems
> optimizing for failure? The only difference I can think is that the diagnostic
> might mention the driver name, rather than say (cc1plus), but that's a
> different problem that I've also reported.

What do the error messages look like, before and after this patch, for the 
various cases?  (Response file missing; file handled by e.g. cc1plus 
missing; file handled by the linker missing.)

The check here dates back to commit 
48fb792a91a6b0850d723dc87bcc18eeab7ac3f5 from 1993, so we don't have any 
further explanation of what motivated it.

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: [PATCH v2] c++: Add support for -std=c++2b

2021-01-19 Thread Jason Merrill via Gcc-patches

On 1/10/21 7:28 PM, Paul Fee via Gcc-patches wrote:

[PATCH v2] c++: Add support for -std=c++2b


Thanks!

This patch was corrupted by word wrap, so it won't apply; if you can't 
suppress word wrap in your mail client, please send the patch as an 
attachment instead.


Also remember to use git gcc-verify before sending the patch.


Derived from the changes that added C++2a support in 2017.
https://gcc.gnu.org/g:026a79f70cf33f836ea5275eda72d4870a3041e5

No C++2b features are added here.
Use of -std=c++2b sets __cplusplus to 202100L.

$ g++ -std=c++2b -dM -E -x c++ - < /dev/null | grep cplusplus
#define __cplusplus 202100L
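
As a rough illustration of how user code could tell the dialects apart by this macro, the sketch below maps a __cplusplus value to a dialect name.  The helper names dialect_for and current_dialect are made up for this example; the only assumption beyond the published macro values is that anything greater than 202002L is treated as the experimental post-C++20 mode, which is what the 202100L value is meant to allow.

```cpp
#include <string>

// Maps a __cplusplus value to a dialect name.  Each branch compares against
// the value published for the previous standard, so draft values that lie
// between two standards are attributed to the newer one.
std::string dialect_for(long cplusplus_value) {
  if (cplusplus_value > 202002L) return "c++2b";
  if (cplusplus_value > 201703L) return "c++20";
  if (cplusplus_value > 201402L) return "c++17";
  if (cplusplus_value > 201103L) return "c++14";
  if (cplusplus_value > 199711L) return "c++11";
  return "c++98";
}

// Dialect the current translation unit is being compiled as.
std::string current_dialect() { return dialect_for(__cplusplus); }
```

Comparing with "greater than 202002L" rather than testing for 202100L exactly keeps such a check valid once the final value for the next standard is assigned.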

Changes since v1 (8th Jan 2021):
* As suggested by Jonathan Wakely:
   __cplusplus set to 202100L rather than 202101L.  Use of a non-existent date
   helps indicate this is not a true standard, yet is a value greater
   than 202002L.
* As suggested by Jakub Jelinek:
   Fixed typos and formatting.
   Added C++23 support to dwarf2out.c, including missing C++20 support
   in highest_c_language.



* Regarding suggestion by Marek Polacek to refer to C++23 rather than C++2b.
   Left the option as -std=c++2b for now.  It may be premature to assume the
   next version of the standard will be named C++23.  Use of c++2b also reinforces
   the experimental nature of GCC's C++23 implementation.


Hmm, I don't think it's that premature; the C++ committee has been very 
serious about time-based releases every three years.  I think it makes 
sense for the advertised flag to be c++2b, but let's also go ahead and 
add the c++23 flags as hidden, and use cxx23 internally.



gcc/

 Add support for -std=c++2b
 * doc/cpp.texi (__cplusplus): Document value for -std=c++2b
 or -std=gnu++2b.
 * doc/invoke.texi: Document -std=c++2b and -std=gnu++2b.

gcc/c-family

 Add support for -std=c++2b
 * c-common.h (cxx_dialect): Add cxx2b as a dialect.
 * c.opt: Add options for -std=c++2b and -std=gnu++2b.
 * c-opts.c (set_std_cxx2b): New.
 (c_common_handle_option): Set options when -std=c++2b is enabled.
 (c_common_post_options): Adjust comments.
 (set_std_cxx20): Likewise.
 * dwarf2out.c (highest_c_language): Recognise C++20 and C++23.
 (gen_compile_unit_die): Recognise C++23.


dwarf2out.c isn't in c-family.


gcc/testsuite

 Add support for -std=c++2b
 * lib/target-supports.exp (check_effective_target_c++2a_only):
 rename to check_effective_target_c++20_only.
 (check_effective_target_c++2a): rename to check_effective_target_c++20.
 (check_effective_target_c++20): Return 1
 if check_effective_target_c++20_only or
 if check_effective_target_c++2b.
 (check_effective_target_c++20_down): New.
 (check_effective_target_c++2a_only): New.
 (check_effective_target_c++2a): New.
 * g++.dg/cpp2b/cplusplus.C: New.

libcpp
 Add support for -std=c++2b
 * include/cpplib.h (c_lang): Add CXX2B and GNUCXX2B.
 * init.c (lang_defaults): Add rows for CXX2B and GNUCXX2B.
 (cpp_init_builtins): Set __cplusplus to 202100L for C++2b.
---
  gcc/c-family/c-common.h|4 ++-
  gcc/c-family/c-opts.c  |   29 ++--
  gcc/c-family/c.opt |8 ++
  gcc/doc/cpp.texi   |7 +++--
  gcc/doc/invoke.texi|   10 
  gcc/dwarf2out.c|7 +
  gcc/testsuite/g++.dg/cpp2b/cplusplus.C |4 +++
  gcc/testsuite/lib/target-supports.exp  |   39 
+++--
  libcpp/include/cpplib.h|3 +-
  libcpp/init.c  |7 +
  10 files changed, 98 insertions(+), 20 deletions(-)

diff --git a/gcc/c-family/c-common.h b/gcc/c-family/c-common.h
index a65c78f7240..f562cdebf4c 100644
--- a/gcc/c-family/c-common.h
+++ b/gcc/c-family/c-common.h
@@ -738,7 +738,9 @@ enum cxx_dialect {
/* C++17 */
cxx17,
/* C++20 */
-  cxx20
+  cxx20,
+  /* C++2b (C++23?) */
+  cxx2b
  };

  /* The C++ dialect being used. C++98 is the default.  */
diff --git a/gcc/c-family/c-opts.c b/gcc/c-family/c-opts.c
index 3cdf41bc6e2..15f120d475d 100644
--- a/gcc/c-family/c-opts.c
+++ b/gcc/c-family/c-opts.c
@@ -113,6 +113,7 @@ static void set_std_cxx11 (int);
  static void set_std_cxx14 (int);
  static void set_std_cxx17 (int);
  static void set_std_cxx20 (int);
+static void set_std_cxx2b (int);
  static void set_std_c89 (int, int);
  static void set_std_c99 (int);
  static void set_std_c11 (int);
@@ -649,6 +650,12 @@ c_common_handle_option (size_t scode, const char
*arg, HOST_WIDE_INT value,
  set_std_cxx20 (code == OPT_std_c__20 /* ISO */);
break;

+case OPT_std_c__2b:
+case OPT_std_gnu__2b:
+  if (!preprocessing_asm_p)
+set_std_cxx2b (code == OPT_std_c__2b /* ISO */);
+  break;
+
  case OPT_std_c90:
  case OPT_std_iso9899_199409:
if (!preprocessing_asm_p)
@@ -1019,7 +1026,7 @@ c_common_post_options (const char **pfilename)
 

[PATCH] rs6000: Fix rs6000_emit_le_vsx_store (PR98549)

2021-01-19 Thread Segher Boessenkool
One of the advantages of LRA is that you can create new pseudos from it
just fine.  The code in rs6000_emit_le_vsx_store was not aware of this.
This patch changes that, in the process fixing PR98549 (where it is
shown that we do call rs6000_emit_le_vsx_store during LRA, which we
used to assert can not happen).

If this regstraps, I'll commit it tomorrow morning.


Segher


2021-01-19  Segher Boessenkool  

* config/rs6000/rs6000.c (rs6000_emit_le_vsx_store): Change assert.
Adjust comment.  Simplify code.
---
 gcc/config/rs6000/rs6000.c | 8 +++-
 1 file changed, 3 insertions(+), 5 deletions(-)

diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index 67681d1..108a527 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -9932,10 +9932,8 @@ rs6000_emit_le_vsx_load (rtx dest, rtx source, 
machine_mode mode)
 void
 rs6000_emit_le_vsx_store (rtx dest, rtx source, machine_mode mode)
 {
-  /* This should never be called during or after LRA, because it does
- not re-permute the source register.  It is intended only for use
- during expand.  */
-  gcc_assert (!lra_in_progress && !reload_completed);
+  /* This should never be called after LRA.  */
+  gcc_assert (can_create_pseudo_p ());
 
   /* Use V2DImode to do swaps of types with 128-bit scalar parts (TImode,
  V1TImode).  */
@@ -9946,7 +9944,7 @@ rs6000_emit_le_vsx_store (rtx dest, rtx source, 
machine_mode mode)
   source = gen_lowpart (V2DImode, source);
 }
 
-  rtx tmp = can_create_pseudo_p () ? gen_reg_rtx_and_attrs (source) : source;
+  rtx tmp = gen_reg_rtx_and_attrs (source);
   rs6000_emit_le_vsx_permute (tmp, source, mode);
   rs6000_emit_le_vsx_permute (dest, tmp, mode);
 }
-- 
1.8.3.1



[PATCH] avoid -Warray-bounds checks for vtable assignments (PR 98266)

2021-01-19 Thread Martin Sebor via Gcc-patches

Similar to the problem reported for -Wstringop-overflow in pr98266
and already fixed, -Warray-bounds is also susceptible to false
positives in assignments and copies involving virtual inheritance.
Because the two warnings don't share code yet (hopefully in GCC 12)
the attached patch adds its own workaround for this problem to
gimple-array-bounds.cc, this one slightly more crude because of
the limited insight the array bounds checking has into the checked
expressions.

Tested on x86_64-linux.

Martin
PR middle-end/98266 - bogus array subscript is partly outside array bounds on virtual inheritance

gcc/ChangeLog:

	PR middle-end/98266
	* gimple-array-bounds.cc (array_bounds_checker::check_array_bounds):
	Avoid checking references involving artificial members.

gcc/testsuite/ChangeLog:

	PR middle-end/98266
	* g++.dg/warn/Warray-bounds-15.C: New test.

diff --git a/gcc/gimple-array-bounds.cc b/gcc/gimple-array-bounds.cc
index 2576556f76b..413deacece4 100644
--- a/gcc/gimple-array-bounds.cc
+++ b/gcc/gimple-array-bounds.cc
@@ -911,8 +911,16 @@ array_bounds_checker::check_array_bounds (tree *tp, int *walk_subtree,
   else if (TREE_CODE (t) == ADDR_EXPR)
 {
   checker->check_addr_expr (location, t);
-  *walk_subtree = FALSE;
+  *walk_subtree = false;
 }
+  else if (TREE_CODE (t) == COMPONENT_REF
+	   && TREE_CODE (TREE_OPERAND (t, 0)) == MEM_REF
+	   && DECL_ARTIFICIAL (TREE_OPERAND (t, 1)))
+/* Hack: Skip MEM_REF checking for artificial members to avoid false
+   positives for C++ classes with virtual bases.  See pr98266 and
+   pr97595.  */
+*walk_subtree = false;
+
   /* Propagate the no-warning bit to the outer expression.  */
   if (warned)
 TREE_NO_WARNING (t) = true;
diff --git a/gcc/testsuite/g++.dg/warn/Warray-bounds-15.C b/gcc/testsuite/g++.dg/warn/Warray-bounds-15.C
new file mode 100644
index 000..eb75527dc3d
--- /dev/null
+++ b/gcc/testsuite/g++.dg/warn/Warray-bounds-15.C
@@ -0,0 +1,30 @@
+/* PR middle-end/98266 - bogus array subscript is partly outside array
+   bounds on virtual inheritance
+   { dg-do compile }
+   { dg-options "-O2 -Wall" } */
+
+#if __cplusplus < 201103L
+#  define noexcept   throw ()
+#endif
+
+struct A
+{
+  virtual ~A () noexcept;
+  const char* s;
+};
+
+struct B: virtual A { };
+struct C: virtual B { };
+struct D: virtual A { };  // { dg-bogus "\\\[-Warray-bounds" }
+
+struct E: virtual B, virtual D
+{
+  E (const char*);
+};
+
+void f (E);
+
+void g ()
+{
+  f (E (""));
+}


[committed] dwarf2out: reset generation count in toplev::finalize [PR98751]

2021-01-19 Thread David Malcolm via Gcc-patches
PR debug/98751 reports an issue in which most of libgccjit's tests
fail in DWARF 5 handling with
  `.Ldebug_loc2' is already defined
asm errors.

The bogus label is being emitted at the 3rd in-process iteration, at:
  31673   ASM_OUTPUT_LABEL (asm_out_file, loc_section_label);
which on the initial iteration emits:

 145   │ .Ldebug_loc0:

on the 2nd iteration:
 145   │ .Ldebug_loc1:

and on the 3rd iteration:
 145   │ .Ldebug_loc2:

which is a duplicate of a label emitted earlier:
 138   │ .section .debug_loclists,"",@progbits
 139   │ .long   .Ldebug_loc3-.Ldebug_loc2
 140   │ .Ldebug_loc2:
 141   │ .value  0x5
 142   │ .byte   0x8
 143   │ .byte   0
 144   │ .long   0
 145   │ .Ldebug_loc2:

The issue seems to be that init_sections_and_labels creates the label
  ASM_GENERATE_INTERNAL_LABEL (loc_section_label, DEBUG_LOC_SECTION_LABEL,
   generation);

where "generation" is a static local to init_sections_and_labels that
increments, and thus eventually hits the duplicate value.

It appears that this value is intended to be either 0 or 1, but in
the libgccjit case the compilation code can be invoked an arbitrary
number of times in-process, and hence can eventually lead to a
label name collision.

This patch adds code to dwarf2out_c_finalize (called by
toplev::finalize in libgccjit) to reset the generation counts,
fixing the issue.
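
The failure mode and the fix can be reproduced in miniature outside GCC.  The sketch below is only an illustration under stated assumptions: the names section_label_generation, compile_once_has_duplicate_label and finalize_compilation are invented here, and the fixed labels numbered 2 and 3 are inferred from the ".Ldebug_loc3-.Ldebug_loc2" line in the assembler excerpt above.

```cpp
#include <set>
#include <string>

// File-scope counter (hypothetical name).  Keeping it outside the function,
// rather than as a static local, is what lets a finalize hook reset it.
static unsigned section_label_generation = 0;

// Simulates the labels emitted by one in-process compilation and reports
// whether any label was emitted twice.
bool compile_once_has_duplicate_label() {
  std::set<std::string> emitted;
  bool duplicate = false;
  // Labels with fixed numbers, assumed from the excerpt above.
  duplicate |= !emitted.insert(".Ldebug_loc2").second;
  duplicate |= !emitted.insert(".Ldebug_loc3").second;
  // Label numbered by the generation counter, which grows on every call.
  std::string label =
      ".Ldebug_loc" + std::to_string(section_label_generation++);
  duplicate |= !emitted.insert(label).second;
  return duplicate;
}

// Counterpart of the reset added to the finalize hook.
void finalize_compilation() { section_label_generation = 0; }
```

Called repeatedly without finalize_compilation, the third call reports a duplicate because the counter reaches 2; calling finalize_compilation between compilations keeps the counter at 0 and the collision never occurs.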

Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu.
Takes jit.sum from:
  FAIL: 70->0 (-70)
  PASS: 5734->11979 (+6245)

Given that Jakub independently came up with a near-identical patch
I've gone ahead and pushed this to master as
r11-6808-gb83604c75fee324cc4767d039178cba2fbbe017e.

gcc/ChangeLog:
PR debug/98751
* dwarf2out.c (output_line_info): Rename static variable
"generation", moving it out of the function to...
(output_line_info_generation): New.
(init_sections_and_labels): Likewise, renaming the variable to...
(init_sections_and_labels_generation): New.
(dwarf2out_c_finalize): Reset the new variables.
---
 gcc/dwarf2out.c | 66 ++---
 1 file changed, 40 insertions(+), 26 deletions(-)

diff --git a/gcc/dwarf2out.c b/gcc/dwarf2out.c
index 8b6890a5097..93e5d15e20a 100644
--- a/gcc/dwarf2out.c
+++ b/gcc/dwarf2out.c
@@ -12709,22 +12709,27 @@ output_one_line_info_table (dw_line_info_table *table)
   dw2_asm_output_data (1, DW_LNE_end_sequence, NULL);
 }
 
+static unsigned int output_line_info_generation;
+
 /* Output the source line number correspondence information.  This
information goes into the .debug_line section.  */
 
 static void
 output_line_info (bool prologue_only)
 {
-  static unsigned int generation;
   char l1[MAX_ARTIFICIAL_LABEL_BYTES], l2[MAX_ARTIFICIAL_LABEL_BYTES];
   char p1[MAX_ARTIFICIAL_LABEL_BYTES], p2[MAX_ARTIFICIAL_LABEL_BYTES];
   bool saw_one = false;
   int opc;
 
-  ASM_GENERATE_INTERNAL_LABEL (l1, LINE_NUMBER_BEGIN_LABEL, generation);
-  ASM_GENERATE_INTERNAL_LABEL (l2, LINE_NUMBER_END_LABEL, generation);
-  ASM_GENERATE_INTERNAL_LABEL (p1, LN_PROLOG_AS_LABEL, generation);
-  ASM_GENERATE_INTERNAL_LABEL (p2, LN_PROLOG_END_LABEL, generation++);
+  ASM_GENERATE_INTERNAL_LABEL (l1, LINE_NUMBER_BEGIN_LABEL,
+  output_line_info_generation);
+  ASM_GENERATE_INTERNAL_LABEL (l2, LINE_NUMBER_END_LABEL,
+  output_line_info_generation);
+  ASM_GENERATE_INTERNAL_LABEL (p1, LN_PROLOG_AS_LABEL,
+  output_line_info_generation);
+  ASM_GENERATE_INTERNAL_LABEL (p2, LN_PROLOG_END_LABEL,
+  output_line_info_generation++);
 
   if (!XCOFF_DEBUGGING_INFO)
 {
@@ -28589,6 +28594,10 @@ output_macinfo (const char *debug_line_label, bool 
early_lto_debug)
   macinfo_label_base += macinfo_label_base_adj;
 }
 
+/* As init_sections_and_labels may get called multiple times, have a
+   generation count for labels.  */
+static unsigned init_sections_and_labels_generation;
+
 /* Initialize the various sections and labels for dwarf output and prefix
them with PREFIX if non-NULL.  Returns the generation (zero based
number of times function was called).  */
@@ -28596,10 +28605,6 @@ output_macinfo (const char *debug_line_label, bool 
early_lto_debug)
 static unsigned
 init_sections_and_labels (bool early_lto_debug)
 {
-  /* As we may get called multiple times have a generation count for
- labels.  */
-  static unsigned generation = 0;
-
   if (early_lto_debug)
 {
   if (!dwarf_split_debug_info)
@@ -28634,7 +28639,7 @@ init_sections_and_labels (bool early_lto_debug)
   SECTION_DEBUG | SECTION_EXCLUDE, NULL);
  ASM_GENERATE_INTERNAL_LABEL (debug_skeleton_abbrev_section_label,
   DEBUG_SKELETON_ABBREV_SECTION_LABEL,
-  generation);
+  init_sec

RE: [EXTERNAL] Re: [PATCH][tree-optimization]Optimize combination of comparisons to dec+compare

2021-01-19 Thread Eugene Rozenfeld via Gcc-patches
Richard,

Can you please commit this patch for me? I don't have write access yet, I'm 
still working on getting copyright assignment/disclaimer signed by my employer.

Thanks,

Eugene

-Original Message-
From: Richard Biener  
Sent: Friday, January 15, 2021 3:55 AM
To: Eugene Rozenfeld 
Cc: gabrav...@gmail.com; ja...@gcc.gnu.org; gcc-patches@gcc.gnu.org
Subject: Re: [EXTERNAL] Re: [PATCH][tree-optimization]Optimize combination of 
comparisons to dec+compare

On Thu, Jan 14, 2021 at 10:04 PM Eugene Rozenfeld 
 wrote:
>
> I got more feedback for the patch from Gabriel Ravier and Jakub Jelinek in 
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96674
>  and re-worked it accordingly.
>
> The changes from the previous patch are:
> 1. Switched the tests to use __attribute__((noipa)) instead of 
> __attribute__((noinline)) .
> 2. Fixed a typo in the pattern comment.
> 3. Added :c for top-level bit_ior expression.
> 4. Added :s for the subexpressions.
> 5. Added a pattern for the negated expression:
> x >= y && y != XXX_MIN --> x > y - 1
> and the corresponding tests.
>
> The new patch is attached.

OK.

Thanks,
Richard.

> Eugene
>
> -Original Message-
> From: Richard Biener 
> Sent: Tuesday, January 5, 2021 4:21 AM
> To: Eugene Rozenfeld 
> Cc: gcc-patches@gcc.gnu.org
> Subject: [EXTERNAL] Re: [PATCH][tree-optimization]Optimize combination 
> of comparisons to dec+compare
>
> On Mon, Jan 4, 2021 at 9:50 PM Eugene Rozenfeld 
>  wrote:
> >
> > Ping.
> >
> > -Original Message-
> > From: Eugene Rozenfeld
> > Sent: Tuesday, December 22, 2020 3:01 PM
> > To: Richard Biener ; 
> > gcc-patches@gcc.gnu.org
> > Subject: RE: Optimize combination of comparisons to dec+compare
> >
> > Re-sending my question and re-attaching the patch.
> >
> > Richard, can you please clarify your feedback?
>
> Hmm, OK.
>
> The patch is OK.
>
> Thanks,
> Richard.
>
>
> > Thanks,
> >
> > Eugene
> >
> > -Original Message-
> > From: Gcc-patches  On Behalf Of 
> > Eugene Rozenfeld via Gcc-patches
> > Sent: Tuesday, December 15, 2020 2:06 PM
> > To: Richard Biener 
> > Cc: gcc-patches@gcc.gnu.org
> > Subject: [EXTERNAL] Re: Optimize combination of comparisons to
> > dec+compare
> >
> > Richard,
> >
> > > Do we already handle x < y || x <= CST to x <= y - CST?
> >
> > That is an invalid transformation: e.g., consider x=3, y=4, CST=2.
> > Can you please clarify?
> >
> > Thanks,
> >
> > Eugene
> >
> > -Original Message-
> > From: Richard Biener 
> > Sent: Thursday, December 10, 2020 12:21 AM
> > To: Eugene Rozenfeld 
> > Cc: gcc-patches@gcc.gnu.org
> > Subject: Re: Optimize combination of comparisons to dec+compare
> >
> > On Thu, Dec 10, 2020 at 1:52 AM Eugene Rozenfeld via Gcc-patches 
> >  wrote:
> > >
> > > > This patch adds a pattern for optimizing x < y || y == XXX_MIN to
> > > > x <= y - 1 if y is an integer with TYPE_OVERFLOW_WRAPS.
> >
> > Do we already handle x < y || x <= CST to x <= y - CST?
> > That is, the XXX_MIN case is just a special-case of generic anti-range 
> > testing?  For anti-range testing with signed types we pun to unsigned when 
> > possible.
> >
> > > This fixes pr96674.
> > >
> > > Tested on x86_64-pc-linux-gnu.
> > >
> > > For this function
> > >
> > > bool f(unsigned a, unsigned b)
> > > {
> > > return (b == 0) | (a < b);
> > > }
> > >
> > > the code without the patch is
> > >
> > > test   esi,esi
> > > sete   al
> > > cmp    esi,edi
> > > seta   dl
> > > or     eax,edx
> > > ret
> > >
> > > the code with the patch is
> > >
> > > sub    esi,0x1
> > > cmp    esi,edi
> > > setae  al
> > > ret
> > >
> > > Eugene
> > >
> > > gcc/
> > > PR tree-optimization/96674
> > > * match.pd: New pattern x < y || y == XXX_MIN --> x <= y - 1
> > >
> > > gcc/testsuite
> > > * gcc.dg/pr96674.c: New test.
> > >
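
The two claims made in this thread can be sanity-checked outside the compiler.  The sketch below is not part of the patch: it reuses the function f from the patch description, split into original_form and rewritten_form, and adds the general CST form that was questioned; all function names are chosen here for illustration.  It relies only on unsigned arithmetic wrapping, which C++ guarantees.

```cpp
#include <climits>

// Original form from the patch description: (b == 0) | (a < b).
bool original_form(unsigned a, unsigned b) { return (b == 0) | (a < b); }

// Rewritten form: a <= b - 1, where b - 1 wraps to UINT_MAX when b == 0.
bool rewritten_form(unsigned a, unsigned b) { return a <= b - 1u; }

// Compares both forms on a set of boundary values.
bool forms_agree_on_samples() {
  const unsigned samples[] = {0u, 1u, 2u, 3u, 4u, UINT_MAX - 1u, UINT_MAX};
  for (unsigned a : samples)
    for (unsigned b : samples)
      if (original_form(a, b) != rewritten_form(a, b))
        return false;
  return true;
}

// The general form asked about in the thread: x < y || x <= CST ...
bool general_before(unsigned x, unsigned y, unsigned cst) {
  return x < y || x <= cst;
}

// ... and its proposed replacement x <= y - CST.
bool general_after(unsigned x, unsigned y, unsigned cst) {
  return x <= y - cst;
}
```

For x=3, y=4, CST=2 the first general form is true (3 < 4) while the second is false (3 <= 2 does not hold), which is the counterexample given in the thread; the special case with the minimum value agrees on all sampled inputs.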


Re: [PATCH] [PR rtl/optimization/98694] Fix incorrect optimization by cprop_hardreg.

2021-01-19 Thread Hongtao Liu via Gcc-patches
On Wed, Jan 20, 2021 at 12:10 AM Richard Sandiford
 wrote:
>
> Jakub Jelinek via Gcc-patches  writes:
> > On Tue, Jan 19, 2021 at 12:38:47PM +, Richard Sandiford via Gcc-patches 
> > wrote:
> >> > actually only the lower 16bits are needed, the original insn is like
> >> >
> >> > .294.r.ira
> >> > (insn 69 68 70 13 (set (reg:HI 96 [ _52 ])
> >> > (subreg:HI (reg:DI 82 [ var_6.0_1 ]) 0)) "test.c":21:23 76
> >> > {*movhi_internal}
> >> >  (nil))
> >> > (insn 78 75 82 13 (set (reg:V4HI 140 [ _283 ])
> >> > (vec_duplicate:V4HI (truncate:HI (subreg:SI (reg:HI 96 [ _52
> >> > ]) 0 1412 {*vec_dupv4hi}
> >> >  (nil))
> >> >
> >> > .295r.reload
> >> > (insn 69 68 70 13 (set (reg:HI 5 di [orig:96 _52 ] [96])
> >> > (reg:HI 68 k0 [orig:82 var_6.0_1 ] [82])) "test.c":21:23 76
> >> > {*movhi_internal}
> >> >  (nil))
> >> > (insn 489 75 78 13 (set (reg:SI 22 xmm2 [297])
> >> > (reg:SI 5 di [orig:96 _52 ] [96])) 75 {*movsi_internal}
> >> >  (nil))
> >> > (insn 78 489 490 13 (set (reg:V4HI 20 xmm0 [orig:140 _283 ] [140])
> >> > (vec_duplicate:V4HI (truncate:HI (reg:SI 22 xmm2 [297]
> >> > 1412 {*vec_dupv4hi}
> >> >  (nil))
> >> >
> >> > and insn 489 is created by lra/reload which seems ok for the sequence,
> >> > but problematic when considering the logic of hardreg_cprop.
> >>
> >> It looks OK even with the regcprop behaviour though:
> >>
> >> - insn 69 defines only the low 16 bits of di,
> >> - insn 489 defines only the low 16 bits of xmm2, but copies bits 16-31
> >>   too (with unknown contents)
> >> - insn 78 uses only the low 16 bits of xmm2 (the unknown contents
> >>   introduced by insn 489 are truncated away)
> >>
> >> So where do bits 16-31 become significant?  What goes wrong if they're
> >> not zero?
> >
> > The k0 register is initialized I believe with
> > (insn 20 2 21 2 (set (reg:DI 68 k0 [orig:82 var_6.0_1 ] [82])
> > (mem/c:DI (symbol_ref:DI ("var_6") [flags 0x40]   > 0x7f7babeaaf30 var_6>) [3 var_6+0 S8 A64])) "pr98694.C":21:10 74 
> > {*movdi_internal}
> >  (nil))
> > and so it contains all 64-bits, and then the code sometimes uses all the
> > bits, sometimes just the low 16-bits and sometimes low 32-bits of that
> > value.
> > (insn 69 68 70 12 (set (reg:HI 5 di [orig:96 _52 ] [96])
> > (reg:HI 68 k0 [orig:82 var_6.0_1 ] [82])) "pr98694.C":27:23 76 
> > {*movhi_internal}
> >  (nil))
> > (insn 74 73 75 12 (set (reg:SI 36 r8 [orig:149 _52 ] [149])
> > (zero_extend:SI (reg:HI 68 k0 [orig:82 var_6.0_1 ] [82]))) 144 
> > {*zero_extendhisi2}
> >  (nil))
> > (insn 489 75 78 12 (set (reg:SI 22 xmm2 [297])
> > (reg:SI 5 di [orig:96 _52 ] [96])) 75 {*movsi_internal}
> >  (nil))
> > (insn 78 489 490 12 (set (reg:V4HI 20 xmm0 [orig:140 _283 ] [140])
> > (vec_duplicate:V4HI (truncate:HI (reg:SI 22 xmm2 [297] 1412 
> > {*vec_dupv4hi}
> >  (expr_list:REG_DEAD (reg:SI 22 xmm2 [297])
> > (nil)))
> > are examples when it uses only the low 16 bits from that, and
> > (insn 487 72 73 12 (set (reg:SI 1 dx [148])
> > (reg:SI 68 k0 [orig:82 var_6.0_1 ] [82])) 75 {*movsi_internal}
> >  (nil))
> >
> > (insn 85 84 491 13 (set (reg:SI 37 r9 [orig:86 _11 ] [86])
> > (reg:SI 68 k0 [orig:82 var_6.0_1 ] [82])) "pr98694.C":28:14 75 
> > {*movsi_internal}
> >  (nil))
> >
> > (insn 491 85 88 13 (set (reg:SI 3 bx [299])
> > (reg:SI 68 k0 [orig:82 var_6.0_1 ] [82])) 75 {*movsi_internal}
> >  (nil))
> > (insn 88 491 89 13 (set (reg:CCNO 17 flags)
> > (compare:CCNO (reg:SI 3 bx [299])
> > (const_int 0 [0]))) 7 {*cmpsi_ccno_1}
> >  (expr_list:REG_DEAD (reg:SI 3 bx [299])
> > (nil)))
> >
> > (insn 457 499 460 33 (set (reg:SI 39 r11 [orig:86 _11 ] [86])
> > (reg:SI 37 r9 [orig:86 _11 ] [86])) "pr98694.C":35:36 75 
> > {*movsi_internal}
> >  (expr_list:REG_DEAD (reg:SI 37 r9 [orig:86 _11 ] [86])
> > (nil)))
> > are examples where it uses low 32-bits from k0.
> > So the
> >  (insn 457 499 460 33 (set (reg:SI 39 r11 [orig:86 _11 ] [86])
> > -(reg:SI 37 r9 [orig:86 _11 ] [86])) "pr98694.C":35:36 75 
> > {*movsi_internal}
> > - (expr_list:REG_DEAD (reg:SI 37 r9 [orig:86 _11 ] [86])
> > +(reg:SI 22 xmm2 [orig:86 _11 ] [86])) "pr98694.C":35:36 75 
> > {*movsi_internal}
> > + (expr_list:REG_DEAD (reg:SI 22 xmm2 [orig:86 _11 ] [86])
> >  (nil)))
> > cprop_hardreg change indeed looks bogus, while xmm2 has SImode, it holds
> > only the low 16-bits of the value and has the upper bits undefined, while r9
> > it is replacing had all of the low 32-bits well defined.
>
> Ah, ok, thanks for the extra context.
>
> So AIUI the problem when recording xmm2<-di isn't just:
>
>  [A] partial_subreg_p (vd->e[sr].mode, GET_MODE (src))
>
> but also that:
>
>  [B] partial_subreg_p (vd->e[sr].mode, vd->e[vd->e[sr].oldest_regno].mode)
>
> For example, all registers in this sequence can be part of the same chain:
>

Re: [PATCH] [PR rtl/optimization/98694] Fix incorrect optimization by cprop_hardreg.

2021-01-19 Thread Hongtao Liu via Gcc-patches
On Wed, Jan 20, 2021 at 12:35 PM Hongtao Liu  wrote:
>
> On Wed, Jan 20, 2021 at 12:10 AM Richard Sandiford
>  wrote:
> >
> > Jakub Jelinek via Gcc-patches  writes:
> > > On Tue, Jan 19, 2021 at 12:38:47PM +, Richard Sandiford via 
> > > Gcc-patches wrote:
> > >> > actually only the lower 16bits are needed, the original insn is like
> > >> >
> > >> > .294.r.ira
> > >> > (insn 69 68 70 13 (set (reg:HI 96 [ _52 ])
> > >> > (subreg:HI (reg:DI 82 [ var_6.0_1 ]) 0)) "test.c":21:23 76
> > >> > {*movhi_internal}
> > >> >  (nil))
> > >> > (insn 78 75 82 13 (set (reg:V4HI 140 [ _283 ])
> > >> > (vec_duplicate:V4HI (truncate:HI (subreg:SI (reg:HI 96 [ _52
> > >> > ]) 0 1412 {*vec_dupv4hi}
> > >> >  (nil))
> > >> >
> > >> > .295r.reload
> > >> > (insn 69 68 70 13 (set (reg:HI 5 di [orig:96 _52 ] [96])
> > >> > (reg:HI 68 k0 [orig:82 var_6.0_1 ] [82])) "test.c":21:23 76
> > >> > {*movhi_internal}
> > >> >  (nil))
> > >> > (insn 489 75 78 13 (set (reg:SI 22 xmm2 [297])
> > >> > (reg:SI 5 di [orig:96 _52 ] [96])) 75 {*movsi_internal}
> > >> >  (nil))
> > >> > (insn 78 489 490 13 (set (reg:V4HI 20 xmm0 [orig:140 _283 ] [140])
> > >> > (vec_duplicate:V4HI (truncate:HI (reg:SI 22 xmm2 [297]
> > >> > 1412 {*vec_dupv4hi}
> > >> >  (nil))
> > >> >
> > >> > and insn 489 is created by lra/reload which seems ok for the sequence,
> > >> > but problematic when considering the logic of hardreg_cprop.
> > >>
> > >> It looks OK even with the regcprop behaviour though:
> > >>
> > >> - insn 69 defines only the low 16 bits of di,
> > >> - insn 489 defines only the low 16 bits of xmm2, but copies bits 16-31
> > >>   too (with unknown contents)
> > >> - insn 78 uses only the low 16 bits of xmm2 (the unknown contents
> > >>   introduced by insn 489 are truncated away)
> > >>
> > >> So where do bits 16-31 become significant?  What goes wrong if they're
> > >> not zero?
> > >
> > > The k0 register is initialized I believe with
> > > (insn 20 2 21 2 (set (reg:DI 68 k0 [orig:82 var_6.0_1 ] [82])
> > > (mem/c:DI (symbol_ref:DI ("var_6") [flags 0x40]   > > 0x7f7babeaaf30 var_6>) [3 var_6+0 S8 A64])) "pr98694.C":21:10 74 
> > > {*movdi_internal}
> > >  (nil))
> > > and so it contains all 64-bits, and then the code sometimes uses all the
> > > bits, sometimes just the low 16-bits and sometimes low 32-bits of that
> > > value.
> > > (insn 69 68 70 12 (set (reg:HI 5 di [orig:96 _52 ] [96])
> > > (reg:HI 68 k0 [orig:82 var_6.0_1 ] [82])) "pr98694.C":27:23 76 
> > > {*movhi_internal}
> > >  (nil))
> > > (insn 74 73 75 12 (set (reg:SI 36 r8 [orig:149 _52 ] [149])
> > > (zero_extend:SI (reg:HI 68 k0 [orig:82 var_6.0_1 ] [82]))) 144 
> > > {*zero_extendhisi2}
> > >  (nil))
> > > (insn 489 75 78 12 (set (reg:SI 22 xmm2 [297])
> > > (reg:SI 5 di [orig:96 _52 ] [96])) 75 {*movsi_internal}
> > >  (nil))
> > > (insn 78 489 490 12 (set (reg:V4HI 20 xmm0 [orig:140 _283 ] [140])
> > > (vec_duplicate:V4HI (truncate:HI (reg:SI 22 xmm2 [297] 1412 
> > > {*vec_dupv4hi}
> > >  (expr_list:REG_DEAD (reg:SI 22 xmm2 [297])
> > > (nil)))
> > > are examples when it uses only the low 16 bits from that, and
> > > (insn 487 72 73 12 (set (reg:SI 1 dx [148])
> > > (reg:SI 68 k0 [orig:82 var_6.0_1 ] [82])) 75 {*movsi_internal}
> > >  (nil))
> > >
> > > (insn 85 84 491 13 (set (reg:SI 37 r9 [orig:86 _11 ] [86])
> > > (reg:SI 68 k0 [orig:82 var_6.0_1 ] [82])) "pr98694.C":28:14 75 
> > > {*movsi_internal}
> > >  (nil))
> > >
> > > (insn 491 85 88 13 (set (reg:SI 3 bx [299])
> > > (reg:SI 68 k0 [orig:82 var_6.0_1 ] [82])) 75 {*movsi_internal}
> > >  (nil))
> > > (insn 88 491 89 13 (set (reg:CCNO 17 flags)
> > > (compare:CCNO (reg:SI 3 bx [299])
> > > (const_int 0 [0]))) 7 {*cmpsi_ccno_1}
> > >  (expr_list:REG_DEAD (reg:SI 3 bx [299])
> > > (nil)))
> > >
> > > (insn 457 499 460 33 (set (reg:SI 39 r11 [orig:86 _11 ] [86])
> > > (reg:SI 37 r9 [orig:86 _11 ] [86])) "pr98694.C":35:36 75 
> > > {*movsi_internal}
> > >  (expr_list:REG_DEAD (reg:SI 37 r9 [orig:86 _11 ] [86])
> > > (nil)))
> > > are examples where it uses low 32-bits from k0.
> > > So the
> > >  (insn 457 499 460 33 (set (reg:SI 39 r11 [orig:86 _11 ] [86])
> > > -(reg:SI 37 r9 [orig:86 _11 ] [86])) "pr98694.C":35:36 75 
> > > {*movsi_internal}
> > > - (expr_list:REG_DEAD (reg:SI 37 r9 [orig:86 _11 ] [86])
> > > +(reg:SI 22 xmm2 [orig:86 _11 ] [86])) "pr98694.C":35:36 75 
> > > {*movsi_internal}
> > > + (expr_list:REG_DEAD (reg:SI 22 xmm2 [orig:86 _11 ] [86])
> > >  (nil)))
> > > cprop_hardreg change indeed looks bogus, while xmm2 has SImode, it holds
> > > only the low 16-bits of the value and has the upper bits undefined, while 
> > > r9
> > > it is replacing had all of the low 32-bits well defined.
> >
> > Ah, ok, thanks for the extra context.
> >
> > So AIUI th

Re: [PATCH] c++, v2: Fix up potential_constant_expression_1 FOR/WHILE_STMT handling [PR98672]

2021-01-19 Thread Jakub Jelinek via Gcc-patches
On Tue, Jan 19, 2021 at 10:54:08PM +0100, Jakub Jelinek via Gcc-patches wrote:
> So like this then if it passes bootstrap/regtest?

Successfully bootstrapped/regtested on x86_64-linux and i686-linux.
> 
> 2021-01-19  Jakub Jelinek  
> 
>   PR c++/98672
>   * constexpr.c (check_for_return_continue_data): Add break_stmt member.
>   (check_for_return_continue): Also look for BREAK_STMT.  Handle 
> SWITCH_STMT
>   by ignoring break_stmt from its body.
>   (potential_constant_expression_1) ,
>   : If the condition isn't constant true, check if
>   the loop body can contain a return stmt.
>   : Adjust check_for_return_continue_data initializer.
>   : If recursion with tf_none is successful, merge
>   *jump_target from the branches - returns with highest priority, breaks
>   or continues lower.  If then branch is potentially constant and
>   doesn't return, check the else branch if it could return, break or
>   continue.
> 
>   * g++.dg/cpp1y/constexpr-98672.C: New test.

Jakub



Re: [Patch] OpenMP/Fortran: Fixes for {use,is}_device_ptr [PR98476]

2021-01-19 Thread Tobias Burnus

On 18.01.21 17:56, Tobias Burnus wrote:


While testing, it turned out that 'is_device_ptr(aa,bb,cc,dd)' was accepted
because only the first list item was checked, which later gave an ICE in the ME.

... and as PR fortran/98757 showed, I forgot a 'git add' before committing
the associated testcase, given that I already had locally the following
patch ...

Committed as Rev. r11-6809-gc05cdfb3f6335d55226cef7917a783498aa41244

Tobias

commit c05cdfb3f6335d55226cef7917a783498aa41244
Author: Tobias Burnus 
Date:   Wed Jan 20 08:31:30 2021 +0100

OpenMP/Fortran: Fix gfortran.dg/gomp/is_device_ptr-2.f90

gcc/testsuite/ChangeLog:

PR fortran/98757
PR fortran/98476
* gfortran.dg/gomp/is_device_ptr-2.f90: Fix dg-error.

diff --git a/gcc/testsuite/gfortran.dg/gomp/is_device_ptr-2.f90 b/gcc/testsuite/gfortran.dg/gomp/is_device_ptr-2.f90
index bf498208aa8..7adc6f6e8e1 100644
--- a/gcc/testsuite/gfortran.dg/gomp/is_device_ptr-2.f90
+++ b/gcc/testsuite/gfortran.dg/gomp/is_device_ptr-2.f90
@@ -8,7 +8,7 @@ subroutine abc(cc)
 !$omp target enter data map(to: cc, dd)
 
 !$omp target data use_device_addr(cc) use_device_ptr(dd)
-  !$omp target is_device_ptr(cc, dd)  ! { dg-error "Non-dummy object 'cc' in IS_DEVICE_PTR clause at" }
+  !$omp target is_device_ptr(cc, dd)  ! { dg-error "Non-dummy object 'dd' in IS_DEVICE_PTR clause at" }
 if (cc /= 131 .or. dd /= 484) stop 1
 cc = 44
 dd = 45


[committed] openmp: Don't ICE on detach clause with erroneous decl [PR98742]

2021-01-19 Thread Jakub Jelinek via Gcc-patches
Hi!

Similarly to how we handle erroneous operands to e.g. allocate clause,
this change just removes those clauses instead of accessing TYPE_MAIN_VARIANT
of its type, which doesn't work on error_mark_node.  Also, just for good
measure, bails out if TYPE_NAME is NULL.
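
As a rough illustration of the ordering of checks described above, here is a minimal C sketch. It does not use GCC's tree API: struct toy_type, struct toy_operand, enum clause_action and check_detach_operand are hypothetical stand-ins invented for this example. The two distinct "remove" outcomes are also an assumption made to keep the branches visible. The point is only that the erroneous operand is rejected before its type is touched, and that the type name is tested for NULL before it is compared.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for GCC tree nodes; not the real structures.  */
struct toy_type
{
  bool is_enum;        /* models an integral type that is an ENUMERAL_TYPE */
  const char *name;    /* models TYPE_NAME; may be NULL */
};

struct toy_operand
{
  bool is_error;                 /* models error_operand_p (t) */
  const struct toy_type *type;   /* not meaningful when is_error is set */
};

enum clause_action
{
  CLAUSE_KEEP,
  CLAUSE_REMOVE_SILENTLY,     /* operand was already diagnosed */
  CLAUSE_REMOVE_WITH_ERROR    /* operand has the wrong type */
};

enum clause_action
check_detach_operand (const struct toy_operand *op)
{
  assert (op != NULL);

  /* An erroneous operand is dropped without looking at its type.  */
  if (op->is_error)
    return CLAUSE_REMOVE_SILENTLY;

  const struct toy_type *type = op->type;
  if (!type->is_enum
      || type->name == NULL   /* test for NULL before using the name */
      || strcmp (type->name, "omp_event_handle_t") != 0)
    return CLAUSE_REMOVE_WITH_ERROR;

  return CLAUSE_KEEP;
}
```

In this model, an operand flagged as erroneous never reaches the type checks, and an unnamed enum type is rejected instead of being dereferenced.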

Bootstrapped/regtested on x86_64-linux and i686-linux, committed to trunk.

2021-01-20  Jakub Jelinek  

PR c++/98742
* semantics.c (finish_omp_clauses) : If
error_operand_p, remove clause without further checking.  Check
for non-NULL TYPE_NAME.

* c-c++-common/gomp/task-detach-2.c: New test.

--- gcc/cp/semantics.c.jj   2021-01-16 22:52:33.608413922 +0100
+++ gcc/cp/semantics.c  2021-01-19 10:53:07.979801786 +0100
@@ -7430,12 +7430,18 @@ finish_omp_clauses (tree clauses, enum c
  remove = true;
  break;
}
+ else if (error_operand_p (t))
+   {
+ remove = true;
+ break;
+   }
  else
{
  tree type = TYPE_MAIN_VARIANT (TREE_TYPE (t));
  if (!type_dependent_expression_p (t)
  && (!INTEGRAL_TYPE_P (type)
  || TREE_CODE (type) != ENUMERAL_TYPE
+ || TYPE_NAME (type) == NULL_TREE
  || (DECL_NAME (TYPE_NAME (type))
  != get_identifier ("omp_event_handle_t"))))
{
--- gcc/testsuite/c-c++-common/gomp/task-detach-2.c.jj  2021-01-19 11:07:29.345948289 +0100
+++ gcc/testsuite/c-c++-common/gomp/task-detach-2.c 2021-01-19 11:06:57.090317518 +0100
@@ -0,0 +1,9 @@
+/* PR c++/98742 */
+/* { dg-do compile } */
+
+void
+foo ()
+{
+#pragma omp task detach(0) /* { dg-error "before numeric constant" } */
+  ;
+}


Jakub



[r11-6787 Regression] FAIL: gfortran.dg/gomp/is_device_ptr-2.f90 -O (test for excess errors) on Linux/x86_64

2021-01-19 Thread sunil.k.pandey via Gcc-patches
On Linux/x86_64,

049bfd186fae9fb764a3ec04acb20d3eaacda7a3 is the first bad commit
commit 049bfd186fae9fb764a3ec04acb20d3eaacda7a3
Author: Tobias Burnus 
Date:   Tue Jan 19 11:57:34 2021 +0100

OpenMP/Fortran: Fixes for {use,is}_device_ptr

caused

FAIL: gfortran.dg/gomp/is_device_ptr-2.f90   -O   (test for errors, line 11)
FAIL: gfortran.dg/gomp/is_device_ptr-2.f90   -O  (test for excess errors)

with GCC configured with

../../gcc/configure \
  --prefix=/local/skpandey/gccwork/toolwork/gcc-bisect-master/master/r11-6787/usr \
  --enable-clocale=gnu --with-system-zlib --with-demangler-in-ld \
  --with-fpmath=sse --enable-languages=c,c++,fortran --enable-cet --without-isl \
  --enable-libmpx x86_64-linux --disable-bootstrap

To reproduce:

$ cd {build_dir}/gcc && make check RUNTESTFLAGS="gomp.exp=gfortran.dg/gomp/is_device_ptr-2.f90 --target_board='unix{-m32}'"
$ cd {build_dir}/gcc && make check RUNTESTFLAGS="gomp.exp=gfortran.dg/gomp/is_device_ptr-2.f90 --target_board='unix{-m32\ -march=cascadelake}'"
$ cd {build_dir}/gcc && make check RUNTESTFLAGS="gomp.exp=gfortran.dg/gomp/is_device_ptr-2.f90 --target_board='unix{-m64}'"
$ cd {build_dir}/gcc && make check RUNTESTFLAGS="gomp.exp=gfortran.dg/gomp/is_device_ptr-2.f90 --target_board='unix{-m64\ -march=cascadelake}'"

(Please do not reply to this email. For questions about this report, contact me
at skpgkp2 at gmail dot com.)


[PATCH] builtins: Fix up two bugs in access_ref::inform_access [PR98721]

2021-01-19 Thread Jakub Jelinek via Gcc-patches
Hi!

The following patch fixes two bugs in the access_ref::inform_access function
(plus some formatting nits).

The first problem is that ref can be various things, e.g. *_DECL, or
SSA_NAME, or IDENTIFIER_NODE.  And allocfn is non-NULL only if ref is
(at least originally) an SSA_NAME initialized to the result of some
allocator function (but not e.g. __builtin_alloca_with_align which is
handled differently).

A few lines above the last hunk of this patch in builtins.c, the code uses
  if (mode == access_read_write || mode == access_write_only)
    {
      if (allocfn == NULL_TREE)
        {
          if (*offstr)
            inform (loc, "at offset %s into destination object %qE of size %s",
                    offstr, ref, sizestr);
          else
            inform (loc, "destination object %qE of size %s", ref, sizestr);
          return;
        }

      if (*offstr)
        inform (loc,
                "at offset %s into destination object of size %s "
                "allocated by %qE", offstr, sizestr, allocfn);
      else
        inform (loc, "destination object of size %s allocated by %qE",
                sizestr, allocfn);
      return;
    }
So if allocfn is NULL, it prints whatever ref is; if it is non-NULL, it
prints the allocation function instead.  But strangely the hunk a few lines
below wasn't consistent with that: it printed the first form only if
DECL_P (ref), and would ICE if ref wasn't a decl but allocfn was still NULL.
Fixed by making it consistent with what the code does earlier.
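
The selection rule can be shown outside the compiler with a small C sketch. This is only an illustration under stated assumptions: describe_object and its role parameter are hypothetical, plain strings stand in for trees, snprintf stands in for inform, and single quotes stand in for the %qE quoting. The message wording follows the format strings quoted in this mail. What matters is that the choice of message depends solely on whether allocfn is NULL, never on what kind of object ref is.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Format a note about an accessed object into BUF.  ROLE is "source" or
   "destination".  ALLOCFN is NULL when the object was not produced by an
   allocation function; in that case REF is printed, whatever it names.
   Returns the value of snprintf.  */
int
describe_object (char *buf, size_t bufsize, const char *role,
                 const char *offstr, const char *ref,
                 const char *sizestr, const char *allocfn)
{
  assert (buf != NULL && role != NULL && offstr != NULL && sizestr != NULL);

  if (allocfn == NULL)
    {
      assert (ref != NULL);
      if (*offstr)
        return snprintf (buf, bufsize,
                         "at offset %s into %s object '%s' of size %s",
                         offstr, role, ref, sizestr);
      return snprintf (buf, bufsize, "%s object '%s' of size %s",
                       role, ref, sizestr);
    }

  /* An allocator is known: name it instead of the object.  */
  if (*offstr)
    return snprintf (buf, bufsize,
                     "at offset %s into %s object of size %s "
                     "allocated by '%s'", offstr, role, sizestr, allocfn);
  return snprintf (buf, bufsize, "%s object of size %s allocated by '%s'",
                   role, sizestr, allocfn);
}
```

Because both the read and the write paths would go through the same test in this sketch, there is no case where allocfn is NULL and no message form applies.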

Another bug is that the code earlier contains an ugly hack for VLAs and was
assuming that SSA_NAME_IDENTIFIER must be non-NULL on the lhs of
__builtin_alloca_with_align.  While that is likely true for the cases where
the compiler emits this builtin for VLAs (and it will also be true that
the name of the VLA in that case can be taken from that identifier up to the
first .), the builtin is user accessible as the testcase shows, so one can
have any other SSA_NAME in there.  I think it would be better to add a
more reliable way to identify VLA names corresponding to
__builtin_alloca_with_align allocations, perhaps an internal fn or whatever,
but that is beyond the scope of this patch.
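
For illustration, the suffix stripping and the added NULL guard can be sketched in standalone C. The function name vla_name_length is hypothetical, and a plain C string (or NULL) stands in for the result of SSA_NAME_IDENTIFIER. The strcspn call with ".$" and the fallback to strlen mirror the lines in the patch below.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Return the length of the user-visible part of ID, i.e. everything up to
   the first '.' or '$'.  ID may be NULL, modelling an SSA name without an
   identifier; in that case there is nothing to strip and 0 is returned.  */
size_t
vla_name_length (const char *id)
{
  if (id == NULL)
    return 0;

  size_t len = strcspn (id, ".$");
  if (len == 0)
    /* The name starts with a separator: keep it whole.  */
    len = strlen (id);

  assert (len <= strlen (id));
  return len;
}
```

For an identifier such as "vla.5" the sketch yields 3, so only "vla" would be kept, while a NULL identifier is handled without dereferencing it.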

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2021-01-20  Jakub Jelinek  

PR tree-optimization/98721
* builtins.c (access_ref::inform_access): Don't assume
SSA_NAME_IDENTIFIER must be non-NULL.  Print messages about
object whenever allocfn is NULL, rather than only when DECL_P
is true.  Use %qE instead of %qD for that.  Formatting fixes.

* gcc.dg/pr98721-1.c: New test.
* gcc.dg/pr98721-2.c: New test.

--- gcc/builtins.c.jj   2021-01-18 19:07:16.022895507 +0100
+++ gcc/builtins.c  2021-01-19 11:56:52.247070923 +0100
@@ -4414,8 +4414,8 @@ access_ref::inform_access (access_mode m
 MAXREF on which the result is based.  */
   const offset_int orng[] =
{
-offrng[0] - maxref.offrng[0],
-wi::smax (offrng[1] - maxref.offrng[1], offrng[0]),
+ offrng[0] - maxref.offrng[0],
+ wi::smax (offrng[1] - maxref.offrng[1], offrng[0]),
};
 
   /* Add the final PHI's offset to that of each of the arguments
@@ -4493,12 +4493,15 @@ access_ref::inform_access (access_mode m
  /* Strip the SSA_NAME suffix from the variable name and
 recreate an identifier with the VLA's original name.  */
  ref = gimple_call_lhs (stmt);
- ref = SSA_NAME_IDENTIFIER (ref);
- const char *id = IDENTIFIER_POINTER (ref);
- size_t len = strcspn (id, ".$");
- if (!len)
-   len = strlen (id);
- ref = get_identifier_with_length (id, len);
+ if (SSA_NAME_IDENTIFIER (ref))
+   {
+ ref = SSA_NAME_IDENTIFIER (ref);
+ const char *id = IDENTIFIER_POINTER (ref);
+ size_t len = strcspn (id, ".$");
+ if (!len)
+   len = strlen (id);
+ ref = get_identifier_with_length (id, len);
+   }
}
  else
{
@@ -4557,13 +4560,13 @@ access_ref::inform_access (access_mode m
   return;
 }
 
-  if (DECL_P (ref))
+  if (allocfn == NULL_TREE)
 {
   if (*offstr)
-   inform (loc, "at offset %s into source object %qD of size %s",
+   inform (loc, "at offset %s into source object %qE of size %s",
offstr, ref, sizestr);
   else
-   inform (loc, "source object %qD of size %s", ref,  sizestr);
+   inform (loc, "source object %qE of size %s", ref, sizestr);
 
   return;
 }
--- gcc/testsuite/gcc.dg/pr98721-1.c.jj 2021-01-19 12:15:03.825600828 +0100
+++ gcc/testsuite/gcc.dg/pr98721-1.c 2021-01-19 12:14:24.730045488 +0100
@@ -0,0 +1,14 @@
+/* PR tree-optimization/98721 */
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+
+int
+foo