Re: [PATCH] LoongArch: Pass cache information to optimizer

2022-09-26 Thread Lulu Cheng
This change may have to wait for the test results to determine whether 
to merge.


On 2022/9/26 at 2:58 PM, Xi Ruoyao wrote:

Currently the cache information selected by -mtune is not actually used.
Pass it to the optimizer so that it takes effect.

gcc/ChangeLog:

* config/loongarch/loongarch.cc
(loongarch_option_override_internal): Set the corresponding
params for L1D cache line size, L1D cache size, and L2D cache
size.
---
  gcc/config/loongarch/loongarch.cc | 11 +++
  1 file changed, 11 insertions(+)

diff --git a/gcc/config/loongarch/loongarch.cc 
b/gcc/config/loongarch/loongarch.cc
index 98c0e26cdb9..81594cf5b98 100644
--- a/gcc/config/loongarch/loongarch.cc
+++ b/gcc/config/loongarch/loongarch.cc
@@ -63,6 +63,7 @@ along with GCC; see the file COPYING3.  If not see
  #include "context.h"
  #include "builtins.h"
  #include "rtl-iter.h"
+#include "opts.h"
  
  /* This file should be included last.  */

  #include "target-def.h"
@@ -6096,6 +6097,16 @@ loongarch_option_override_internal (struct gcc_options 
*opts)
if (loongarch_branch_cost == 0)
  loongarch_branch_cost = loongarch_cost->branch_cost;
  
+  const loongarch_cache &tune_cache =
+loongarch_cpu_cache[la_target.cpu_tune];
+
+  SET_OPTION_IF_UNSET (opts, &global_options_set, param_l1_cache_line_size,
+  tune_cache.l1d_line_size);
+  SET_OPTION_IF_UNSET (opts, &global_options_set, param_l1_cache_size,
+  tune_cache.l1d_size);
+  SET_OPTION_IF_UNSET (opts, &global_options_set, param_l2_cache_size,
+  tune_cache.l2d_size);
+
if (TARGET_DIRECT_EXTERN_ACCESS && flag_shlib)
  error ("%qs cannot be used for compiling a shared library",
   "-mdirect-extern-access");

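As a rough illustration of what SET_OPTION_IF_UNSET achieves in the patch above, the sketch below models the idea in plain C++: a tuning default is stored only when the user has not set the corresponding parameter explicitly. The types Params, ParamsSet and CacheInfo, the helper names, and all numeric values are hypothetical stand-ins, not GCC's real data structures.

```cpp
#include <cassert>

// Hypothetical stand-in for the option storage; not GCC's gcc_options.
struct Params {
  int l1_cache_line_size = 32;  // illustrative generic defaults
  int l1_cache_size = 64;
  int l2_cache_size = 512;
};

// One flag per parameter, true when the user set it explicitly.
struct ParamsSet {
  bool l1_cache_line_size = false;
  bool l1_cache_size = false;
  bool l2_cache_size = false;
};

// Hypothetical per-CPU cache description, loosely modelled on the
// fields the patch reads from tune_cache.
struct CacheInfo {
  int l1d_line_size;
  int l1d_size;
  int l2d_size;
};

// Store `value` only if the user has not set the parameter explicitly.
inline void set_if_unset(int &param, bool explicitly_set, int value) {
  if (!explicitly_set)
    param = value;
}

// Apply the tuning defaults for one CPU to the parameter set.
inline void apply_tune_cache(Params &p, const ParamsSet &set,
                             const CacheInfo &cache) {
  assert(cache.l1d_line_size > 0);  // sanity check on the table entry
  set_if_unset(p.l1_cache_line_size, set.l1_cache_line_size,
               cache.l1d_line_size);
  set_if_unset(p.l1_cache_size, set.l1_cache_size, cache.l1d_size);
  set_if_unset(p.l2_cache_size, set.l2_cache_size, cache.l2d_size);
}
```

In this model, a value the user passed explicitly survives, while the other two parameters pick up the per-CPU values.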



Re: [PATCH v2] testsuite: Skip intrinsics test if arm

2022-09-26 Thread Christophe Lyon via Gcc-patches

Hi,


On 9/23/22 19:24, Richard Sandiford via Gcc-patches wrote:

Torbjörn SVENSSON via Gcc-patches  writes:

In the test cases, it's clearly written that the intrinsics are not
implemented on arm*. A simple xfail does not help since there are
link errors, and that would cause an UNRESOLVED testcase rather than
XFAIL.
By changing to dg-skip-if, the entire test case is omitted.

gcc/testsuite/ChangeLog:

* gcc.target/aarch64/advsimd-intrinsics/vld1x2.c: Replace
dg-xfail-if with gd-skip-if.




Since Kyrill explicitly added the dg-xfail for arm in 
r8-6382-gda1f8d7f12c2ef , I am not sure he is OK with making the failure 
disappear?


Christophe


Typo: s/gd/dg/

OK with that change, thanks.

Richard


* gcc.target/aarch64/advsimd-intrinsics/vld1x3.c: Likewise.
* gcc.target/aarch64/advsimd-intrinsics/vld1x4.c: Likewise.

Co-Authored-By: Yvan ROUX  
Signed-off-by: Torbjörn SVENSSON  
---
  gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vld1x2.c | 2 +-
  gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vld1x3.c | 2 +-
  gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vld1x4.c | 2 +-
  3 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vld1x2.c 
b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vld1x2.c
index 92a139bc523..f933102be47 100644
--- a/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vld1x2.c
+++ b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vld1x2.c
@@ -1,6 +1,6 @@
  /* We haven't implemented these intrinsics for arm yet.  */
-/* { dg-xfail-if "" { arm*-*-* } } */
  /* { dg-do run } */
+/* { dg-skip-if "unsupported" { arm*-*-* } } */
  /* { dg-options "-O3" } */
  
  #include 

diff --git a/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vld1x3.c 
b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vld1x3.c
index 6ddd507d9cf..b20dec061b5 100644
--- a/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vld1x3.c
+++ b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vld1x3.c
@@ -1,6 +1,6 @@
  /* We haven't implemented these intrinsics for arm yet.  */
-/* { dg-xfail-if "" { arm*-*-* } } */
  /* { dg-do run } */
+/* { dg-skip-if "unsupported" { arm*-*-* } } */
  /* { dg-options "-O3" } */
  
  #include 

diff --git a/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vld1x4.c 
b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vld1x4.c
index 451a0afc6aa..e59f845880e 100644
--- a/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vld1x4.c
+++ b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vld1x4.c
@@ -1,6 +1,6 @@
  /* We haven't implemented these intrinsics for arm yet.  */
-/* { dg-xfail-if "" { arm*-*-* } } */
  /* { dg-do run } */
+/* { dg-skip-if "unsupported" { arm*-*-* } } */
  /* { dg-options "-O3" } */
  
  #include 


[PATCH][pushed] ranger: remove unused function

2022-09-26 Thread Martin Liška
gcc/ChangeLog:

* value-range.cc (tree_compare): Remove unused function.
---
 gcc/value-range.cc | 9 -
 1 file changed, 9 deletions(-)

diff --git a/gcc/value-range.cc b/gcc/value-range.cc
index 9ca442478c9..754379add19 100644
--- a/gcc/value-range.cc
+++ b/gcc/value-range.cc
@@ -258,15 +258,6 @@ frange::accept (const vrange_visitor &v) const
   v.visit (*this);
 }
 
-// Helper function to compare floats.  Returns TRUE if op1 .CODE. op2
-// is nonzero.
-
-static inline bool
-tree_compare (tree_code code, tree op1, tree op2)
-{
-  return !integer_zerop (fold_build2 (code, integer_type_node, op1, op2));
-}
-
 // Flush denormal endpoints to the appropriate 0.0.
 
 void
-- 
2.37.3



Re: [PATCH] c++: Don't quote nothrow in diagnostic

2022-09-26 Thread Richard Biener via Gcc-patches
On Fri, Sep 23, 2022 at 8:41 PM Marek Polacek via Gcc-patches
 wrote:
>
> In 
> Jason noticed that we quote "nothrow" in diagnostics even though it's
> not a keyword in C++.  Just removing the quotes didn't work because
> then -Wformat-diag complains, so this patch replaces it with "no-throw".
>
> Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?

That doesn't look like an improvement to me.  Can we quote 'nothrow()' instead?
I'd rather leave it alone than changing it to no-throw.  Why does -Wformat-diag
complain?  If we shouldn't quote nothrow that should be adjusted?

>
> gcc/cp/ChangeLog:
>
> * constraint.cc (diagnose_trait_expr): Say "no-throw" (without quotes)
> rather than "nothrow" in quotes.
>
> gcc/testsuite/ChangeLog:
>
> * g++.dg/cpp2a/concepts-traits3.C: Adjust expected diagnostics.
> ---
>  gcc/cp/constraint.cc  | 14 +++---
>  gcc/testsuite/g++.dg/cpp2a/concepts-traits3.C |  8 
>  2 files changed, 11 insertions(+), 11 deletions(-)
>
> diff --git a/gcc/cp/constraint.cc b/gcc/cp/constraint.cc
> index 5839bfb4b52..136647f7c9e 100644
> --- a/gcc/cp/constraint.cc
> +++ b/gcc/cp/constraint.cc
> @@ -3592,13 +3592,13 @@ diagnose_trait_expr (tree expr, tree args)
>switch (TRAIT_EXPR_KIND (expr))
>  {
>  case CPTK_HAS_NOTHROW_ASSIGN:
> -  inform (loc, "  %qT is not %<nothrow%> copy assignable", t1);
> +  inform (loc, "  %qT is not no-throw copy assignable", t1);
>break;
>  case CPTK_HAS_NOTHROW_CONSTRUCTOR:
> -  inform (loc, "  %qT is not %<nothrow%> default constructible", t1);
> +  inform (loc, "  %qT is not no-throw default constructible", t1);
>break;
>  case CPTK_HAS_NOTHROW_COPY:
> -  inform (loc, "  %qT is not %<nothrow%> copy constructible", t1);
> +  inform (loc, "  %qT is not no-throw copy constructible", t1);
>break;
>  case CPTK_HAS_TRIVIAL_ASSIGN:
>inform (loc, "  %qT is not trivially copy assignable", t1);
> @@ -3674,7 +3674,7 @@ diagnose_trait_expr (tree expr, tree args)
>inform (loc, "  %qT is not trivially assignable from %qT", t1, t2);
>break;
>  case CPTK_IS_NOTHROW_ASSIGNABLE:
> -  inform (loc, "  %qT is not %<nothrow%> assignable from %qT", t1, t2);
> +  inform (loc, "  %qT is not no-throw assignable from %qT", t1, t2);
>break;
>  case CPTK_IS_CONSTRUCTIBLE:
>if (!t2)
> @@ -3690,9 +3690,9 @@ diagnose_trait_expr (tree expr, tree args)
>break;
>  case CPTK_IS_NOTHROW_CONSTRUCTIBLE:
>if (!t2)
> -   inform (loc, "  %qT is not %<nothrow%> default constructible", t1);
> +   inform (loc, "  %qT is not no-throw default constructible", t1);
>else
> -   inform (loc, "  %qT is not %<nothrow%> constructible from %qE", t1, t2);
> +   inform (loc, "  %qT is not no-throw constructible from %qE", t1, t2);
>break;
>  case CPTK_HAS_UNIQUE_OBJ_REPRESENTATIONS:
>inform (loc, "  %qT does not have unique object representations", t1);
> @@ -3701,7 +3701,7 @@ diagnose_trait_expr (tree expr, tree args)
>inform (loc, "  %qT is not convertible from %qE", t2, t1);
>break;
>  case CPTK_IS_NOTHROW_CONVERTIBLE:
> -   inform (loc, "  %qT is not %<nothrow%> convertible from %qE", t2, t1);
> +   inform (loc, "  %qT is not no-throw convertible from %qE", t2, t1);
>break;
>  case CPTK_REF_CONSTRUCTS_FROM_TEMPORARY:
>inform (loc, "  %qT is not a reference that binds to a temporary "
> diff --git a/gcc/testsuite/g++.dg/cpp2a/concepts-traits3.C 
> b/gcc/testsuite/g++.dg/cpp2a/concepts-traits3.C
> index f20608b6918..6ac849d71fd 100644
> --- a/gcc/testsuite/g++.dg/cpp2a/concepts-traits3.C
> +++ b/gcc/testsuite/g++.dg/cpp2a/concepts-traits3.C
> @@ -21,7 +21,7 @@ concept TriviallyAssignable = __is_trivially_assignable(T, 
> U);
>
>  template
>  concept NothrowAssignable = __is_nothrow_assignable(T, U);
> -// { dg-message "'S' is not 'nothrow' assignable from 'int'" "" { target 
> *-*-* } .-1  }
> +// { dg-message "'S' is not no-throw assignable from 'int'" "" { target 
> *-*-* } .-1  }
>
>  template
>  concept Constructible = __is_constructible(T, Args...);
> @@ -37,9 +37,9 @@ concept TriviallyConstructible = 
> __is_trivially_constructible(T, Args...);
>
>  template
>  concept NothrowConstructible = __is_nothrow_constructible(T, Args...);
> -// { dg-message "'S' is not 'nothrow' default constructible" "" { target 
> *-*-* } .-1  }
> -// { dg-message "'S' is not 'nothrow' constructible from 'int'" "" { target 
> *-*-* } .-2  }
> -// { dg-message "'S' is not 'nothrow' constructible from 'int, char'" "" { 
> target *-*-* } .-3  }
> +// { dg-message "'S' is not no-throw default constructible" "" { target 
> *-*-* } .-1  }
> +// { dg-message "'S' is not no-throw constructible from 'int'" "" { target 
> *-*-* } .-2  }
> +// { dg-message "'S' is not no-throw constructible from 'int, char'" "" { 
> target *-*-* } .-3  }

Re: [PATCH] Fix profile count comparison.

2022-09-26 Thread Richard Biener via Gcc-patches
On Fri, Sep 23, 2022 at 8:53 PM Eugene Rozenfeld via Gcc-patches
 wrote:
>
> The comparison was incorrect when the counts weren't PRECISE.
> For example, crossmodule-indir-call-topn-1.c was failing
> with AutoFDO: when count_sum is 0 with quality AFDO,
> count_sum > profile_count::zero() evaluates to true. Taking that
> branch then leads to an assert in the call to to_sreal().
>
> Tested on x86_64-pc-linux-gnu.

OK

> gcc/ChangeLog:
>
> * ipa-cp.cc (good_cloning_opportunity_p): Fix profile count 
> comparison.
> ---
>  gcc/ipa-cp.cc | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/gcc/ipa-cp.cc b/gcc/ipa-cp.cc
> index 543a9334e2c..66bba71c068 100644
> --- a/gcc/ipa-cp.cc
> +++ b/gcc/ipa-cp.cc
> @@ -3338,9 +3338,9 @@ good_cloning_opportunity_p (struct cgraph_node *node, 
> sreal time_benefit,
>
>ipa_node_params *info = ipa_node_params_sum->get (node);
>int eval_threshold = opt_for_fn (node->decl, param_ipa_cp_eval_threshold);
> -  if (count_sum > profile_count::zero ())
> +  if (count_sum.nonzero_p ())
>  {
> -  gcc_assert (base_count > profile_count::zero ());
> +  gcc_assert (base_count.nonzero_p ());
>sreal factor = count_sum.probability_in (base_count).to_sreal ();
>sreal evaluation = (time_benefit * factor) / size_cost;
>evaluation = incorporate_penalties (node, info, evaluation);
> --
> 2.25.1
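
The pitfall described in this thread can be reproduced with a small model. The sketch below is a simplified stand-in, not GCC's real profile_count class: the Count type, its Quality enum and the ratio helper are hypothetical, and the comparison rules are assumptions chosen to match the behaviour reported above (a zero count of AFDO quality compares greater than the precise zero).

```cpp
#include <cassert>
#include <cstdint>

// Simplified model; NOT GCC's real profile_count.
enum class Quality { Afdo, Precise };

struct Count {
  std::uint64_t value;
  Quality quality;

  // The canonical zero is assumed to carry PRECISE quality.
  static Count zero() { return {0, Quality::Precise}; }

  // Equality looks at the quality as well as the value.
  bool operator==(const Count &other) const {
    return value == other.value && quality == other.quality;
  }

  // Assumed rule: a comparison against zero() reduces to
  // "is this count something other than the precise zero?".
  bool operator>(const Count &other) const {
    if (*this == zero())
      return false;
    if (other == zero())
      return true;
    return value > other.value;
  }

  // Looks only at the value, ignoring the quality.
  bool nonzero_p() const { return value != 0; }
};

// Ratio of `part` to `base`; only meaningful for a nonzero base,
// which is what the assert in the patched code guards.
inline double ratio(const Count &part, const Count &base) {
  assert(base.nonzero_p());
  return static_cast<double>(part.value) / static_cast<double>(base.value);
}
```

Under these assumptions a count of {0, Afdo} passes the `> zero()` test even though its value is zero, whereas nonzero_p() rejects it, which is why the value-only check is the safer guard before dividing.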


Re: [RFA] Minor improvement to coremark, avoid unconditional jump to return

2022-09-26 Thread Richard Biener via Gcc-patches
On Sun, Sep 25, 2022 at 6:29 PM Jeff Law  wrote:
>
> This is a minor improvement for the core_list_find routine in coremark.
>
>
> Basically for riscv, and likely other targets, we can end up with an
> unconditional jump to a return statement.  This is a result of
> compensation code created by bb-reorder, and no jump optimization pass
> runs after bb-reorder to clean this stuff up.
>
> This patch utilizes preexisting code to identify suitable branch targets
> as well as preexisting code to emit a suitable return, so it's pretty
> simple.  Note that when we arrange to do this optimization, the original
> return block may become unreachable. So we conditionally call
> delete_unreachable_blocks to fix that up.
>
> This triggers ~160 times during an x86_64 bootstrap.  Naturally it
> bootstraps and regression tests on x86_64.
>
> I've also bootstrapped this on riscv64, regression testing with qemu
> shows some regressions, but AFAICT they're actually qemu bugs with
> signal handling/delivery -- qemu user mode emulation is not consistently
> calling user defined signal handlers.  Given the same binary, sometimes
> they'll get called and the test passes, other times the handler isn't
> called and the test (of course) fails. I'll probably spend some time to
> try and chase this down for the sake of making testing easier.
>
>
> OK for the trunk?

OK.

Thanks,
Richard.

>
>
> Jeff
>
>
>
>


Re: [PATCH] Avoid depending on destructor order

2022-09-26 Thread Rainer Orth
Hi Jeff,

>>> Thanks for the patch.  I'll let you and Jason decide which style solution
>>> is preferred.
>> This also breaks bootstrap on Darwin at least, so an early solution would be
>> welcome (the fix here allows bootstrap to continue, testing on-going).
>> thanks,
>
> I'm using it in the automated tester as well -- without all the *-elf
> targets would fail to build libgcc.

things are even worse on targets that lack constructor priority support,
like Solaris 11.3 and Mac OS X 10.7/Darwin 11:

In file included from 
/vol/gcc/src/hg/master/local/libgcc/unwind-dw2-fde-dip.c:97:
/vol/gcc/src/hg/master/local/libgcc/unwind-dw2-fde.c:54:1: error: destructor 
priorities are not supported
   54 | release_registered_frames (void) __attribute__ ((destructor (110)));
  | ^

This is already checked for in libgcc/configure, and the situation
handled in libgcc/config/i386/cpuinfo.c.  The following patch unbroke
bootstrap on both affected targets and I saw no apparent regressions.
However, I cannot tell if the destructor priority is actually required
for correctness.

Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University


diff --git a/libgcc/unwind-dw2-fde.c b/libgcc/unwind-dw2-fde.c
--- a/libgcc/unwind-dw2-fde.c
+++ b/libgcc/unwind-dw2-fde.c
@@ -47,11 +47,17 @@ typedef __UINTPTR_TYPE__ uintptr_type;
 #ifdef ATOMIC_FDE_FAST_PATH
 #include "unwind-dw2-btree.h"
 
+#ifdef HAVE_INIT_PRIORITY
+#define DESTRUCTOR_PRIORITY (110)
+#else
+#define DESTRUCTOR_PRIORITY
+#endif
+
 static struct btree registered_frames;
 static bool in_shutdown;
 
 static void
-release_registered_frames (void) __attribute__ ((destructor (110)));
+release_registered_frames (void) __attribute__ ((destructor DESTRUCTOR_PRIORITY));
 static void
 release_registered_frames (void)
 {


RE: [PATCH 1/2]middle-end Fold BIT_FIELD_REF and Shifts into BIT_FIELD_REFs alone

2022-09-26 Thread Richard Biener via Gcc-patches
On Mon, 26 Sep 2022, Tamar Christina wrote:

> > -Original Message-
> > From: Andrew Pinski 
> > Sent: Saturday, September 24, 2022 8:57 PM
> > To: Tamar Christina 
> > Cc: gcc-patches@gcc.gnu.org; nd ; rguent...@suse.de
> > Subject: Re: [PATCH 1/2]middle-end Fold BIT_FIELD_REF and Shifts into
> > BIT_FIELD_REFs alone
> > 
> > On Fri, Sep 23, 2022 at 4:43 AM Tamar Christina via Gcc-patches  > patc...@gcc.gnu.org> wrote:
> > >
> > > Hi All,
> > >
> > > This adds a match.pd rule that can fold right shifts and
> > > bit_field_refs of integers into just a bit_field_ref by adjusting the
> > > offset and the size of the extract and adds an extend to the previous 
> > > size.
> > >
> > > Concretely turns:
> > >
> > > #include 
> > >
> > > unsigned int foor (uint32x4_t x)
> > > {
> > > return x[1] >> 16;
> > > }
> > >
> > > which used to generate:
> > >
> > >   _1 = BIT_FIELD_REF ;
> > >   _3 = _1 >> 16;
> > >
> > > into
> > >
> > >   _4 = BIT_FIELD_REF ;
> > >   _2 = (unsigned int) _4;
> > >
> > > I currently limit the rewrite to only doing it if the resulting
> > > extract is in a mode the target supports. i.e. it won't rewrite it to
> > > extract say 13-bits because I worry that for targets that won't have a
> > > bitfield extract instruction this may be a de-optimization.
> > 
> > It is only a de-optimization for the following case:
> > * vector extraction
> > 
> > All other cases should be handled correctly in the middle-end when
> > expanding to RTL because they need to be handled for bit-fields anyways.
> > Plus SIGN_EXTRACT and ZERO_EXTRACT would be used in the integer case
> > for the RTL.
> > Getting SIGN_EXTRACT/ZERO_EXTRACT early on in the RTL is better than
> > waiting until combine really.
> > 
> 
> Fair enough, I've dropped the constraint.
> 
> > 
> > >
> > > Bootstrapped Regtested on aarch64-none-linux-gnu, x86_64-pc-linux-gnu
> > > and no issues.
> > >
> > > Testcase are added in patch 2/2.
> > >
> > > Ok for master?
> > >
> > > Thanks,
> > > Tamar
> > >
> > > gcc/ChangeLog:
> > >
> > > * match.pd: Add bitfield and shift folding.
> > >
> > > --- inline copy of patch --
> > > diff --git a/gcc/match.pd b/gcc/match.pd index
> > >
> > 1d407414bee278c64c00d425d9f025c1c58d853d..b225d36dc758f1581502c8d03
> > 761
> > > 544bfd499c01 100644
> > > --- a/gcc/match.pd
> > > +++ b/gcc/match.pd
> > > @@ -7245,6 +7245,23 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
> > >&& ANY_INTEGRAL_TYPE_P (type) && ANY_INTEGRAL_TYPE_P
> > (TREE_TYPE(@0)))
> > >(IFN_REDUC_PLUS_WIDEN @0)))
> > >
> > > +/* Canonicalize BIT_FIELD_REFS and shifts to BIT_FIELD_REFS.  */
> > > (for
> > > +shift (rshift)
> > > + op (plus)

why have a for when you only iterate over a single operation?!  And 'op'
seems unused?

> > > + (simplify
> > > +  (shift (BIT_FIELD_REF @0 @1 @2) integer_pow2p@3)
> > > +  (if (INTEGRAL_TYPE_P (type))
> > > +   (with { /* Can't use wide-int here as the precision differs between
> > > + @1 and @3.  */
> > > +  unsigned HOST_WIDE_INT size = tree_to_uhwi (@1);
> > > +  unsigned HOST_WIDE_INT shiftc = tree_to_uhwi (@3);

But you should then test tree_fits_uhwi_p.

> > > +  unsigned HOST_WIDE_INT newsize = size - shiftc;
> > > +  tree nsize = wide_int_to_tree (bitsizetype, newsize);
> > > +  tree ntype
> > > += build_nonstandard_integer_type (newsize, 1); }

build_nonstandard_integer_type never fails so I don't see how
you "limit" this to extractions fitting a mode.

I'm quite sure this breaks with BYTES_BIG_ENDIAN.  Please try
BIT_FIELD_REF _offsets_ that make the extraction cross byte
boundaries.

Also I'm missing a testcase?

Thanks,
Richard.

> > Maybe use `build_nonstandard_integer_type (newsize, /* unsignedp = */
> > true);` or better yet `build_nonstandard_integer_type (newsize,
> > UNSIGNED);`
> 
> Ah, will do,
> Tamar.
> 
> > 
> > I had started to convert some of the unsignedp into enum signop but I never
> > finished or submitted the patch.
> > 
> > Thanks,
> > Andrew Pinski
> > 
> > 
> > > +(if (ntype)
> > > + (convert:type (BIT_FIELD_REF:ntype @0 { nsize; } (op @2
> > > + @3
> > > +
> > >  (simplify
> > >   (BIT_FIELD_REF (BIT_FIELD_REF @0 @1 @2) @3 @4)
> > >   (BIT_FIELD_REF @0 @3 { const_binop (PLUS_EXPR, bitsizetype, @2, @4);
> > > }))
> > >
> > >
> > >
> > >
> > > --
> 

-- 
Richard Biener 
SUSE Software Solutions Germany GmbH, Frankenstrasse 146, 90461 Nuernberg,
Germany; GF: Ivo Totev, Andrew Myers, Andrew McDonald, Boudien Moerman;
HRB 36809 (AG Nuernberg)
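
The arithmetic behind the proposed fold can be checked on plain integers. The sketch below is only a model, assuming little-endian bit numbering (bit 0 is the least significant bit) and a 64-bit container standing in for the vector; it does not address the BYTES_BIG_ENDIAN concern raised above. The function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Extract `size` bits starting at bit `offset`, zero-extended.
inline std::uint64_t extract_bits(std::uint64_t value, unsigned offset,
                                  unsigned size) {
  assert(size >= 1 && size <= 64 && offset <= 64 - size);
  std::uint64_t shifted = value >> offset;
  return size == 64 ? shifted : shifted & ((std::uint64_t{1} << size) - 1);
}

// Original form: extract a field, then shift it right.
inline std::uint64_t extract_then_shift(std::uint64_t value, unsigned offset,
                                        unsigned size, unsigned shift) {
  assert(shift < size);
  return extract_bits(value, offset, size) >> shift;
}

// Folded form: one narrower extraction at an adjusted offset.
// The zero-extension done by extract_bits plays the role of the
// widening conversion back to the original type.
inline std::uint64_t folded_extract(std::uint64_t value, unsigned offset,
                                    unsigned size, unsigned shift) {
  assert(shift < size);
  return extract_bits(value, offset + shift, size - shift);
}
```

In this model, a 32-bit field at offset 32 shifted right by 16 equals a 16-bit field at offset 48, and the same identity holds for any shift smaller than the field size.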


RE: [PATCH v2] testsuite: Skip intrinsics test if arm

2022-09-26 Thread Kyrylo Tkachov via Gcc-patches


> -Original Message-
> From: Christophe Lyon 
> Sent: Monday, September 26, 2022 8:41 AM
> To: Torbjörn SVENSSON via Gcc-patches ;
> Torbjörn SVENSSON ; Richard Sandiford
> ; Kyrylo Tkachov 
> Subject: Re: [PATCH v2] testsuite: Skip intrinsics test if arm
> 
> Hi,
> 
> 
> On 9/23/22 19:24, Richard Sandiford via Gcc-patches wrote:
> > Torbjörn SVENSSON via Gcc-patches  writes:
> >> In the test cases, it's clearly written that the intrinsics are not
> >> implemented on arm*. A simple xfail does not help since there are
> >> link errors, and that would cause an UNRESOLVED testcase rather than
> >> XFAIL.
> >> By changing to dg-skip-if, the entire test case is omitted.
> >>
> >> gcc/testsuite/ChangeLog:
> >>
> >>* gcc.target/aarch64/advsimd-intrinsics/vld1x2.c: Replace
> >>dg-xfail-if with gd-skip-if.
> >
> 
> Since Kyrill explicitly added the dg-xfail for arm in
> r8-6382-gda1f8d7f12c2ef , I am not sure he is OK with making the failure
> disappear?

Thanks for checking. These intrinsics are not implemented in arm, but they 
should be. We just never got around to adding the support.
I don't know if we have a recommendation for how to mark such cases.
So I'm okay with changing this to dg-skip-if, but perhaps we should change the 
message to "unimplemented" rather than "unsupported"?
Thanks,
Kyrill

> 
> Christophe
> 
> > Typo: s/gd/dg/
> >
> > OK with that change, thanks.
> >
> > Richard
> >
> >>* gcc.target/aarch64/advsimd-intrinsics/vld1x3.c: Likewise.
> >>* gcc.target/aarch64/advsimd-intrinsics/vld1x4.c: Likewise.
> >>
> >> Co-Authored-By: Yvan ROUX  
> >> Signed-off-by: Torbjörn SVENSSON  
> >> ---
> >>   gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vld1x2.c | 2 +-
> >>   gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vld1x3.c | 2 +-
> >>   gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vld1x4.c | 2 +-
> >>   3 files changed, 3 insertions(+), 3 deletions(-)
> >>
> >> diff --git a/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vld1x2.c
> b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vld1x2.c
> >> index 92a139bc523..f933102be47 100644
> >> --- a/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vld1x2.c
> >> +++ b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vld1x2.c
> >> @@ -1,6 +1,6 @@
> >>   /* We haven't implemented these intrinsics for arm yet.  */
> >> -/* { dg-xfail-if "" { arm*-*-* } } */
> >>   /* { dg-do run } */
> >> +/* { dg-skip-if "unsupported" { arm*-*-* } } */
> >>   /* { dg-options "-O3" } */
> >>
> >>   #include 
> >> diff --git a/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vld1x3.c
> b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vld1x3.c
> >> index 6ddd507d9cf..b20dec061b5 100644
> >> --- a/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vld1x3.c
> >> +++ b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vld1x3.c
> >> @@ -1,6 +1,6 @@
> >>   /* We haven't implemented these intrinsics for arm yet.  */
> >> -/* { dg-xfail-if "" { arm*-*-* } } */
> >>   /* { dg-do run } */
> >> +/* { dg-skip-if "unsupported" { arm*-*-* } } */
> >>   /* { dg-options "-O3" } */
> >>
> >>   #include 
> >> diff --git a/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vld1x4.c
> b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vld1x4.c
> >> index 451a0afc6aa..e59f845880e 100644
> >> --- a/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vld1x4.c
> >> +++ b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vld1x4.c
> >> @@ -1,6 +1,6 @@
> >>   /* We haven't implemented these intrinsics for arm yet.  */
> >> -/* { dg-xfail-if "" { arm*-*-* } } */
> >>   /* { dg-do run } */
> >> +/* { dg-skip-if "unsupported" { arm*-*-* } } */
> >>   /* { dg-options "-O3" } */
> >>
> >>   #include 


Re: [PATCH] Optimize nested permutation to single VEC_PERM_EXPR [PR54346]

2022-09-26 Thread Richard Biener via Gcc-patches
On Mon, Sep 26, 2022 at 8:58 AM Liwei Xu  wrote:
>
> This patch implements the optimization in PR 54346, which merges
>
> c = VEC_PERM_EXPR ;
> d = VEC_PERM_EXPR ;
> to
> d = VEC_PERM_EXPR ;
>
> Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}
> tree-ssa/forwprop-19.c fails, but I'm not sure whether it
> is OK to remove it.

Looks good, but leave Richard a chance to ask for VLA vector support which
might be trivial to do.

Btw, doesn't this handle the VEC_PERM + VEC_PERM case in
tree-ssa-forwprop.cc:simplify_permutation as well?  Note _that_ does
seem to handle VLA vectors.

Thanks,
Richard.

> gcc/ChangeLog:
>
> PR target/54346
> * match.pd: Merge the index of VCST then generates the new vec_perm.
>
> gcc/testsuite/ChangeLog:
>
> PR target/54346
> * gcc.dg/pr54346.c: New test.
>
> Co-authored-by: liuhongt 
> ---
>  gcc/match.pd   | 41 ++
>  gcc/testsuite/gcc.dg/pr54346.c | 13 +++
>  2 files changed, 54 insertions(+)
>  create mode 100755 gcc/testsuite/gcc.dg/pr54346.c
>
> diff --git a/gcc/match.pd b/gcc/match.pd
> index 345bcb701a5..9219b0a10e1 100644
> --- a/gcc/match.pd
> +++ b/gcc/match.pd
> @@ -8086,6 +8086,47 @@ and,
>(minus (mult (vec_perm @1 @1 @3) @2) @4)))
>
>
> +/* (PR54346) Merge
> +   c = VEC_PERM_EXPR ;
> +   d = VEC_PERM_EXPR ;
> +   to
> +   d = VEC_PERM_EXPR ; */
> +
> +(simplify
> + (vec_perm (vec_perm@0 @1 @2 VECTOR_CST@3) @0 VECTOR_CST@4)
> + (with
> +  {
> +if(!TYPE_VECTOR_SUBPARTS (type).is_constant())
> +  return NULL_TREE;
> +
> +tree op0;
> +machine_mode result_mode = TYPE_MODE (type);
> +machine_mode op_mode = TYPE_MODE (TREE_TYPE (@1));
> +int nelts = TYPE_VECTOR_SUBPARTS (type).to_constant();
> +vec_perm_builder builder0;
> +vec_perm_builder builder1;
> +vec_perm_builder builder2 (nelts, nelts, 1);
> +
> +if (!tree_to_vec_perm_builder (&builder0, @3)
> +|| !tree_to_vec_perm_builder (&builder1, @4))
> +  return NULL_TREE;
> +
> +vec_perm_indices sel0 (builder0, 2, nelts);
> +vec_perm_indices sel1 (builder1, 1, nelts);
> +
> +for (int i = 0; i < nelts; i++)
> +  builder2.quick_push (sel0[sel1[i].to_constant()]);
> +
> +vec_perm_indices sel2 (builder2, 2, nelts);
> +
> +if (!can_vec_perm_const_p (result_mode, op_mode, sel2, false))
> +  return NULL_TREE;
> +
> +op0 = vec_perm_indices_to_tree (TREE_TYPE (@4), sel2);
> +  }
> +  (vec_perm @1 @2 { op0; })))
> +
> +
>  /* Match count trailing zeroes for simplify_count_trailing_zeroes in fwprop.
> The canonical form is array[((x & -x) * C) >> SHIFT] where C is a magic
> constant which when multiplied by a power of 2 contains a unique value
> diff --git a/gcc/testsuite/gcc.dg/pr54346.c b/gcc/testsuite/gcc.dg/pr54346.c
> new file mode 100755
> index 000..d87dc3a79a5
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/pr54346.c
> @@ -0,0 +1,13 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O -fdump-tree-dse1" } */
> +
> +typedef int veci __attribute__ ((vector_size (4 * sizeof (int))));
> +
> +void fun (veci a, veci b, veci *i)
> +{
> +  veci c = __builtin_shuffle (a, b, __extension__ (veci) {1, 4, 2, 7});
> +  *i = __builtin_shuffle (c, __extension__ (veci) { 7, 2, 1, 5 });
> +}
> +
> +/* { dg-final { scan-tree-dump "VEC_PERM_EXPR.*{ 3, 6, 0, 0 }" "dse1" } } */
> +/* { dg-final { scan-tree-dump-times "VEC_PERM_EXPR" 1 "dse1" } } */
> \ No newline at end of file
> --
> 2.18.2
>
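
The index composition at the heart of this fold can be sketched on fixed-size arrays. This is a model under stated assumptions, not the match.pd implementation: the vector length is fixed at 4, the helper names are hypothetical, two-input selectors wrap modulo 2*N and single-input selectors wrap modulo N. The composed indices are expressed relative to the operand order (a, b), which need not be the form a compiler dump prints.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

constexpr std::size_t N = 4;
using Vec = std::array<int, N>;
using Mask = std::array<unsigned, N>;

// Two-input permute: index < N selects from a, otherwise from b.
inline Vec permute2(const Vec &a, const Vec &b, const Mask &sel) {
  Vec result{};
  for (std::size_t i = 0; i < N; ++i) {
    unsigned idx = sel[i] % (2 * N);
    result[i] = idx < N ? a[idx] : b[idx - N];
  }
  return result;
}

// One-input permute: indices wrap modulo N.
inline Vec permute1(const Vec &c, const Mask &sel) {
  Vec result{};
  for (std::size_t i = 0; i < N; ++i)
    result[i] = c[sel[i] % N];
  return result;
}

// Compose the selectors so that
//   permute1 (permute2 (a, b, sel0), sel1) == permute2 (a, b, compose (sel0, sel1)).
inline Mask compose(const Mask &sel0, const Mask &sel1) {
  Mask result{};
  for (std::size_t i = 0; i < N; ++i) {
    result[i] = sel0[sel1[i] % N] % (2 * N);
    assert(result[i] < 2 * N);
  }
  return result;
}
```

With the selectors from the testcase, {1, 4, 2, 7} followed by {7, 2, 1, 5}, this model yields the single selector {7, 2, 4, 4} over (a, b).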


Re: [PATCH 2/2]AArch64 Add support for neg on v1df

2022-09-26 Thread Richard Biener via Gcc-patches
On Fri, 23 Sep 2022, Richard Sandiford wrote:

> Tamar Christina  writes:
> >> -Original Message-
> >> From: Richard Sandiford 
> >> Sent: Friday, September 23, 2022 6:04 AM
> >> To: Tamar Christina 
> >> Cc: gcc-patches@gcc.gnu.org; nd ; Richard Earnshaw
> >> ; Marcus Shawcroft
> >> ; Kyrylo Tkachov 
> >> Subject: Re: [PATCH 2/2]AArch64 Add support for neg on v1df
> >> 
> >> Tamar Christina  writes:
> >> >> -Original Message-
> >> >> From: Richard Sandiford 
> >> >> Sent: Friday, September 23, 2022 5:30 AM
> >> >> To: Tamar Christina 
> >> >> Cc: gcc-patches@gcc.gnu.org; nd ; Richard Earnshaw
> >> >> ; Marcus Shawcroft
> >> >> ; Kyrylo Tkachov
> >> 
> >> >> Subject: Re: [PATCH 2/2]AArch64 Add support for neg on v1df
> >> >>
> >> >> Tamar Christina  writes:
> >> >> > Hi All,
> >> >> >
> >> >> > This adds support for using scalar fneg on the V1DF type.
> >> >> >
> >> >> > Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.
> >> >> >
> >> >> > Ok for master?
> >> >>
> >> >> Why just this one operation though?  Couldn't we extend iterators
> >> >> like
> >> >> GPF_F16 to include V1DF, avoiding the need for new patterns?
> >> >>
> >> >
> >> > Simply because it's the only one I know how to generate code for.
> >> > I can change GPF_F16 but I don't know under which circumstances we'd
> >> > generate a V1DF for the other operations.
> >> 
> >> We'd do it for things like:
> >> 
> >> __Float64x1_t foo (__Float64x1_t x) { return -x; }
> >> 
> >> if the pattern is available, instead of using subregs.  So one way would 
> >> be to
> >> scan the expand rtl dump for subregs.
> >
> > Ahh yes, I forgot about that ACLE type.
> >
> >> 
> >> If the point is that there is no observable difference between defining 1-
> >> element vector ops and not, except for this one case, then that suggests we
> >> should handle this case in target-independent code instead.  There's no 
> >> point
> >> forcing every target that has V1DF to define a duplicate of the DF neg
> >> pattern.
> >
> > My original approach was to indeed use DF instead of V1DF, however since we
> > do define V1DF I had expected the mode to be somewhat usable.
> >
> > So I'm happy to do whichever one you prefer now that I know how to test it.
> > I can either change my mid-end code, or extend the coverage of V1DF, any 
> > preference? ?
> 
> I don't mind really, as long as we're consistent.  Maybe Richi has an opinion.
> 
> If he doesn't mind either, then I guess it makes sense to define the ops
> as completely as possible (e.g. equivalently to V2SF), although it doesn't
> need to be all in one go.

I don't mind either.  We'll see if there's a target whose vector registers
do not overlap the FP registers at some point; then it probably matters,
so it does seem we should support both variants from the middle-end
at least.  If we have some noop-conversion target hook that tells
us this, RTL expansion could use a fallback generating subregs
for V1mode modes.

Richard.

> Thanks,
> Richard
> 
> > Tamar
> >
> >> 
> >> Thanks,
> >> Richard
> >> >
> >> > So if it's ok to do so without full test coverage I'm happy to do so...
> >> >
> >> > Tamar.
> >> >
> >> >> Richard
> >> >>
> >> >> >
> >> >> > Thanks,
> >> >> > Tamar
> >> >> >
> >> >> > gcc/ChangeLog:
> >> >> >
> >> >> >   * config/aarch64/aarch64-simd.md (negv1df2): New.
> >> >> >
> >> >> > gcc/testsuite/ChangeLog:
> >> >> >
> >> >> >   * gcc.target/aarch64/simd/addsub_2.c: New test.
> >> >> >
> >> >> > --- inline copy of patch --
> >> >> > diff --git a/gcc/config/aarch64/aarch64-simd.md
> >> >> > b/gcc/config/aarch64/aarch64-simd.md
> >> >> > index
> >> >> >
> >> >>
> >> f4152160084d6b6f34bd69f0ba6386c1ab50f77e..cf8c094bd4b76981cef2dd5dd7
> >> >> b8
> >> >> > e6be0d56101f 100644
> >> >> > --- a/gcc/config/aarch64/aarch64-simd.md
> >> >> > +++ b/gcc/config/aarch64/aarch64-simd.md
> >> >> > @@ -2713,6 +2713,14 @@ (define_insn "neg2"
> >> >> >[(set_attr "type" "neon_fp_neg_")]
> >> >> >  )
> >> >> >
> >> >> > +(define_insn "negv1df2"
> >> >> > + [(set (match_operand:V1DF 0 "register_operand" "=w")
> >> >> > +   (neg:V1DF (match_operand:V1DF 1 "register_operand" "w")))]
> >> >> > +"TARGET_SIMD"
> >> >> > + "fneg\\t%d0, %d1"
> >> >> > +  [(set_attr "type" "neon_fp_neg_d")]
> >> >> > +)
> >> >> > +
> >> >> >  (define_insn "abs2"
> >> >> >   [(set (match_operand:VHSDF 0 "register_operand" "=w")
> >> >> > (abs:VHSDF (match_operand:VHSDF 1 "register_operand"
> >> >> > "w")))] diff --git
> >> >> > a/gcc/testsuite/gcc.target/aarch64/simd/addsub_2.c
> >> >> > b/gcc/testsuite/gcc.target/aarch64/simd/addsub_2.c
> >> >> > new file mode 100644
> >> >> > index
> >> >> >
> >> >>
> >> ..55a7365e897f8af509de953129
> >> >> e0
> >> >> > f516974f7ca8
> >> >> > --- /dev/null
> >> >> > +++ b/gcc/testsuite/gcc.target/aarch64/simd/addsub_2.c
> >> >> > @@ -0,0 +1,22 @@
> >> >> > +/* { dg-do compile } */
> >> >> > +/* { dg-options "-Ofast" } */
> >> >> > +/* { dg-final { c

Re: [Patch] OpenACC: Fix reduction tree-sharing issue [PR106982]

2022-09-26 Thread Richard Biener via Gcc-patches
On Fri, Sep 23, 2022 at 5:25 PM Tobias Burnus  wrote:
>
> This fixes a tree-sharing ICE. It seems as if all unshare_expr
> I added were required in this case. The first long testcase is
> based on the real testcase from the OpenACC testsuite, the second
> one is what reduction produced - but I thought some nested reduction
> might be interesting as well; hence, I included both tests.
>
>
> Bootstrapped and regtested on x86-64-gnu-linux w/o offloading.
> OK for mainline and GCC 12?

Looks like v1/v2/v3 are now unshared twice, and unsharing outgoing is
better done where it's used.  That said, please put the unshares
at places where new things are built; that's much clearer.  That means
the 'outgoing' at

gimplify_assign (outgoing, teardown_call, &after_join);
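
To illustrate why the copy belongs at the point of use, here is a toy
model in plain C.  It is only a sketch: struct node, make_node and
toy_unshare are hypothetical stand-ins, not GCC's tree type or its
unshare_expr.  It shows that two statements holding the same operand
node see each other's in-place rewrites, while copying the operand
where each statement is built keeps them independent.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy expression node; a stand-in for a GCC tree, not the real type.  */
struct node
{
  int value;
  struct node *left;
  struct node *right;
};

static struct node *
make_node (int value, struct node *left, struct node *right)
{
  struct node *n = malloc (sizeof *n);
  assert (n != NULL);
  n->value = value;
  n->left = left;
  n->right = right;
  return n;
}

/* Deep copy; plays the role that unshare_expr plays in GCC.  */
static struct node *
toy_unshare (const struct node *n)
{
  if (n == NULL)
    return NULL;
  return make_node (n->value, toy_unshare (n->left), toy_unshare (n->right));
}

static void
free_tree (struct node *n)
{
  if (n == NULL)
    return;
  free_tree (n->left);
  free_tree (n->right);
  free (n);
}

/* Both statements share one operand node: rewriting the operand through
   stmt_a is visible through stmt_b.  Returns what stmt_b sees.  */
int
demo_shared (void)
{
  struct node *outgoing = make_node (1, NULL, NULL);
  struct node *stmt_a = make_node (10, outgoing, NULL);
  struct node *stmt_b = make_node (20, outgoing, NULL);

  stmt_a->left->value = 99;
  int seen = stmt_b->left->value;

  /* Free node by node; a recursive free would release outgoing twice.  */
  free (stmt_a);
  free (stmt_b);
  free (outgoing);
  return seen;
}

/* Each statement gets its own copy at the place where it is built, so
   the rewrite stays local.  Returns what stmt_b sees.  */
int
demo_unshared (void)
{
  struct node *outgoing = make_node (1, NULL, NULL);
  struct node *stmt_a = make_node (10, toy_unshare (outgoing), NULL);
  struct node *stmt_b = make_node (20, toy_unshare (outgoing), NULL);

  stmt_a->left->value = 99;
  int seen = stmt_b->left->value;

  free_tree (stmt_a);
  free_tree (stmt_b);
  free_tree (outgoing);
  return seen;
}
```

In this model demo_shared returns the rewritten value and demo_unshared
returns the original one; copying once, where the new statement is
built, is enough.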

Richard.

> (It gives an ICE with GCC 10 but not with GCC 9; thus,
> more regression-fix backporting would be possible,
> if someone cares.)
>
> Tobias
>
>


[COMMITTED] ada: Tune comment of routine for detecting junk names

2022-09-26 Thread Marc Poulhiès via Gcc-patches
From: Piotr Trojanek 

Reword comment to avoid repetition between spec and body.

gcc/ada/

* sem_warn.ads (Has_Junk_Name): Reword comment.

Tested on x86_64-pc-linux-gnu, committed on master.

---
 gcc/ada/sem_warn.ads | 9 +++--
 1 file changed, 3 insertions(+), 6 deletions(-)

diff --git a/gcc/ada/sem_warn.ads b/gcc/ada/sem_warn.ads
index 1894f36f4b0..6681e545a35 100644
--- a/gcc/ada/sem_warn.ads
+++ b/gcc/ada/sem_warn.ads
@@ -257,12 +257,9 @@ package Sem_Warn is
--
 
function Has_Junk_Name (E : Entity_Id) return Boolean;
-   --  Return True if the entity name contains any of the following substrings:
-   --discard
-   --dummy
-   --ignore
-   --junk
-   --unused
+   --  Return True if the entity name contains substrings like "junk" or
+   --  "dummy" (see the body for the complete list).
+   --
--  Used to suppress warnings on names matching these patterns. The contents
--  of Name_Buffer and Name_Len are destroyed by this call.
 
-- 
2.25.1



[COMMITTED] ada: Remove definition of MAXPATHLEN for ancient MinGW

2022-09-26 Thread Marc Poulhiès via Gcc-patches
From: Piotr Trojanek 

Modern MinGW defines MAXPATHLEN in sys/param.h, so better to use it
directly.
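
A minimal sketch of what relying on the system header looks like,
assuming a toolchain whose sys/param.h defines MAXPATHLEN (the commit
message states this for modern MinGW; glibc does the same).  The
function name path_buffer_size is hypothetical and only there to make
the value observable.

```c
#include <stddef.h>
#include <sys/param.h>  /* Expected to define MAXPATHLEN.  */

/* No local fallback any more: fail loudly if the header does not
   provide the macro.  */
#ifndef MAXPATHLEN
# error "MAXPATHLEN is not provided by <sys/param.h> on this platform"
#endif

/* Size of a buffer large enough to hold any path name.  */
size_t
path_buffer_size (void)
{
  return (size_t) MAXPATHLEN;
}
```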

gcc/ada/

* mingw32.h: Remove conditional definition of MAXPATHLEN; the include
directive for stdlib.h was most likely intended to provide the
MAX_PATH.

Tested on x86_64-pc-linux-gnu, committed on master.

---
 gcc/ada/mingw32.h | 6 --
 1 file changed, 6 deletions(-)

diff --git a/gcc/ada/mingw32.h b/gcc/ada/mingw32.h
index bf8577bb1d4..a190d51076f 100644
--- a/gcc/ada/mingw32.h
+++ b/gcc/ada/mingw32.h
@@ -99,10 +99,4 @@ extern UINT __gnat_current_ccs_encoding;
 #define WS2S(str,wstr,len) strncpy(str,wstr,len)
 #endif
 
-#include 
-
-#ifndef MAXPATHLEN
-#define MAXPATHLEN MAX_PATH
-#endif
-
 #endif /* _MINGW32_H */
-- 
2.25.1



[COMMITTED] ada: Deconstruct build support for ancient MinGW

2022-09-26 Thread Marc Poulhiès via Gcc-patches
From: Piotr Trojanek 

Remove conditional C code for building GNAT with MinGW earlier than 2.0,
which was released in 2007.

gcc/ada/

* adaint.c: Remove conditional #include directives for old MinGW.
* cal.c: Always include winsock.h, since it is part of modern
MinGW.
* cstreams.c: Remove workaround for old MinGW.
* expect.c: Remove conditional #include directive for old MinGW.
* mingw32.h: Remove STD_MINGW and OLD_MINGW declarations.
* sysdep.c: Remove conditional #include directive for old MinGW.

Tested on x86_64-pc-linux-gnu, committed on master.

---
 gcc/ada/adaint.c   | 13 ++---
 gcc/ada/cal.c  |  2 --
 gcc/ada/cstreams.c |  8 
 gcc/ada/expect.c   |  8 ++--
 gcc/ada/mingw32.h  | 17 -
 gcc/ada/sysdep.c   |  6 +-
 6 files changed, 5 insertions(+), 49 deletions(-)

diff --git a/gcc/ada/adaint.c b/gcc/ada/adaint.c
index 2ae4dedeb2b..199dbe0e405 100644
--- a/gcc/ada/adaint.c
+++ b/gcc/ada/adaint.c
@@ -200,11 +200,7 @@ UINT __gnat_current_ccs_encoding;
 #endif
 
 /* wait.h processing */
-#ifdef __MINGW32__
-# if OLD_MINGW
-#  include 
-# endif
-#elif defined (__vxworks) && defined (__RTP__)
+#if defined (__vxworks) && defined (__RTP__)
 # include 
 #elif defined (__Lynx__)
 /* ??? We really need wait.h and it includes resource.h on Lynx.  GCC
@@ -214,7 +210,7 @@ UINT __gnat_current_ccs_encoding;
preventing the inclusion of the GCC header from doing anything.  */
 # define GCC_RESOURCE_H
 # include 
-#elif defined (__PikeOS__)
+#elif defined (__PikeOS__) || defined (__MINGW32__)
 /* No wait() or waitpid() calls available.  */
 #else
 /* Default case.  */
@@ -335,11 +331,6 @@ const char *__gnat_library_template = 
GNAT_LIBRARY_TEMPLATE;
 
 #if defined (__MINGW32__)
 #include "mingw32.h"
-
-#if OLD_MINGW
-#include 
-#endif
-
 #else
 #include 
 #endif
diff --git a/gcc/ada/cal.c b/gcc/ada/cal.c
index e1ab6922b89..09bcc15c4b3 100644
--- a/gcc/ada/cal.c
+++ b/gcc/ada/cal.c
@@ -53,10 +53,8 @@
 
 #ifdef __MINGW32__
 #include "mingw32.h"
-#if STD_MINGW
 #include 
 #endif
-#endif
 
 void
 __gnat_timeval_to_duration (struct timeval *t, long long *sec, long *usec)
diff --git a/gcc/ada/cstreams.c b/gcc/ada/cstreams.c
index 10cc3a6faf8..fc583e17004 100644
--- a/gcc/ada/cstreams.c
+++ b/gcc/ada/cstreams.c
@@ -97,14 +97,6 @@ extern "C" {
 #undef fileno
 #endif
 
-/* The _IONBF value in MINGW32 stdio.h is wrong.  */
-#if defined (WINNT) || defined (_WINNT)
-#if OLD_MINGW
-#undef _IONBF
-#define _IONBF 0004
-#endif
-#endif
-
 int
 __gnat_feof (FILE *stream)
 {
diff --git a/gcc/ada/expect.c b/gcc/ada/expect.c
index b1889feff37..48fb1076e91 100644
--- a/gcc/ada/expect.c
+++ b/gcc/ada/expect.c
@@ -42,17 +42,13 @@
 #include "adaint.h"
 #include 
 
-#ifdef __MINGW32__
-# if OLD_MINGW
-#  include 
-# endif
-#elif defined (__vxworks) && defined (__RTP__)
+#if defined (__vxworks) && defined (__RTP__)
 # include 
 #elif defined (__Lynx__)
   /* ??? See comment in adaint.c.  */
 # define GCC_RESOURCE_H
 # include 
-#elif defined (__PikeOS__)
+#elif defined (__PikeOS__) || defined (__MINGW32__)
   /* No wait.h available */
 #else
 #include 
diff --git a/gcc/ada/mingw32.h b/gcc/ada/mingw32.h
index 1157fc68018..bf8577bb1d4 100644
--- a/gcc/ada/mingw32.h
+++ b/gcc/ada/mingw32.h
@@ -101,23 +101,6 @@ extern UINT __gnat_current_ccs_encoding;
 
 #include 
 
-/* STD_MINGW: standard if MINGW32 version > 1.3, we have switched to this
-   version instead of the previous enhanced version to ease building GNAT on
-   Windows platforms. By using STD_MINGW or OLD_MINGW it is possible to build
-   GNAT using both MingW include files (Old MingW + ACT changes and standard
-   MingW starting with version 1.3.
-   For w64 Mingw the define STD_MINGW is always set to value 1, because
-   there is no old header set present.  */
-#ifdef _WIN64
-#define STD_MINGW 1
-#else
-#define STD_MINGW ((__MINGW32_MAJOR_VERSION == 1 \
-  && __MINGW32_MINOR_VERSION >= 3) \
- || (__MINGW32_MAJOR_VERSION >= 2))
-#endif
-
-#define OLD_MINGW (!(STD_MINGW))
-
 #ifndef MAXPATHLEN
 #define MAXPATHLEN MAX_PATH
 #endif
diff --git a/gcc/ada/sysdep.c b/gcc/ada/sysdep.c
index 5e9cf709082..7bdfcbc047c 100644
--- a/gcc/ada/sysdep.c
+++ b/gcc/ada/sysdep.c
@@ -323,11 +323,7 @@ __gnat_ttyname (int filedes ATTRIBUTE_UNUSED)
   || defined (__QNX__)
 
 # ifdef __MINGW32__
-#  if OLD_MINGW
-#   include 
-#  else
-#   include   /* for getch(), kbhit() */
-#  endif
+#  include   /* for getch(), kbhit() */
 # else
 #  include 
 # endif
-- 
2.25.1



[PATCH] aarch64: Add -march support for Armv9.1-A, Armv9.2-A, Armv9.3-A

2022-09-26 Thread Kyrylo Tkachov via Gcc-patches
Hi all,

This is a straightforward patch that allows targeting the architecture
revisions mentioned in the subject through -march. These are already
supported in binutils.

Bootstrapped and tested on aarch64-none-linux-gnu.
Pushing to trunk.
Thanks,
Kyrill

gcc/ChangeLog:

* config/aarch64/aarch64-arches.def (armv9.1-a): Define.
(armv9.2-a): Likewise.
(armv9.3-a): Likewise.
* config/aarch64/aarch64.h (AARCH64_FL_V9_1): Likewise.
(AARCH64_FL_V9_2): Likewise.
(AARCH64_FL_V9_3): Likewise.
(AARCH64_FL_FOR_ARCH9_1): Likewise.
(AARCH64_FL_FOR_ARCH9_2): Likewise.
(AARCH64_FL_FOR_ARCH9_3): Likewise.
(AARCH64_ISA_V9_1): Likewise.
(AARCH64_ISA_V9_2): Likewise.
(AARCH64_ISA_V9_3): Likewise.
* doc/invoke.texi (AArch64 Options): Document armv9.1-a, armv9.2-a,
armv9.3-a values to -march.


v9-x.patch
Description: v9-x.patch


[COMMITTED] ada: Improve accessibility check generation

2022-09-26 Thread Marc Poulhiès via Gcc-patches
From: Justin Squirek 

Improve accessibility check generation by more precisely identifying cases in
which an Original_Node call is needed.

Instead of grabbing the Original_Node of a prefix in all cases (since this
can cause issues where unanalyzed instance names get referenced) we only
obtain the original node when said prefix comes as a result of expanding
function calls.

gcc/ada/

* sem_util.adb
(Accessibility_Level): Modify indexed and selected components case
by reducing the scope where Original_Node gets used.
---
 gcc/ada/sem_util.adb | 16 +---
 1 file changed, 13 insertions(+), 3 deletions(-)

diff --git a/gcc/ada/sem_util.adb b/gcc/ada/sem_util.adb
index b0babeb9d6f..c43a008ae5d 100644
--- a/gcc/ada/sem_util.adb
+++ b/gcc/ada/sem_util.adb
@@ -531,7 +531,7 @@ package body Sem_Util is
 
   --  Local variables
 
-  E   : Entity_Id := Original_Node (Expr);
+  E   : Node_Id := Original_Node (Expr);
   Pre : Node_Id;
 
--  Start of processing for Accessibility_Level
@@ -777,8 +777,18 @@ package body Sem_Util is
 
  --  We don't handle function calls in prefix notation correctly ???
 
- when N_Indexed_Component | N_Selected_Component =>
-Pre := Original_Node (Prefix (E));
+ when N_Indexed_Component | N_Selected_Component | N_Slice =>
+Pre := Prefix (E);
+
+--  Fetch the original node when the prefix comes from the result
+--  of expanding a function call since we want to find the level
+--  of the original source call.
+
+if not Comes_From_Source (Pre)
+  and then Nkind (Original_Node (Pre)) = N_Function_Call
+then
+   Pre := Original_Node (Pre);
+end if;
 
 --  When E is an indexed component or selected component and
 --  the current Expr is a function call, we know that we are
-- 
2.25.1



[COMMITTED] ada: Remove socket definitions for ancient MinGW

2022-09-26 Thread Marc Poulhiès via Gcc-patches
From: Piotr Trojanek 

Modern MinGW defines _WIN32_WINNT as 0xa00, so there is no need to guard
against it being lower than 0x0600 or to set it to 0x0501.

gcc/ada/

* gsocket.h: Remove redefinition of _WIN32_WINNT.
* mingw32.h: Remove conditional definition of _WIN32_WINNT.

Tested on x86_64-pc-linux-gnu, committed on master.

---
 gcc/ada/gsocket.h | 6 --
 gcc/ada/mingw32.h | 5 -
 2 files changed, 11 deletions(-)

diff --git a/gcc/ada/gsocket.h b/gcc/ada/gsocket.h
index e7284a1ef4e..561f2ffb566 100644
--- a/gcc/ada/gsocket.h
+++ b/gcc/ada/gsocket.h
@@ -80,12 +80,6 @@
 #define FD_SETSIZE 1024
 
 #ifdef __MINGW32__
-/* winsock2.h allows WSAPoll related definitions only when
- * _WIN32_WINNT >= 0x0600 */
-#if !defined(_WIN32_WINNT) || _WIN32_WINNT < 0x0600
-#define _WIN32_WINNT 0x0600
-#endif
-
 #include 
 #include 
 #include 
diff --git a/gcc/ada/mingw32.h b/gcc/ada/mingw32.h
index a190d51076f..d038211a1dc 100644
--- a/gcc/ada/mingw32.h
+++ b/gcc/ada/mingw32.h
@@ -44,11 +44,6 @@
 #define UNICODE  /* For Win32 API */
 #endif
 
-/* We need functionality available only starting with Windows XP */
-#ifndef _WIN32_WINNT
-#define _WIN32_WINNT 0x0501
-#endif
-
 #ifndef __CYGWIN__
 #include 
 #endif
-- 
2.25.1



[COMMITTED] ada: Delay expansion of iterated component association

2022-09-26 Thread Marc Poulhiès via Gcc-patches
From: Piotr Trojanek 

When preanalysing a spec expression (e.g. the expression of an expression
function), the name of an iterator specification within an iterated
component association should not be expanded, especially in GNATprove
mode.

gcc/ada/

* sem_ch5.adb (Analyze_Iterator_Specification): Delay expansion for
iterated component association, just as is done within a
quantified expression.
---
 gcc/ada/sem_ch5.adb | 9 +
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/gcc/ada/sem_ch5.adb b/gcc/ada/sem_ch5.adb
index 17bf6d91b44..6d07f3d09e5 100644
--- a/gcc/ada/sem_ch5.adb
+++ b/gcc/ada/sem_ch5.adb
@@ -2429,11 +2429,12 @@ package body Sem_Ch5 is
 
   if not Is_Entity_Name (Iter_Name)
 
---  When the context is a quantified expression, the renaming
---  declaration is delayed until the expansion phase if we are
---  doing expansion.
+--  When the context is a quantified expression or iterated component
+--  association, the renaming declaration is delayed until the
+--  expansion phase if we are doing expansion.
 
-and then (Nkind (Parent (N)) /= N_Quantified_Expression
+and then (Nkind (Parent (N)) not in N_Quantified_Expression
+  | N_Iterated_Component_Association
or else (Operating_Mode = Check_Semantics
 and then not GNATprove_Mode))
 
-- 
2.25.1



[COMMITTED] ada: Only reject volatile ghost objects when SPARK_Mode is On

2022-09-26 Thread Marc Poulhiès via Gcc-patches
From: Piotr Trojanek 

The SPARK rule that forbids ghost volatile objects only affects proof,
not the generation of object code. It is now applied only where SPARK_Mode
is On. This flexibility is needed to compile code automatically instrumented
by GNATcoverage.

gcc/ada/

* contracts.adb (Analyze_Object_Contract): Check SPARK_Mode before
applying SPARK rule.
---
 gcc/ada/contracts.adb | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/ada/contracts.adb b/gcc/ada/contracts.adb
index 34db67a8cab..dd573d374c6 100644
--- a/gcc/ada/contracts.adb
+++ b/gcc/ada/contracts.adb
@@ -1207,7 +1207,7 @@ package body Contracts is
  --  A Ghost object cannot be effectively volatile (SPARK RM 6.9(7) and
  --  SPARK RM 6.9(19)).
 
- elsif Is_Effectively_Volatile (Obj_Id) then
+ elsif SPARK_Mode = On and then Is_Effectively_Volatile (Obj_Id) then
 Error_Msg_N ("ghost object & cannot be volatile", Obj_Id);
 
  --  A Ghost object cannot be imported or exported (SPARK RM 6.9(7)).
-- 
2.25.1



[COMMITTED] ada: Document support for the mold linker

2022-09-26 Thread Marc Poulhiès via Gcc-patches
From: Kévin Le Gouguec 

gcc/ada/

* doc/gnat_ugn/building_executable_programs_with_gnat.rst
(Linker Switches): Document support for mold along with gold; add some
advice regarding OpenSSL in the Pro version.
* gnat_ugn.texi: Regenerate.
---
 ...building_executable_programs_with_gnat.rst | 28 +--
 gcc/ada/gnat_ugn.texi | 10 +++
 2 files changed, 30 insertions(+), 8 deletions(-)

diff --git a/gcc/ada/doc/gnat_ugn/building_executable_programs_with_gnat.rst 
b/gcc/ada/doc/gnat_ugn/building_executable_programs_with_gnat.rst
index 6a478095cfc..f675732aae2 100644
--- a/gcc/ada/doc/gnat_ugn/building_executable_programs_with_gnat.rst
+++ b/gcc/ada/doc/gnat_ugn/building_executable_programs_with_gnat.rst
@@ -6229,11 +6229,33 @@ Linker switches can be specified after :switch:`-largs` 
builder switch.
 .. index:: -fuse-ld=name
 
 :switch:`-fuse-ld={name}`
-  Linker to be used. The default is ``bfd`` for :file:`ld.bfd`,
-  the alternative being ``gold`` for :file:`ld.gold`. The later is
-  a more recent and faster linker, but only available on GNU/Linux
+  Linker to be used. The default is ``bfd`` for :file:`ld.bfd`; ``gold``
+  (for :file:`ld.gold`) and ``mold`` (for :file:`ld.mold`) are more
+  recent and faster alternatives, but only available on GNU/Linux
   platforms.
 
+  .. only:: PRO
+
+The GNAT distribution for native Linux platforms includes ``mold``,
+compiled against OpenSSL version 1.1; however, the distribution does
+not include OpenSSL.  In order to use this linker, you may either:
+
+* use your system's OpenSSL library, if the version matches: in this
+  situation, you need not do anything beside using the
+  :switch:`-fuse-ld=mold` switch,
+
+* obtain a source distribution for OpenSSL 1.1, compile the
+  :file:`libcrypto.so` library and install it in the directory of
+  your choice, then include this directory in the
+  :envvar:`LD_LIBRARY_PATH` environment variable,
+
+* install another copy of ``mold`` by other means in the directory
+  of your choice, and include this directory in the :envvar:`PATH`
+  environment variable; you may find this alternative preferable if
+  the copy of ``mold`` included in GNAT does not suit your needs
+  (e.g. being able to link against your system's OpenSSL, or using
+  another version of ``mold``).
+
 .. _Binding_with_gnatbind:
 
 Binding with ``gnatbind``
diff --git a/gcc/ada/gnat_ugn.texi b/gcc/ada/gnat_ugn.texi
index f2cb1ed638a..d7bcf74e278 100644
--- a/gcc/ada/gnat_ugn.texi
+++ b/gcc/ada/gnat_ugn.texi
@@ -19,7 +19,7 @@
 
 @copying
 @quotation
-GNAT User's Guide for Native Platforms , Sep 09, 2022
+GNAT User's Guide for Native Platforms , Sep 26, 2022
 
 AdaCore
 
@@ -15317,10 +15317,11 @@ Linker switches can be specified after @code{-largs} 
builder switch.
 
 @item @code{-fuse-ld=`name'}
 
-Linker to be used. The default is @code{bfd} for @code{ld.bfd},
-the alternative being @code{gold} for @code{ld.gold}. The later is
-a more recent and faster linker, but only available on GNU/Linux
+Linker to be used. The default is @code{bfd} for @code{ld.bfd}; @code{gold}
+(for @code{ld.gold}) and @code{mold} (for @code{ld.mold}) are more
+recent and faster alternatives, but only available on GNU/Linux
 platforms.
+
 @end table
 
 @node Binding with gnatbind,Linking with gnatlink,Linker Switches,Building 
Executable Programs with GNAT
@@ -17932,7 +17933,6 @@ instr.ads
 
 
 
-
 @c -- Example: A |withing| unit has a |with| clause, it |withs| a |withed| unit
 
 @node GNAT and Program Execution,Platform-Specific Information,GNAT Utility 
Programs,Top
-- 
2.25.1



[COMMITTED] ada: Document Long_Long_Long_Size parameter for -gnateT

2022-09-26 Thread Marc Poulhiès via Gcc-patches
From: Eric Botcazou 

This was overlooked when the new parameter was created.

gcc/ada/

* doc/gnat_ugn/building_executable_programs_with_gnat.rst
(-gnateT): Document new parameter Long_Long_Long_Size.
* gnat_ugn.texi: Regenerate.
---
 .../doc/gnat_ugn/building_executable_programs_with_gnat.rst   | 2 ++
 gcc/ada/gnat_ugn.texi | 4 +++-
 2 files changed, 5 insertions(+), 1 deletion(-)

diff --git a/gcc/ada/doc/gnat_ugn/building_executable_programs_with_gnat.rst 
b/gcc/ada/doc/gnat_ugn/building_executable_programs_with_gnat.rst
index f675732aae2..d4bddffac60 100644
--- a/gcc/ada/doc/gnat_ugn/building_executable_programs_with_gnat.rst
+++ b/gcc/ada/doc/gnat_ugn/building_executable_programs_with_gnat.rst
@@ -1719,6 +1719,7 @@ Alphabetical List of All Switches
 Float_Words_BE : Nat; -- Float words stored big-endian?
 Int_Size   : Pos; -- Standard.Integer'Size
 Long_Double_Size   : Pos; -- Standard.Long_Long_Float'Size
+Long_Long_Long_Size: Pos; -- Standard.Long_Long_Long_Integer'Size
 Long_Long_Size : Pos; -- Standard.Long_Long_Integer'Size
 Long_Size  : Pos; -- Standard.Long_Integer'Size
 Maximum_Alignment  : Pos; -- Maximum permitted alignment
@@ -1816,6 +1817,7 @@ Alphabetical List of All Switches
 Float_Words_BE0
 Int_Size 64
 Long_Double_Size128
+Long_Long_Long_Size 128
 Long_Long_Size   64
 Long_Size64
 Maximum_Alignment16
diff --git a/gcc/ada/gnat_ugn.texi b/gcc/ada/gnat_ugn.texi
index d7bcf74e278..77d239f797c 100644
--- a/gcc/ada/gnat_ugn.texi
+++ b/gcc/ada/gnat_ugn.texi
@@ -9220,6 +9220,7 @@ Float_Size : Pos; -- Standard.Float'Size
 Float_Words_BE : Nat; -- Float words stored big-endian?
 Int_Size   : Pos; -- Standard.Integer'Size
 Long_Double_Size   : Pos; -- Standard.Long_Long_Float'Size
+Long_Long_Long_Size: Pos; -- Standard.Long_Long_Long_Integer'Size
 Long_Long_Size : Pos; -- Standard.Long_Long_Integer'Size
 Long_Size  : Pos; -- Standard.Long_Integer'Size
 Maximum_Alignment  : Pos; -- Maximum permitted alignment
@@ -9307,6 +9308,7 @@ Float_Size   32
 Float_Words_BE0
 Int_Size 64
 Long_Double_Size128
+Long_Long_Long_Size 128
 Long_Long_Size   64
 Long_Size64
 Maximum_Alignment16
@@ -29317,8 +29319,8 @@ to permit their use in free software.
 
 @printindex ge
 
-@anchor{gnat_ugn/gnat_utility_programs switches-related-to-project-files}@w{   
   }
 @anchor{cf}@w{  }
+@anchor{gnat_ugn/gnat_utility_programs switches-related-to-project-files}@w{   
   }
 
 @c %**end of body
 @bye
-- 
2.25.1



[COMMITTED] ada: Delay expansion of iterator specification in preanalysis

2022-09-26 Thread Marc Poulhiès via Gcc-patches
From: Piotr Trojanek 

When preanalysing a spec expression (e.g. the expression of an expression
function), the name of an iterator specification should not be expanded.

This patch simplifies a complicated condition for delaying expansion
within quantified expressions and iterated component associations.

gcc/ada/

* sem_ch5.adb (Analyze_Iterator_Specification): Delay expansion
based on Full_Analysis flag.
---
 gcc/ada/sem_ch5.adb | 11 +++
 1 file changed, 3 insertions(+), 8 deletions(-)

diff --git a/gcc/ada/sem_ch5.adb b/gcc/ada/sem_ch5.adb
index 6d07f3d09e5..d0f00b31161 100644
--- a/gcc/ada/sem_ch5.adb
+++ b/gcc/ada/sem_ch5.adb
@@ -2429,14 +2429,9 @@ package body Sem_Ch5 is
 
   if not Is_Entity_Name (Iter_Name)
 
---  When the context is a quantified expression or iterated component
---  association, the renaming declaration is delayed until the
---  expansion phase if we are doing expansion.
-
-and then (Nkind (Parent (N)) not in N_Quantified_Expression
-  | N_Iterated_Component_Association
-   or else (Operating_Mode = Check_Semantics
-and then not GNATprove_Mode))
+--  Do not perform this expansion in preanalysis
+
+and then Full_Analysis
 
 --  Do not perform this expansion when expansion is disabled, where the
 --  temporary may hide the transformation of a selected component into
-- 
2.25.1



[COMMITTED] ada: Improve CUDA host-side and device-side binder support

2022-09-26 Thread Marc Poulhiès via Gcc-patches
From: Steve Baird 

Binder-generated code is not allowed to use Ada2012 syntax. In order to
specify an aspect, a pragma must be used.

gcc/ada/

* bindgen.adb: When the binder is invoked for the device, specify
the CUDA_Global aspect for the adainit and adafinal procedures via
a pragma instead of via an aspect_specification.
---
 gcc/ada/bindgen.adb | 9 -
 1 file changed, 4 insertions(+), 5 deletions(-)

diff --git a/gcc/ada/bindgen.adb b/gcc/ada/bindgen.adb
index b2fa44d2dff..f2aaa2dea92 100644
--- a/gcc/ada/bindgen.adb
+++ b/gcc/ada/bindgen.adb
@@ -134,9 +134,6 @@ package body Bindgen is
--  Text for aspect specifications (if any) given as part of the
--  Adainit and Adafinal spec declarations.
 
-   function Aspect_Text return String is
- (if Enable_CUDA_Device_Expansion then " with CUDA_Global" else "");
-
--
-- Interface_State Pragma Table --
--
@@ -2644,10 +2641,11 @@ package body Bindgen is
   end if;
 
   WBI ("");
-  WBI ("   procedure " & Ada_Init_Name.all & Aspect_Text & ";");
+  WBI ("   procedure " & Ada_Init_Name.all & ";");
   if Enable_CUDA_Device_Expansion then
  WBI ("   pragma Export (C, " & Ada_Init_Name.all &
 ", Link_Name => """ & Device_Ada_Init_Link_Name & """);");
+ WBI ("   pragma CUDA_Global (" & Ada_Init_Name.all & ");");
   else
  WBI ("   pragma Export (C, " & Ada_Init_Name.all & ", """ &
   Ada_Init_Name.all & """);");
@@ -2662,11 +2660,12 @@ package body Bindgen is
 
   if not Cumulative_Restrictions.Set (No_Finalization) then
  WBI ("");
- WBI ("   procedure " & Ada_Final_Name.all & Aspect_Text & ";");
+ WBI ("   procedure " & Ada_Final_Name.all & ";");
 
  if Enable_CUDA_Device_Expansion then
 WBI ("   pragma Export (C, " & Ada_Final_Name.all &
", Link_Name => """ & Device_Ada_Final_Link_Name & """);");
+WBI ("   pragma CUDA_Global (" & Ada_Final_Name.all & ");");
  else
 WBI ("   pragma Export (C, " & Ada_Final_Name.all & ", """ &
  Ada_Final_Name.all & """);");
-- 
2.25.1



[COMMITTED] ada: Make Original_Aspect_Pragma_Name more precise

2022-09-26 Thread Marc Poulhiès via Gcc-patches
From: Tucker Taft 

This commit makes Original_Aspect_Pragma_Name more precise in cases
where there is a second level of indirection caused by pragmas being
turned into Check pragmas.

gcc/ada/

* sem_util.adb (Original_Aspect_Pragma_Name): Check for Check
pragmas.
---
 gcc/ada/sem_util.adb | 8 
 1 file changed, 8 insertions(+)

diff --git a/gcc/ada/sem_util.adb b/gcc/ada/sem_util.adb
index c43a008ae5d..9ae082ca2e1 100644
--- a/gcc/ada/sem_util.adb
+++ b/gcc/ada/sem_util.adb
@@ -26559,6 +26559,14 @@ package body Sem_Util is
  Item_Nam :=
   Chars (Original_Node (Pragma_Identifier (Original_Node (Item))));
 
+ if Item_Nam = Name_Check then
+--  Pragma "Check" preserves the original pragma name as its first
+--  argument.
+Item_Nam :=
+  Chars (Expression (First (Pragma_Argument_Associations
+(Original_Node (Item))))));
+ end if;
+
   else
  pragma Assert (Nkind (Item) = N_Aspect_Specification);
  Item_Nam := Chars (Identifier (Item));
-- 
2.25.1



[COMMITTED] ada: Remove unreferenced Rtsfind entries

2022-09-26 Thread Marc Poulhiès via Gcc-patches
From: Piotr Trojanek 

Remove unreferenced entries for finding runtime units and runtime
entities by the compiler. Code cleanup using basic grep scripting.

gcc/ada/

* rtsfind.ads
(RTU_Id): Remove unreferenced packages; fix whitespace.
(RE_Id): Remove unreferenced entities; add comment about entity
that is only used by GNATprove and not by GNAT.
---
 gcc/ada/rtsfind.ads | 111 +---
 1 file changed, 2 insertions(+), 109 deletions(-)

diff --git a/gcc/ada/rtsfind.ads b/gcc/ada/rtsfind.ads
index 65c64090371..24aca2cf6b6 100644
--- a/gcc/ada/rtsfind.ads
+++ b/gcc/ada/rtsfind.ads
@@ -189,7 +189,6 @@ package Rtsfind is
   --  Children of Interfaces
 
   Interfaces_C,
-  Interfaces_Packed_Decimal,
 
   --  Children of Interfaces.C
 
@@ -205,7 +204,6 @@ package Rtsfind is
   System_Address_To_Access_Conversions,
   System_Arith_64,
   System_Arith_128,
-  System_AST_Handling,
   System_Assertions,
   System_Atomic_Operations,
   System_Atomic_Primitives,
@@ -257,9 +255,6 @@ package Rtsfind is
   System_Fat_LFlt,
   System_Fat_LLF,
   System_Fat_SFlt,
-  System_Fat_VAX_D_Float,
-  System_Fat_VAX_F_Float,
-  System_Fat_VAX_G_Float,
   System_Finalization_Masters,
   System_Finalization_Root,
   System_Fore_Decimal_32,
@@ -288,14 +283,12 @@ package Rtsfind is
   System_Img_LLLI,
   System_Img_LLU,
   System_Img_LLLU,
-  System_Img_Name,
   System_Img_Uns,
   System_Img_WChar,
   System_Interrupts,
   System_Long_Long_Float_Expon,
   System_Machine_Code,
   System_Mantissa,
-  System_Memcop,
   System_Memory,
   System_Multiprocessors,
   System_Pack_03,
@@ -420,10 +413,7 @@ package Rtsfind is
   System_Pack_127,
   System_Parameters,
   System_Partition_Interface,
-  System_Pool_32_Global,
   System_Pool_Global,
-  System_Pool_Empty,
-  System_Pool_Local,
   System_Pool_Size,
   System_Put_Images,
   System_Put_Task_Images,
@@ -440,7 +430,6 @@ package Rtsfind is
   System_Stream_Attributes,
   System_Task_Info,
   System_Tasking,
-  System_Threads,
   System_Unsigned_Types,
   System_Val_Bool,
   System_Val_Char,
@@ -461,7 +450,6 @@ package Rtsfind is
   System_Val_LLLI,
   System_Val_LLU,
   System_Val_LLLU,
-  System_Val_Name,
   System_Val_Uns,
   System_Val_WChar,
   System_Version_Control,
@@ -475,7 +463,6 @@ package Rtsfind is
   System_Wid_LLLI,
   System_Wid_LLU,
   System_Wid_LLLU,
-  System_Wid_Name,
   System_Wid_Uns,
   System_Wid_WChar,
   System_WWd_Char,
@@ -484,7 +471,7 @@ package Rtsfind is
 
   --  Children of System.Atomic_Operations
 
-   System_Atomic_Operations_Test_And_Set,
+  System_Atomic_Operations_Test_And_Set,
 
   --  Children of System.Dim
 
@@ -561,17 +548,13 @@ package Rtsfind is
 
  RE_Set_Deadline,-- Ada.Dispatching.EDF
 
- RE_Code_Loc,-- Ada.Exceptions
  RE_Exception_Id,-- Ada.Exceptions
- RE_Exception_Identity,  -- Ada.Exceptions
  RE_Exception_Information,   -- Ada.Exceptions
  RE_Exception_Message,   -- Ada.Exceptions
  RE_Exception_Name_Simple,   -- Ada.Exceptions
  RE_Exception_Occurrence,-- Ada.Exceptions
- RE_Exception_Occurrence_Access, -- Ada.Exceptions
  RE_Null_Id, -- Ada.Exceptions
  RE_Null_Occurrence, -- Ada.Exceptions
- RE_Poll,-- Ada.Exceptions
  RE_Raise_Exception, -- Ada.Exceptions
  RE_Raise_Exception_Always,  -- Ada.Exceptions
  RE_Raise_From_Controlled_Operation, -- Ada.Exceptions
@@ -596,7 +579,7 @@ package Rtsfind is
  RE_Names,   -- Ada.Interrupts.Names
 
  RE_Clock,   -- Ada.Real_Time
- RE_Clock_Time,  -- Ada.Real_Time
+ RE_Clock_Time,  -- Ada.Real_Time [used by GNATprove]
  RE_Time_Span,   -- Ada.Real_Time
  RE_Time_Span_Zero,  -- Ada.Real_Time
  RO_RT_Time, -- Ada.Real_Time
@@ -612,8 +595,6 @@ package Rtsfind is
  RE_Stream_Element_Array,-- Ada.Streams
  RE_Stream_Element_Offset,   -- Ada.Streams
 
- RE_Stream_Access,   -- Ada.Streams.Stream_IO
-
  RO_SU_Super_String, -- Ada.Strings.Superbounded
 
  RO_WI_Super_String, -- Ada.Strings.Wide_Superbounded
@@ -628,8 +609,6 @@ package Rtsfind is
 
  RE_Buffer_Type, -- Ada.Strings.Text_Buffers.Unbounded
  RE_Get, -- Ada.Strings.Text_Buffers.Unbounded
- RE_Wide_Get,   

[COMMITTED] ada: Remove unreferenced C macro from OS constants template

2022-09-26 Thread Marc Poulhiès via Gcc-patches
From: Piotr Trojanek 

The STR/STR1 macros in the OS constants template have been unreferenced
since 2005, so we can safely remove them.
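
For readers unfamiliar with the idiom being removed, this is a sketch
of what such a macro pair does: STR1 stringizes its argument as
written, while STR expands the argument first and then stringizes the
result.  ANSWER and the three functions are hypothetical names used
only for illustration; they do not appear in s-oscons-tmplt.c.

```c
#include <assert.h>
#include <string.h>

/* The classic two-level stringification pair.  */
#define STR(x) STR1(x)
#define STR1(x) #x

/* Hypothetical macro used only to show the difference.  */
#define ANSWER 42

/* One level: the macro name itself is turned into a string.  */
const char *
stringified_name (void)
{
  return STR1 (ANSWER);
}

/* Two levels: ANSWER is expanded before being turned into a string.  */
const char *
stringified_value (void)
{
  return STR (ANSWER);
}

/* Self-check of the two behaviours described above.  */
void
check_stringify (void)
{
  assert (strcmp (stringified_name (), "ANSWER") == 0);
  assert (strcmp (stringified_value (), "42") == 0);
}
```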

gcc/ada/

* s-oscons-tmplt.c (STR, STR1): Remove.
---
 gcc/ada/s-oscons-tmplt.c | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/gcc/ada/s-oscons-tmplt.c b/gcc/ada/s-oscons-tmplt.c
index af6919092d5..53941226771 100644
--- a/gcc/ada/s-oscons-tmplt.c
+++ b/gcc/ada/s-oscons-tmplt.c
@@ -237,9 +237,6 @@ int counter = 0;
 #define CST(name,comment) C(#name,String,name,comment)
 /* String constant */
 
-#define STR(x) STR1(x)
-#define STR1(x) #x
-
 #ifdef __MINGW32__
 unsigned int _CRT_fmode = _O_BINARY;
 #endif
-- 
2.25.1



[COMMITTED] ada: Doc: rename Valid_Image to Valid_Value

2022-09-26 Thread Marc Poulhiès via Gcc-patches
From: Ghjuvan Lacambre 

This renaming happened some time ago in the code, but the documentation
was not updated.

gcc/ada/

* doc/gnat_rm/implementation_defined_attributes.rst: Rename Valid_Image.
* gnat_rm.texi: Regenerate.
* gnat_ugn.texi: Regenerate.
---
 .../implementation_defined_attributes.rst |  8 +++
 gcc/ada/gnat_rm.texi  | 22 +--
 gcc/ada/gnat_ugn.texi |  2 +-
 3 files changed, 16 insertions(+), 16 deletions(-)

diff --git a/gcc/ada/doc/gnat_rm/implementation_defined_attributes.rst 
b/gcc/ada/doc/gnat_rm/implementation_defined_attributes.rst
index c25e3d44158..d839b1fd2e7 100644
--- a/gcc/ada/doc/gnat_rm/implementation_defined_attributes.rst
+++ b/gcc/ada/doc/gnat_rm/implementation_defined_attributes.rst
@@ -1623,13 +1623,13 @@ Multi-dimensional arrays can be modified, as shown by 
this example:
 
 which changes element (1,2) to 20 and (3,4) to 30.
 
-Attribute Valid_Image
+Attribute Valid_Value
 ===
-.. index:: Valid_Image
+.. index:: Valid_Value
 
-The ``'Valid_Image`` attribute is defined for enumeration types other than
+The ``'Valid_Value`` attribute is defined for enumeration types other than
 those in package Standard. This attribute is a function that takes
-a String, and returns Boolean. ``T'Valid_Image (S)`` returns True
+a String, and returns Boolean. ``T'Valid_Value (S)`` returns True
 if and only if ``T'Value (S)`` would not raise Constraint_Error.
 
 Attribute Valid_Scalars
diff --git a/gcc/ada/gnat_rm.texi b/gcc/ada/gnat_rm.texi
index cdf8605f118..64f2e796d8a 100644
--- a/gcc/ada/gnat_rm.texi
+++ b/gcc/ada/gnat_rm.texi
@@ -19,7 +19,7 @@
 
 @copying
 @quotation
-GNAT Reference Manual , Sep 09, 2022
+GNAT Reference Manual , Sep 23, 2022
 
 AdaCore
 
@@ -433,7 +433,7 @@ Implementation Defined Attributes
 * Attribute Universal_Literal_String:: 
 * Attribute Unrestricted_Access:: 
 * Attribute Update:: 
-* Attribute Valid_Image:: 
+* Attribute Valid_Value:: 
 * Attribute Valid_Scalars:: 
 * Attribute VADS_Size:: 
 * Attribute Value_Size:: 
@@ -10295,7 +10295,7 @@ consideration, you should minimize the use of these 
attributes.
 * Attribute Universal_Literal_String:: 
 * Attribute Unrestricted_Access:: 
 * Attribute Update:: 
-* Attribute Valid_Image:: 
+* Attribute Valid_Value:: 
 * Attribute Valid_Scalars:: 
 * Attribute VADS_Size:: 
 * Attribute Value_Size:: 
@@ -12040,7 +12040,7 @@ In general this is a risky approach. It may appear to 
“work” but such uses o
 @code{Unrestricted_Access} are potentially non-portable, even from one version
 of GNAT to another, so are best avoided if possible.
 
-@node Attribute Update,Attribute Valid_Image,Attribute 
Unrestricted_Access,Implementation Defined Attributes
+@node Attribute Update,Attribute Valid_Value,Attribute 
Unrestricted_Access,Implementation Defined Attributes
 @anchor{gnat_rm/implementation_defined_attributes attribute-update}@anchor{1ac}
 @section Attribute Update
 
@@ -12121,19 +12121,19 @@ A := A'Update ((1, 2) => 20, (3, 4) => 30);
 
 which changes element (1,2) to 20 and (3,4) to 30.
 
-@node Attribute Valid_Image,Attribute Valid_Scalars,Attribute 
Update,Implementation Defined Attributes
-@anchor{gnat_rm/implementation_defined_attributes 
attribute-valid-image}@anchor{1ad}
-@section Attribute Valid_Image
+@node Attribute Valid_Value,Attribute Valid_Scalars,Attribute 
Update,Implementation Defined Attributes
+@anchor{gnat_rm/implementation_defined_attributes 
attribute-valid-value}@anchor{1ad}
+@section Attribute Valid_Value
 
 
-@geindex Valid_Image
+@geindex Valid_Value
 
-The @code{'Valid_Image} attribute is defined for enumeration types other than
+The @code{'Valid_Value} attribute is defined for enumeration types other than
 those in package Standard. This attribute is a function that takes
-a String, and returns Boolean. @code{T'Valid_Image (S)} returns True
+a String, and returns Boolean. @code{T'Valid_Value (S)} returns True
 if and only if @code{T'Value (S)} would not raise Constraint_Error.
 
-@node Attribute Valid_Scalars,Attribute VADS_Size,Attribute 
Valid_Image,Implementation Defined Attributes
+@node Attribute Valid_Scalars,Attribute VADS_Size,Attribute 
Valid_Value,Implementation Defined Attributes
 @anchor{gnat_rm/implementation_defined_attributes 
attribute-valid-scalars}@anchor{1ae}
 @section Attribute Valid_Scalars
 
diff --git a/gcc/ada/gnat_ugn.texi b/gcc/ada/gnat_ugn.texi
index 77d239f797c..7d96dbe6aa1 100644
--- a/gcc/ada/gnat_ugn.texi
+++ b/gcc/ada/gnat_ugn.texi
@@ -29319,8 +29319,8 @@ to permit their use in free software.
 
 @printindex ge
 
-@anchor{cf}@w{  }
 @anchor{gnat_ugn/gnat_utility_programs switches-related-to-project-files}@w{   
   }
+@anchor{cf}@w{  }
 
 @c %**end of body
 @bye
-- 
2.25.1



[COMMITTED] ada: Fix location of pragmas coming from aspects in top-level instances

2022-09-26 Thread Marc Poulhiès via Gcc-patches
From: Piotr Trojanek 

This patch fixes an AST anomaly where pragmas that correspond to aspects
of a generic package declaration appeared as the auxiliary declarations
of the compilation unit for the instantiated package body.

In particular, this anomaly happened for aspect Annotate and affected
GNATprove, which didn't pick up the pragma corresponding to this aspect.

gcc/ada/

* sem_ch12.adb (Build_Instance_Compilation_Unit_Nodes): Relocate
auxiliary declarations from the original compilation unit to the
newly created compilation unit for the spec.
---
 gcc/ada/sem_ch12.adb | 7 +--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/gcc/ada/sem_ch12.adb b/gcc/ada/sem_ch12.adb
index 9525140b45b..ab2e1825679 100644
--- a/gcc/ada/sem_ch12.adb
+++ b/gcc/ada/sem_ch12.adb
@@ -6296,13 +6296,16 @@ package body Sem_Ch12 is
   Old_Main   : constant Entity_Id := Cunit_Entity (Main_Unit);
 
begin
-  --  A new compilation unit node is built for the instance declaration
+  --  A new compilation unit node is built for the instance declaration.
+  --  It relocates the auxiliary declaration node from the compilation unit
+  --  where the instance appeared, so that declarations that originally
+  --  followed the instance will be attached to the spec compilation unit.
 
   Decl_Cunit :=
 Make_Compilation_Unit (Sloc (N),
   Context_Items  => Empty_List,
   Unit   => Act_Decl,
-  Aux_Decls_Node => Make_Compilation_Unit_Aux (Sloc (N)));
+  Aux_Decls_Node => Relocate_Node (Aux_Decls_Node (Parent (N))));
 
   Set_Parent_Spec (Act_Decl, Parent_Spec (N));
 
-- 
2.25.1



[COMMITTED] ada: Remove GNATmetric's documentation from GNAT's documentation

2022-09-26 Thread Marc Poulhiès via Gcc-patches
From: Boris Yakobowski 

gcc/ada/

* doc/gnat_ugn/gnat_utility_programs.rst: Remove documentation for
gnatmetric.
---
 .../doc/gnat_ugn/gnat_utility_programs.rst| 1120 +
 1 file changed, 1 insertion(+), 1119 deletions(-)

diff --git a/gcc/ada/doc/gnat_ugn/gnat_utility_programs.rst 
b/gcc/ada/doc/gnat_ugn/gnat_utility_programs.rst
index d67083979cc..92877a2d172 100644
--- a/gcc/ada/doc/gnat_ugn/gnat_utility_programs.rst
+++ b/gcc/ada/doc/gnat_ugn/gnat_utility_programs.rst
@@ -15,7 +15,6 @@ This chapter describes a number of utility programs:
   * :ref:`The_File_Cleanup_Utility_gnatclean`
   * :ref:`The_GNAT_Library_Browser_gnatls`
   * :ref:`The_Coding_Standard_Verifier_gnatcheck`
-  * :ref:`The_GNAT_Metrics_Tool_gnatmetric`
   * :ref:`The_GNAT_Pretty_Printer_gnatpp`
   * :ref:`The_Body_Stub_Generator_gnatstub`
   * :ref:`The_Backtrace_Symbolizer_gnatsymbolize`
@@ -487,1123 +486,6 @@ building specialized scripts.
   For full details, plese refer to :title:`GNATcheck Reference Manual`.
 
 
-
-.. only:: PRO or GPL
-
-  .. _The_GNAT_Metrics_Tool_gnatmetric:
-
-  The GNAT Metrics Tool ``gnatmetric``
-  
-
-  .. index:: ! gnatmetric
-  .. index:: Metric tool
-
-  The ``gnatmetric`` tool is a utility
-  for computing various program metrics.
-  It takes an Ada source file as input and generates a file containing the
-  metrics data as output. Various switches control which
-  metrics are reported.
-
-  ``gnatmetric`` is a project-aware tool
-  (see :ref:`Using_Project_Files_with_GNAT_Tools` for a description of
-  the project-related switches). The project file package that can specify
-  ``gnatmetric`` switches is named ``Metrics``.
-
-  The ``gnatmetric`` command has the form
-
-::
-
-   $ gnatmetric [ switches ] { filename }
-
-  where:
-
-  * ``switches`` specify the metrics to compute and define the destination for
-the output
-
-  * Each ``filename`` is the name of a source file to process. 'Wildcards' are
-allowed, and the file name may contain path information.  If no
-``filename`` is supplied, then the ``switches`` list must contain at least
-one :switch:`--files` switch (see :ref:`Other_gnatmetric_Switches`).
-Including both a :switch:`--files` switch and one or more ``filename``
-arguments is permitted.
-
-Note that it is no longer necessary to specify the Ada language version;
-``gnatmetric`` can process Ada source code written in any version from
-Ada 83 onward without specifying any language version switch.
-
-  The following subsections describe the various switches accepted by
-  ``gnatmetric``, organized by category.
-
-  .. _Output_File_Control-gnatmetric:
-
-  Output File Control
-  ---
-
-  .. index:: Output file control in gnatmetric
-
-  ``gnatmetric`` has two output formats. It can generate a
-  textual (human-readable) form, and also XML. By default only textual
-  output is generated.
-
-  When generating the output in textual form, ``gnatmetric`` creates
-  for each Ada source file a corresponding text file
-  containing the computed metrics, except for the case when the set of metrics
-  specified by gnatmetric parameters consists only of metrics that are computed
-  for the whole set of analyzed sources, but not for each Ada source.
-  By default, the name of the file containing metric information for a source
-  is obtained by appending the :file:`.metrix` suffix to the
-  name of the input source file. If not otherwise specified and no project file
-  is specified as ``gnatmetric`` option this file is placed in the same
-  directory as where the source file is located. If ``gnatmetric`` has a
-  project  file as its parameter, it places all the generated files in the
-  object directory of the project (or in the project source directory if the
-  project does not define an object directory). If :switch:`--subdirs` option
-  is specified, the files are placed in the subrirectory of this directory
-  specified by this option.
-
-  All the output information generated in XML format is placed in a single
-  file. By default the name of this file is :file:`metrix.xml`.
-  If not otherwise specified and if no project file is specified
-  as ``gnatmetric`` option this file is placed in the
-  current directory.
-
-  Some of the computed metrics are summed over the units passed to
-  ``gnatmetric``; for example, the total number of lines of code.
-  By default this information is sent to :file:`stdout`, but a file
-  can be specified with the :switch:`--global-file-name` switch.
-
-  The following switches control the ``gnatmetric`` output:
-
-  .. index:: --generate-xml-output (gnatmetric)
-
-  :switch:`--generate-xml-output`
-Generate XML output.
-
-  .. index:: --generate-xml-schema (gnatmetric)
-
-  :switch:`--generate-xml-schema`
-Generate XML output and an XML schema file that describes the structure
-of the XML metric report. This schema is as

Re: [Patch] OpenACC: Fix reduction tree-sharing issue [PR106982]

2022-09-26 Thread Tobias Burnus

Hi Richard,

On 26.09.22 10:32, Richard Biener wrote:
> On Fri, Sep 23, 2022 at 5:25 PM Tobias Burnus wrote:
>> This fixes a tree-sharing ICE. It seems as if all unshare_expr
>> I added were required in this case. [...]
>
> looks like v1/v2/v3 are now unshared twice

According to the assert, that's not the case. 'var' is a memory
reference – and taking out any of the newly added unshare_expr
will give an ICE with the new *8.c testcase.

> better done when its used.  That said, please put the unshares
> at places where new things are built, that's much clearer.  That means
> the 'outgoing' at
>    gimplify_assign (outgoing, teardown_call, &after_join);


The most localized change is the 'else' branch:

   else
- v1 = v2 = v3 = var;
+ {
+   /* Note that 'var' might be a mem ref.  */
+   v1 = unshare_expr (var);
+   v2 = unshare_expr (var);
+   v3 = unshare_expr (var);
+   incoming = unshare_expr (incoming);
+   outgoing = unshare_expr (outgoing);
+ }

But then I still need to unshare v1/v2/v3 at one other place. Namely:

Either in

-   gimplify_assign (v1, setup_call, &before_fork);
+   gimplify_assign (unshare_expr (v1), setup_call, &before_fork);

or in
 = build_call_expr_internal_loc (loc, IFN_GOACC_REDUCTION,
 TREE_TYPE (var), 6, init_code,
 unshare_expr (ref_to_res),
- v1, level, op, off);
+ unshare_expr (v1), level, op, off);


Alternatively, I keep the
   else
 v1 = v2 = v3 = var;
as is, possibly adding the comment there, – and then add the unshare_expr
for v1/v2/v3/incoming to build_call_expr_internal_loc
*and* for v1/v2/v3/outgoing to gimplify_assign.

Which variant do you prefer?
(I have attached both – and the only difference is in omp-low.cc.)

(Certainly, other permutations are possible, one is the one in the first patch,
but I like either of the two new patches more.)

Tobias


-
Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 
München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas 
Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht 
München, HRB 106955
 gcc/omp-low.cc | 15 +++
 gcc/testsuite/c-c++-common/goacc/reduction-7.c | 22 ++
 gcc/testsuite/c-c++-common/goacc/reduction-8.c | 12 
 3 files changed, 45 insertions(+), 4 deletions(-)

diff --git a/gcc/omp-low.cc b/gcc/omp-low.cc
index d9f9aaebc0b..144ccd4bd3d 100644
--- a/gcc/omp-low.cc
+++ b/gcc/omp-low.cc
@@ -7631,7 +7631,14 @@ lower_oacc_reductions (location_t loc, tree clauses, tree level, bool inner,
 	  incoming = build_simple_mem_ref (incoming);
 	  }
 	else
-	  v1 = v2 = v3 = var;
+	  {
+	/* Note that 'var' might be a mem ref.  */
+	v1 = unshare_expr (var);
+	v2 = unshare_expr (var);
+	v3 = unshare_expr (var);
+	incoming = unshare_expr (incoming);
+	outgoing = unshare_expr (outgoing);
+	  }
 
 	/* Determine position in reduction buffer, which may be used
 	   by target.  The parser has ensured that this is not a
@@ -7675,9 +7682,9 @@ lower_oacc_reductions (location_t loc, tree clauses, tree level, bool inner,
 	  TREE_TYPE (var), 6, teardown_code,
 	  ref_to_res, v3, level, op, off);
 
-	gimplify_assign (v1, setup_call, &before_fork);
-	gimplify_assign (v2, init_call, &after_fork);
-	gimplify_assign (v3, fini_call, &before_join);
+	gimplify_assign (unshare_expr (v1), setup_call, &before_fork);
+	gimplify_assign (unshare_expr (v2), init_call, &after_fork);
+	gimplify_assign (unshare_expr (v3), fini_call, &before_join);
 	gimplify_assign (outgoing, teardown_call, &after_join);
   }
 
diff --git a/gcc/testsuite/c-c++-common/goacc/reduction-7.c b/gcc/testsuite/c-c++-common/goacc/reduction-7.c
new file mode 100644
index 000..482b0ab1984
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/goacc/reduction-7.c
@@ -0,0 +1,22 @@
+/* { dg-do compile } */
+
+/* PR middle-end/106982 */
+
+long long n = 100;
+int multiplicitive_n = 128;
+
+void test1(double *rand, double *a, double *b, double *c)
+{
+#pragma acc data copyin(a[0:10*multiplicitive_n], b[0:10*multiplicitive_n]) copyout(c[0:10])
+{
+#pragma acc parallel loop
+for (int i = 0; i < 10; ++i)
+{
+double temp = 1.0;
+#pragma acc loop vector reduction(*:temp)
+for (int j = 0; j < multiplicitive_n; ++j)
+  temp *= a[(i * multiplicitive_n) + j] + b[(i * multiplicitive_n) + j];
+c[i] = temp;
+}
+}
+}
diff --git a/gcc/testsuite/c-c++-common/goacc/reduction-8.c b/gcc/testsuite/c-c++-common/goacc/reduction-8.c
new file mode 100644
index 000..2c3ed499d5b
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/goacc/reduction-8.c
@@ -0,0 +1,12 @@
+/* { dg-

Re: [Patch] OpenACC: Fix reduction tree-sharing issue [PR106982]

2022-09-26 Thread Richard Biener via Gcc-patches
On Mon, Sep 26, 2022 at 11:27 AM Tobias Burnus  wrote:
>
> Hi Richard,
>
> On 26.09.22 10:32, Richard Biener wrote:
>
> On Fri, Sep 23, 2022 at 5:25 PM Tobias Burnus  wrote:
>
> This fixes a tree-sharing ICE. It seems as if all unshare_expr
> I added were required in this case. [...]
>
> looks like v1/v2/v3 are now unshared twice
>
> According to the assert, that's not the case. 'var' is a memory
> reference – and taking out any of the newly added unshare_expr
> will give an ICE with the new *8.c testcase.
>
> better done when its used.  That said, please put the unshares
> at places where new things are built, that's much clearer.  That means
> the 'outgoing' at
> gimplify_assign (outgoing, teardown_call, &after_join);
>
> The most localized change is the 'else' branch:
>
>   else
> -  v1 = v2 = v3 = var;
> +  {
> +/* Note that 'var' might be a mem ref.  */
> +v1 = unshare_expr (var);
> +v2 = unshare_expr (var);
> +v3 = unshare_expr (var);
> +incoming = unshare_expr (incoming);
> +outgoing = unshare_expr (outgoing);
> +  }
>
> But then I still need to unshare v1/v2/v3 at one other place. Namely:
>
> Either in
>
> - gimplify_assign (v1, setup_call, &before_fork);
> + gimplify_assign (unshare_expr (v1), setup_call, &before_fork);
>
> or in
>= build_call_expr_internal_loc (loc, IFN_GOACC_REDUCTION,
>TREE_TYPE (var), 6, init_code,
>unshare_expr (ref_to_res),
> -  v1, level, op, off);
> +  unshare_expr (v1), level, op, off);
>
>
> Alternatively, I keep the
> else
>  v1 = v2 = v3 = var;
> as is, possibly adding the comment there, – and then add the unshare_expr
> for v1/v2/v3/incoming to build_call_expr_internal_loc
> *and* for v1/v2/v3/outgoing to gimplify_assign.
>
> Which variant do you prefer?

I prefer v2a - the unshare_exprs at the sinks where sharing isn't OK.
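
As a rough illustration of why a shared operand becomes a problem once something rewrites it in place, here is a toy C sketch. The `struct expr` type and the `copy_expr`, `rewrite_in_place`, and `free_expr` helpers are hypothetical stand-ins, not GCC's `tree` or `unshare_expr`; the sketch only assumes that a later step may modify an operand in place.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy expression node; a hypothetical stand-in, not GCC's tree type.  */
struct expr
{
  int value;
  struct expr *operand;		/* NULL for a leaf.  */
};

/* Deep copy in the spirit of unshare_expr (illustrative only).  */
struct expr *
copy_expr (const struct expr *e)
{
  if (e == NULL)
    return NULL;
  struct expr *copy = malloc (sizeof *copy);
  assert (copy != NULL);
  copy->value = e->value;
  copy->operand = copy_expr (e->operand);
  return copy;
}

/* Stand-in for a step that rewrites an operand in place.  */
void
rewrite_in_place (struct expr *e)
{
  e->value += 100;
}

/* Release a node and everything below it.  */
void
free_expr (struct expr *e)
{
  if (e == NULL)
    return;
  free_expr (e->operand);
  free (e);
}
```

If two users hold the same pointer, a call to `rewrite_in_place` through one of them is visible through the other; handing each user its own `copy_expr` result at the point of use keeps them independent.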

That variant is OK,

Thanks,
Richard.

> (I have attached both – and the only difference is in omp-low.cc.)
>
> (Certainly, other permutations are possible, one is the one in the first 
> patch,
> but I like either of the two new patches more.)
>
> Tobias
>


[PATCH][pushed] s390: fix wrong refactoring

2022-09-26 Thread Martin Liška
Pushed as obvious (tested by Robin)

Since r13-2251-g1930c5d05ceff2, the refactoring is not 1:1 and we end
up with a wrong rtx type.

gcc/ChangeLog:

* config/s390/s390.cc (s390_rtx_costs): Remove dest variable
and use only dst.
---
 gcc/config/s390/s390.cc | 15 +++
 1 file changed, 7 insertions(+), 8 deletions(-)

diff --git a/gcc/config/s390/s390.cc b/gcc/config/s390/s390.cc
index 3ae586c60eb..9861913af05 100644
--- a/gcc/config/s390/s390.cc
+++ b/gcc/config/s390/s390.cc
@@ -3648,7 +3648,7 @@ s390_rtx_costs (rtx x, machine_mode mode, int outer_code,
   *total = 0;
   return true;
   case SET: {
-   rtx dest = SET_DEST (x);
+   rtx dst = SET_DEST (x);
rtx src = SET_SRC (x);
 
switch (GET_CODE (src))
@@ -3669,7 +3669,6 @@ s390_rtx_costs (rtx x, machine_mode mode, int outer_code,
 slightly more expensive than a normal load.  */
  *total = COSTS_N_INSNS (1) + 2;
 
- rtx dst = SET_DEST (src);
  rtx then = XEXP (src, 1);
  rtx els = XEXP (src, 2);
 
@@ -3696,25 +3695,25 @@ s390_rtx_costs (rtx x, machine_mode mode, int 
outer_code,
break;
  }
 
-   switch (GET_CODE (dest))
+   switch (GET_CODE (dst))
  {
  case SUBREG:
-   if (!REG_P (SUBREG_REG (dest)))
+   if (!REG_P (SUBREG_REG (dst)))
  *total += rtx_cost (SUBREG_REG (src), VOIDmode, SET, 0, speed);
/* fallthrough */
  case REG:
/* If this is a VR -> VR copy, count the number of
   registers.  */
-   if (VECTOR_MODE_P (GET_MODE (dest)) && REG_P (src))
+   if (VECTOR_MODE_P (GET_MODE (dst)) && REG_P (src))
  {
-   int nregs = s390_hard_regno_nregs (VR0_REGNUM, GET_MODE (dest));
+   int nregs = s390_hard_regno_nregs (VR0_REGNUM, GET_MODE (dst));
*total = COSTS_N_INSNS (nregs);
  }
/* Same for GPRs.  */
else if (REG_P (src))
  {
int nregs
- = s390_hard_regno_nregs (GPR0_REGNUM, GET_MODE (dest));
+ = s390_hard_regno_nregs (GPR0_REGNUM, GET_MODE (dst));
*total = COSTS_N_INSNS (nregs);
  }
else
@@ -3722,7 +3721,7 @@ s390_rtx_costs (rtx x, machine_mode mode, int outer_code,
  *total += rtx_cost (src, mode, SET, 1, speed);
return true;
case MEM: {
- rtx address = XEXP (dest, 0);
+ rtx address = XEXP (dst, 0);
  rtx tmp;
  HOST_WIDE_INT tmp2;
  if (s390_loadrelative_operand_p (address, &tmp, &tmp2))
-- 
2.37.3



Re: [PATCH]middle-end Recognize more conditional comparisons idioms.

2022-09-26 Thread Richard Biener via Gcc-patches
On Fri, 23 Sep 2022, Tamar Christina wrote:

> Hi All,
> 
> GCC currently recognizes some of these for signed but not unsigned types.
> It also has trouble dealing with casts in between because these are handled
> by the fold machinery.
> 
> This moves the pattern detection to match.pd instead.

where's the other copy and is it possible to remove it with this patch?

> We fold e.g.:
> 
> uint32_t min1_32u(uint32_t a, uint32_t b, uint32_t c, uint32_t d) {
>   uint32_t result;
>   uint32_t m = (a >= b) - 1;
>   result = (c & m) | (d & ~m);
>   return result;
> }
> 
> into a >= b ? c : d for all integral types.
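
To make the equivalence concrete, here is a minimal C sketch; the function names `select_mask`, `select_cond`, and `check_mask_select` are illustrative and not part of the patch. Note that with `m = (a >= b) - 1` the mask is all ones when the comparison is false, so in this sketch the masked form yields `c` when `a < b` and `d` otherwise, which agrees with the values the testcase expects.

```c
#include <assert.h>
#include <stdint.h>

/* Branchless form from the testcase: m is 0 when a >= b, all ones otherwise.  */
uint32_t
select_mask (uint32_t a, uint32_t b, uint32_t c, uint32_t d)
{
  uint32_t m = (uint32_t) (a >= b) - 1;
  return (c & m) | (d & ~m);
}

/* Conditional form that computes the same value.  */
uint32_t
select_cond (uint32_t a, uint32_t b, uint32_t c, uint32_t d)
{
  return a >= b ? d : c;
}

/* Compare both forms over a small grid of inputs.  */
void
check_mask_select (void)
{
  for (uint32_t a = 0; a < 8; a++)
    for (uint32_t b = 0; b < 8; b++)
      assert (select_mask (a, b, 0xdeadbeefu, 0x12345678u)
              == select_cond (a, b, 0xdeadbeefu, 0x12345678u));
}
```

The same check can be repeated for the other comparison operators by swapping `>=` in both functions.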
> 
> Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.
> 
> Ok for master?
> 
> Thanks,
> Tamar
> 
> gcc/ChangeLog:
> 
>   * match.pd: New cond select pattern.
> 
> gcc/testsuite/ChangeLog:
> 
>   * gcc.dg/select_cond_1.c: New test.
>   * gcc.dg/select_cond_2.c: New test.
>   * gcc.dg/select_cond_3.c: New test.
> 
> --- inline copy of patch -- 
> diff --git a/gcc/match.pd b/gcc/match.pd
> index 
> 39da61bf117a6eb2924fc8a6473fb37ddadd60e9..7b8f50410acfd0afafc5606e972cfc4e125d3a5d
>  100644
> --- a/gcc/match.pd
> +++ b/gcc/match.pd
> @@ -3577,6 +3577,25 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
>(cond (le @0 integer_zerop@1) (negate@2 @0) integer_zerop@1)
>(max @2 @1))
>  
> +/* (a & ((c `op` d) - 1)) | (b & ~((c `op` d) - 1)) ->  c `op` d ? a : b.  */
> +(for op (simple_comparison)
> + (simplify
> +  (bit_xor:c
> +   (convert2? @0)
> +   (bit_and:c
> +(convert2? (bit_xor:c @1 @0))
> +(convert3? (negate (convert? (op@4 @2 @3))))))

I have a hard time matching this up with the comment above.  It looks like

  (T2)@0 ^ ((T2)(@1 ^ @0) & (T3)(-(T)(@2 <=> @3))

I suppose we've canonicalized (a & x) | (b & ~x) somehow.

It also looks like we move conversions inside a negate but not outside
or into a bit_and?  Can we do a better job here to avoid the explosions
in the number of patterns?

> +  /* Alternative form, where some canonicalization were not done due to the
> + arguments being signed.  */

these two comment lines belong ...

> +  (if (INTEGRAL_TYPE_P (type) && tree_zero_one_valued_p (@4))
> +   (convert:type (cond @4 @1 @0))))

... here?

> + (simplify
> +  (bit_ior:c
> +   (mult:c @0 (convert (convert2? (op@4 @2 @3))))
> +   (bit_and:c @1 (convert (plus:c integer_minus_onep (convert (op@4 @2 @3))))))

can you add a comment with what you match here as well?  You are using
(op@4 @2 @3) twice, the point of the @4 capture is that the second
occurrence could be just @4.  I wonder how we arrived at the
multiplication here?  That seems somewhat premature :/

Thanks,
Richard.

> +  (if (INTEGRAL_TYPE_P (type) && tree_zero_one_valued_p (@4))
> +   (cond @4 @0 @1))))
> +
>  /* Simplifications of shift and rotates.  */
>  
>  (for rotate (lrotate rrotate)
> diff --git a/gcc/testsuite/gcc.dg/select_cond_1.c 
> b/gcc/testsuite/gcc.dg/select_cond_1.c
> new file mode 100644
> index 
> ..9eb9959baafe5fffeec24e4e3ae656f8fcfe943c
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/select_cond_1.c
> @@ -0,0 +1,97 @@
> +/* { dg-do run } */
> +/* { dg-additional-options "-O2 -std=c99 -fdump-tree-optimized -save-temps" 
> } */
> +
> +#include <stdint.h>
> +
> +__attribute__((noipa, noinline))
> +uint32_t min1_32u(uint32_t a, uint32_t b, uint32_t c, uint32_t d) {
> +  uint32_t result;
> +  uint32_t m = (a >= b) - 1;
> +  result = (c & m) | (d & ~m);
> +  return result;
> +}
> +
> +__attribute__((noipa, noinline))
> +uint32_t max1_32u(uint32_t a, uint32_t b, uint32_t c, uint32_t d) {
> +  uint32_t result;
> +  uint32_t m = (a <= b) - 1;
> +  result = (c & m) | (d & ~m);
> +  return result;
> +}
> +
> +__attribute__((noipa, noinline))
> +uint32_t min2_32u(uint32_t a, uint32_t b, uint32_t c, uint32_t d) {
> +  uint32_t result;
> +  uint32_t m = (a > b) - 1;
> +  result = (c & m) | (d & ~m);
> +  return result;
> +}
> +
> +__attribute__((noipa, noinline))
> +uint32_t max2_32u(uint32_t a, uint32_t b, uint32_t c, uint32_t d) {
> +  uint32_t result;
> +  uint32_t m = (a < b) - 1;
> +  result = (c & m) | (d & ~m);
> +  return result;
> +}
> +
> +__attribute__((noipa, noinline))
> +uint32_t min3_32u(uint32_t a, uint32_t b, uint32_t c, uint32_t d) {
> +  uint32_t result;
> +  uint32_t m = (a == b) - 1;
> +  result = (c & m) | (d & ~m);
> +  return result;
> +}
> +
> +__attribute__((noipa, noinline))
> +uint32_t max3_32u(uint32_t a, uint32_t b, uint32_t c, uint32_t d) {
> +  uint32_t result;
> +  uint32_t m = (a != b) - 1;
> +  result = (c & m) | (d & ~m);
> +  return result;
> +}
> +
> +/* { dg-final { scan-tree-dump-times {_[0-9]+ \? c_[0-9]+\(D\) : 
> d_[0-9]+\(D\)} 6 "optimized" } } */
> +
> +extern void abort ();
> +
> +int main () {
> +
> +  if (min1_32u (3, 5, 7 , 8) != 7)
> +abort ();
> +
> +  if (max1_32u (3, 5, 7 , 8) != 8)
> +abort ();
> +
> +  if (min1_32u (5, 3, 7 , 8) != 8)
> +abort ();
> +
> +  if (max1_32u (5, 3, 7 , 8) != 7)
> +ab

Re: [PATCH]middle-end fix floating out of constants in conditionals

2022-09-26 Thread Richard Biener via Gcc-patches
On Fri, 23 Sep 2022, Tamar Christina wrote:

> Hi All,
> 
> The following testcase:
> 
> int zoo1 (int a, int b, int c, int d)
> {
>return (a > b ? c : d) & 1;
> }
> 
> gets de-optimized by the front-end since somewhere around GCC 4.x due to a fix
> that was added to fold_binary_op_with_conditional_arg.
> 
> The folding is supposed to succeed only if we have folded at least one of the
> branches; however, the check doesn't test that all of the values are
> non-constant.  So if one of the operands is a constant it accepts the
> folding.
> 
> This ends up folding
> 
>return (a > b ? c : d) & 1;
> 
> into
> 
>return (a > b ? c & 1 : d & 1);
> 
> and thus performing the AND twice.
> 
> This change makes it reject the folding if one of the arguments is a constant
> and the operations being performed are the same.
> 
> Secondly, it adds a new match.pd rule to fold in the opposite direction, so
> it now also folds:
> 
>return (a > b ? c & 1 : d & 1);
> 
> into
> 
>return (a > b ? c : d) & 1;
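
A small C sketch of the two forms being discussed, to show they compute the same value; the names `and_outside`, `and_in_branches`, and `check_and_hoisting` are illustrative only. The first form performs the AND once on the selected value, the second once per branch.

```c
#include <assert.h>

/* AND applied once, after the selection.  */
int
and_outside (int a, int b, int c, int d)
{
  return (a > b ? c : d) & 1;
}

/* AND duplicated into both arms of the conditional.  */
int
and_in_branches (int a, int b, int c, int d)
{
  return a > b ? c & 1 : d & 1;
}

/* Compare both forms over a small grid of inputs.  */
void
check_and_hoisting (void)
{
  for (int a = -2; a <= 2; a++)
    for (int b = -2; b <= 2; b++)
      for (int c = -2; c <= 2; c++)
        for (int d = -2; d <= 2; d++)
          assert (and_outside (a, b, c, d) == and_in_branches (a, b, c, d));
}
```

Since the results agree, the choice between the forms is purely a question of which one generates less work.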
> 
> Bootstrapped Regtested on aarch64-none-linux-gnu, x86_64-pc-linux-gnu
> and no issues.
> 
> Ok for master?
> 
> Thanks,
> Tamar
> 
> gcc/ChangeLog:
> 
>   * fold-const.cc (fold_binary_op_with_conditional_arg): Add relaxation.
>   * match.pd: Add ternary constant fold rule.
>   * tree-cfg.cc (verify_gimple_assign_ternary): RHS1 of a COND_EXPR isn't
>   a value but an expression itself. 
> 
> gcc/testsuite/ChangeLog:
> 
>   * gcc.target/aarch64/if-compare_3.c: New test.
> 
> --- inline copy of patch -- 
> diff --git a/gcc/fold-const.cc b/gcc/fold-const.cc
> index 
> 4f4ec81c8d4b6937ade3141a14c695b67c874c35..0ee083f290d12104969f1b335dc33917c97b4808
>  100644
> --- a/gcc/fold-const.cc
> +++ b/gcc/fold-const.cc
> @@ -7212,7 +7212,9 @@ fold_binary_op_with_conditional_arg (location_t loc,
>  }
>  
>/* Check that we have simplified at least one of the branches.  */
> -  if (!TREE_CONSTANT (arg) && !TREE_CONSTANT (lhs) && !TREE_CONSTANT (rhs))
> +  if ((!TREE_CONSTANT (arg) && !TREE_CONSTANT (lhs) && !TREE_CONSTANT (rhs))
> +  || (TREE_CONSTANT (arg) && TREE_CODE (lhs) == TREE_CODE (rhs)
> +   && !TREE_CONSTANT (lhs)))
>  return NULL_TREE;

I think the better fix would be to only consider TREE_CONSTANT (arg)
if it wasn't constant initially.  Because clearly "simplify" intends
to be "constant" here.  In fact I wonder why we test !TREE_CONSTANT (arg)
at all, we don't simplify 'arg' ...

Eric added this test (previously we'd just always done the folding),
but I think not enough?

>  
>return fold_build3_loc (loc, cond_code, type, test, lhs, rhs);
> diff --git a/gcc/match.pd b/gcc/match.pd
> index 
> b225d36dc758f1581502c8d03761544bfd499c01..b61ed70e69b881a49177f10f20c1f92712bb8665
>  100644
> --- a/gcc/match.pd
> +++ b/gcc/match.pd
> @@ -4318,6 +4318,16 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
>(op @3 (vec_cond:s @0 @1 @2))
>(vec_cond @0 (op! @3 @1) (op! @3 @2))))
>  
> +/* Float out binary operations from branches if they can't be folded.
> +   Fold (a ? (b op c) : (d op c)) --> (op (a ? b : d) c).  */
> +(for op (plus mult min max bit_and bit_ior bit_xor minus lshift rshift rdiv
> +  trunc_div ceil_div floor_div round_div trunc_mod ceil_mod floor_mod
> +  round_mod)
> + (simplify
> +  (cond @0 (op @1 @2) (op @3 @2))
> +   (if (!FLOAT_TYPE_P (type) || !(HONOR_NANS (@1) && flag_trapping_math))
> +(op (cond @0 @1 @3) @2))))

Ick.  Adding a reverse transform is going to be prone to recursing :/

Why do you need to care about NANs or FP exceptions?  How do you know
if they can't be folded?  Since match.pd cannot handle arbitrary
operations it isn't a good fit for match.pd patterns, instead this would
be a forwprop pattern or, in case you want to catch GENERIC, a 
fold-const.cc one?

Thanks and sorry for the late reply - hope Jeff's approval didn't make
you apply it yet.

Richard.

> +
>  #if GIMPLE
>  (match (nop_atomic_bit_test_and_p @0 @1 @4)
>   (bit_and (convert?@4 (ATOMIC_FETCH_OR_XOR_N @2 INTEGER_CST@0 @3))
> diff --git a/gcc/testsuite/gcc.target/aarch64/if-compare_3.c 
> b/gcc/testsuite/gcc.target/aarch64/if-compare_3.c
> new file mode 100644
> index 
> ..1d97da5c0d6454175881c219927471a567a6f0c7
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/aarch64/if-compare_3.c
> @@ -0,0 +1,27 @@
> +/* { dg-do compile } */
> +/* { dg-additional-options "-O3 -std=c99" } */
> +/* { dg-final { check-function-bodies "**" "" "" { target { le } } } } */
> +
> +/*
> +**zoo:
> +**   cmp w0, w1
> +**   csel w0, w3, w2, le
> +**   ret
> +*/
> +int zoo (int a, int b, int c, int d)
> +{
> +   return a > b ? c : d;
> +}
> +
> +/*
> +**zoo1:
> +**   cmp w0, w1
> +**   csel w0, w3, w2, le
> +**   and w0, w0, 1
> +**   ret
> +*/
> +int zoo1 (int a, int b, int c, int d)
> +{
> +   return (a > b ? c : d) & 1;
> +}
> +
> diff --git a/gcc/tree-cfg.cc b/gcc/tree-cfg.cc
> index 
> b19710392940cf469de52d006603ae1e

Re: [PATCH 1/2]middle-end: RFC: On expansion of conditional branches, give hint if argument is a truth type to backend

2022-09-26 Thread Richard Biener via Gcc-patches
On Fri, 23 Sep 2022, Tamar Christina wrote:

> Hi All,
> 
> This is an RFC to figure out how to deal with targets that don't have native
> comparisons against QImode values.
> 
> Booleans, at least in C99 and higher are 0-1 valued.  This means that we only
> really need to test a single bit.  However in RTL we no longer have this
> information available and just have an SImode value (due to the promotion of
> QImode to SImode).
>
> This RFC fixes it by emitting an explicit & 1 during the expansion of the
> conditional branch.
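
A minimal C sketch of the property the RFC relies on; the function names are illustrative. For a zero-one valued operand, testing the low bit and testing for nonzero give the same answer, while for a general integer they do not.

```c
#include <assert.h>
#include <stdbool.h>

/* For a C99 bool (0 or 1) the two tests below always agree.  */
int
bool_nonzero (bool flag)
{
  return flag != 0;
}

int
bool_low_bit (bool flag)
{
  return (flag & 1) != 0;
}

/* For a general int they can differ: 2 is nonzero but its low bit is clear.  */
int
int_nonzero (int v)
{
  return v != 0;
}

int
int_low_bit (int v)
{
  return (v & 1) != 0;
}
```

This is why the single-bit test is only safe when the operand is known to be zero-one valued, which is the information the RFC says is no longer visible in RTL.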
> 
> However it's unlikely that we want to do this unconditionally.  Most targets
> I've tested seem to have harmless code changes, like x86 changes from testb to
> andl, $1.
> 
> So I have two questions:
> 
> 1. Should I limit this behind a target macro? Or should I just leave it for 
> all
>targets and deal with the fallout.
> 2. How can I tell whether the C99 0-1 valued bools are being used or the older
>0, non-0 variant?
> 
> Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.
> However there are some benign codegen changes on x86, testb changed to andl 
> $1.
> 
> This pattern occurs more than 120,000 times in SPECCPU 2017 and so is quite 
> common.

How does this help a target?  Why does RTL nonzerop bits not recover this
information and the desired optimization done later during combine
for example?  Why's a SImode compare not OK if there's no QImode compare?
We have undocumented addcc, negcc, etc. patterns, should we have a
andcc pattern for this indicating support for andcc + jump as opposed
to cmpcc + jump?

So - what's the target and what's a testcase?

Thanks,
Richard.
 
> Thanks,
> Tamar
> 
> gcc/ChangeLog:
> 
>   * tree.h (tree_zero_one_valued_p): New.
>   * dojump.cc (do_jump): Add & 1 if truth type.
> 
> --- inline copy of patch -- 
> diff --git a/gcc/dojump.cc b/gcc/dojump.cc
> index 
> 2af0cd1aca3b6af13d5d8799094ee93f18022296..8eaf1be49cd12298e61c6946ae79ca9de6197864
>  100644
> --- a/gcc/dojump.cc
> +++ b/gcc/dojump.cc
> @@ -605,7 +605,17 @@ do_jump (tree exp, rtx_code_label *if_false_label,
>/* Fall through and generate the normal code.  */
>  default:
>  normal:
> -  temp = expand_normal (exp);
> +  tree cmp = exp;
> +  /* If the expression is a truth type then explicitly generate an & 1
> +  to indicate to the target that it's a zero-one values type.  This
> +  allows the target to further optimize the comparison should it
> +  choose to.  */
> +  if (tree_zero_one_valued_p (exp))
> + {
> +   type = TREE_TYPE (exp);
> +   cmp = build2 (BIT_AND_EXPR, type, exp, build_int_cstu (type, 1));
> + }
> +  temp = expand_normal (cmp);
>do_pending_stack_adjust ();
>/* The RTL optimizers prefer comparisons against pseudos.  */
>if (GET_CODE (temp) == SUBREG)
> diff --git a/gcc/tree.h b/gcc/tree.h
> index 
> 8f8a9660c9e0605eb516de194640b8c1b531b798..be3d2dee82f692e81082cf21c878c10f9fe9e1f1
>  100644
> --- a/gcc/tree.h
> +++ b/gcc/tree.h
> @@ -4690,6 +4690,7 @@ extern tree signed_or_unsigned_type_for (int, tree);
>  extern tree signed_type_for (tree);
>  extern tree unsigned_type_for (tree);
>  extern bool is_truth_type_for (tree, tree);
> +extern bool tree_zero_one_valued_p (tree);
>  extern tree truth_type_for (tree);
>  extern tree build_pointer_type_for_mode (tree, machine_mode, bool);
>  extern tree build_pointer_type (tree);
> 
> 
> 
> 
> 

-- 
Richard Biener 
SUSE Software Solutions Germany GmbH, Frankenstrasse 146, 90461 Nuernberg,
Germany; GF: Ivo Totev, Andrew Myers, Andrew McDonald, Boudien Moerman;
HRB 36809 (AG Nuernberg)


[PATCH] reassoc: Handle OFFSET_TYPE like POINTER_TYPE in optimize_range_tests_cmp_bitwise [PR107029]

2022-09-26 Thread Jakub Jelinek via Gcc-patches
Hi!

As the testcase shows, OFFSET_TYPE needs the same treatment as
POINTER_TYPE/REFERENCE_TYPE, otherwise we fail the same during the
newly added verification.  OFFSET_TYPE is signed though, so unlike
POINTER_TYPE/REFERENCE_TYPE it can also trigger with the
x < 0 && y < 0 && z < 0 to (x | y | z) < 0
optimization.

Fixed thusly, bootstrapped/regtested on x86_64-linux and i686-linux,
ok for trunk?

2022-09-26  Jakub Jelinek  

PR tree-optimization/107029
* tree-ssa-reassoc.cc (optimize_range_tests_cmp_bitwise): Treat
OFFSET_TYPE like POINTER_TYPE, except that OFFSET_TYPE may be
signed and so can trigger even the (b % 4) == 3 case.

* g++.dg/torture/pr107029.C: New test.

--- gcc/tree-ssa-reassoc.cc.jj  2022-09-17 08:18:16.935880254 +0200
+++ gcc/tree-ssa-reassoc.cc 2022-09-25 14:46:21.746367580 +0200
@@ -3608,13 +3608,13 @@ optimize_range_tests_cmp_bitwise (enum t
tree type2 = NULL_TREE;
bool strict_overflow_p = false;
candidates.truncate (0);
-   if (POINTER_TYPE_P (type1))
+   if (POINTER_TYPE_P (type1) || TREE_CODE (type1) == OFFSET_TYPE)
  type1 = pointer_sized_int_node;
for (j = i; j; j = chains[j - 1])
  {
tree type = TREE_TYPE (ranges[j - 1].exp);
strict_overflow_p |= ranges[j - 1].strict_overflow_p;
-   if (POINTER_TYPE_P (type))
+   if (POINTER_TYPE_P (type) || TREE_CODE (type) == OFFSET_TYPE)
  type = pointer_sized_int_node;
if ((b % 4) == 3)
  {
@@ -3646,7 +3646,7 @@ optimize_range_tests_cmp_bitwise (enum t
tree type = TREE_TYPE (ranges[j - 1].exp);
if (j == k)
  continue;
-   if (POINTER_TYPE_P (type))
+   if (POINTER_TYPE_P (type) || TREE_CODE (type) == OFFSET_TYPE)
  type = pointer_sized_int_node;
if ((b % 4) == 3)
  {
@@ -3677,10 +3677,20 @@ optimize_range_tests_cmp_bitwise (enum t
op = r->exp;
continue;
  }
-   if (id == l || POINTER_TYPE_P (TREE_TYPE (op)))
+   if (id == l
+   || POINTER_TYPE_P (TREE_TYPE (op))
+   || TREE_CODE (TREE_TYPE (op)) == OFFSET_TYPE)
  {
code = (b % 4) == 3 ? BIT_NOT_EXPR : NOP_EXPR;
tree type3 = id >= l ? type1 : pointer_sized_int_node;
+   if (code == BIT_NOT_EXPR
+   && TREE_CODE (TREE_TYPE (op)) == OFFSET_TYPE)
+ {
+   g = gimple_build_assign (make_ssa_name (type3),
+NOP_EXPR, op);
+   gimple_seq_add_stmt_without_update (&seq, g);
+   op = gimple_assign_lhs (g);
+ }
g = gimple_build_assign (make_ssa_name (type3), code, op);
gimple_seq_add_stmt_without_update (&seq, g);
op = gimple_assign_lhs (g);
@@ -3688,6 +3698,7 @@ optimize_range_tests_cmp_bitwise (enum t
tree type = TREE_TYPE (r->exp);
tree exp = r->exp;
if (POINTER_TYPE_P (type)
+   || TREE_CODE (type) == OFFSET_TYPE
|| (id >= l && !useless_type_conversion_p (type1, type)))
  {
tree type3 = id >= l ? type1 : pointer_sized_int_node;
@@ -3705,7 +3716,7 @@ optimize_range_tests_cmp_bitwise (enum t
op = gimple_assign_lhs (g);
  }
type1 = TREE_TYPE (ranges[k - 1].exp);
-   if (POINTER_TYPE_P (type1))
+   if (POINTER_TYPE_P (type1) || TREE_CODE (type1) == OFFSET_TYPE)
  {
gimple *g
  = gimple_build_assign (make_ssa_name (type1), NOP_EXPR, op);
--- gcc/testsuite/g++.dg/torture/pr107029.C.jj  2022-09-25 14:49:18.427954682 
+0200
+++ gcc/testsuite/g++.dg/torture/pr107029.C 2022-09-25 14:49:00.654197418 
+0200
@@ -0,0 +1,19 @@
+// PR tree-optimization/107029
+// { dg-do compile }
+
+struct S { long long a; int b; };
+long long S::*a;
+int S::*b;
+struct A { void foo (bool, bool); void bar (); int c; };
+
+void
+A::foo (bool a, bool b)
+{
+  c = a || b;
+}
+
+void
+A::bar()
+{
+  foo (a, b);
+}

Jakub



Re: [PATCH 1/4]middle-end Support not decomposing specific divisions during vectorization.

2022-09-26 Thread Richard Biener via Gcc-patches
On Fri, 23 Sep 2022, Tamar Christina wrote:

> Hi All,
> 
> In plenty of image and video processing code it's common to modify pixel 
> values
> by a widening operation and then scale them back into range by dividing by 
> 255.
> 
> e.g.:
> 
>    x = y / (2 ^ (bitsize (y) / 2) - 1)
> 
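As an illustration of the pattern described above (not part of the patch, and not necessarily what any target emits), here is a minimal sketch of a divide-by-255 on a widened 16-bit intermediate using only adds and shifts. The names `div255`, `scale_pixel` and `check_div255` are hypothetical; the identity is exact only for inputs up to 65534, which covers every product of two 8-bit values.

```cpp
#include <cassert>
#include <cstdint>

// Divide a widened 16-bit intermediate by 255 using only adds and shifts.
// Exact for 0 <= x <= 65534; any product of two 8-bit values is at most
// 255 * 255 = 65025, so it falls inside that range.
inline std::uint16_t div255(std::uint16_t x) {
    std::uint32_t wide = x;
    return static_cast<std::uint16_t>((wide + 1 + (wide >> 8)) >> 8);
}

// Hypothetical helper: widen, multiply, then scale back into 8-bit range.
inline std::uint8_t scale_pixel(std::uint8_t pixel, std::uint8_t alpha) {
    std::uint32_t product = static_cast<std::uint32_t>(pixel) * alpha;
    return static_cast<std::uint8_t>(div255(static_cast<std::uint16_t>(product)));
}

// Exhaustively compare the shift form against a plain division.
inline void check_div255() {
    for (std::uint32_t x = 0; x <= 65534; ++x)
        assert(div255(static_cast<std::uint16_t>(x)) == x / 255);
}
```

A vectorizer that decomposes the division generically cannot assume the restricted input range, which is one reason a target-specific expansion can be cheaper.
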
> This patch adds a new target hook can_special_div_by_const, similar to
> can_vec_perm which can be called to check if a target will handle a particular
> division in a special way in the back-end.
> 
> The vectorizer will then vectorize the division using the standard tree code
> and at expansion time the hook is called again to generate the code for the
> division.
> 
> A lot of the changes in the patch are to pass down the tree operands in all 
> paths
> that can lead to the divmod expansion so that the target hook always has the
> type of the expression you're expanding since the types can change the
> expansion.

The type of the expression should be available via the mode and the
signedness, no?  So maybe to avoid having both RTX and TREE on the
target hook pass it a wide_int instead for the divisor?

> Bootstrapped Regtested on aarch64-none-linux-gnu, x86_64-pc-linux-gnu
> and no issues.
> 
> Ok for master?
> 
> Thanks,
> Tamar
> 
> gcc/ChangeLog:
> 
>   * expmed.h (expand_divmod): Pass tree operands down in addition to RTX.
>   * expmed.cc (expand_divmod): Likewise.
>   * explow.cc (round_push, align_dynamic_address): Likewise.
>   * expr.cc (force_operand, expand_expr_divmod): Likewise.
>   * optabs.cc (expand_doubleword_mod, expand_doubleword_divmod):
>   Likewise.
>   * target.h: Include tree-core.
>   * target.def (can_special_div_by_const): New.
>   * targhooks.cc (default_can_special_div_by_const): New.
>   * targhooks.h (default_can_special_div_by_const): New.
>   * tree-vect-generic.cc (expand_vector_operation): Use it.
>   * doc/tm.texi.in: Document it.
>   * doc/tm.texi: Regenerate.
>   * tree-vect-patterns.cc (vect_recog_divmod_pattern): Check for support.
>   * tree-vect-stmts.cc (vectorizable_operation): Likewise.
> 
> gcc/testsuite/ChangeLog:
> 
>   * gcc.dg/vect/vect-div-bitmask-1.c: New test.
>   * gcc.dg/vect/vect-div-bitmask-2.c: New test.
>   * gcc.dg/vect/vect-div-bitmask-3.c: New test.
>   * gcc.dg/vect/vect-div-bitmask.h: New file.
> 
> --- inline copy of patch -- 
> diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi
> index 
> 92bda1a7e14a3c9ea63e151e4a49a818bf4d1bdb..adba9fe97a9b43729c5e86d244a2a23e76cac097
>  100644
> --- a/gcc/doc/tm.texi
> +++ b/gcc/doc/tm.texi
> @@ -6112,6 +6112,22 @@ instruction pattern.  There is no need for the hook to 
> handle these two
>  implementation approaches itself.
>  @end deftypefn
>  
> +@deftypefn {Target Hook} bool TARGET_VECTORIZE_CAN_SPECIAL_DIV_BY_CONST 
> (enum @var{tree_code}, tree @var{vectype}, tree @var{treeop0}, tree 
> @var{treeop1}, rtx *@var{output}, rtx @var{in0}, rtx @var{in1})
> +This hook is used to test whether the target has a special method of
> +division of vectors of type @var{vectype} using the two operands 
> @code{treeop0},
> +and @code{treeop1} and producing a vector of type @var{vectype}.  The 
> division
> +will then not be decomposed by the and kept as a div.
> +
> +When the hook is being used to test whether the target supports a special
> +divide, @var{in0}, @var{in1}, and @var{output} are all null.  When the hook
> +is being used to emit a division, @var{in0} and @var{in1} are the source
> +vectors of type @var{vecttype} and @var{output} is the destination vector of
> +type @var{vectype}.
> +
> +Return true if the operation is possible, emitting instructions for it
> +if rtxes are provided and updating @var{output}.
> +@end deftypefn
> +
>  @deftypefn {Target Hook} tree TARGET_VECTORIZE_BUILTIN_VECTORIZED_FUNCTION 
> (unsigned @var{code}, tree @var{vec_type_out}, tree @var{vec_type_in})
>  This hook should return the decl of a function that implements the
>  vectorized variant of the function with the @code{combined_fn} code
> diff --git a/gcc/doc/tm.texi.in b/gcc/doc/tm.texi.in
> index 
> 112462310b134705d860153294287cfd7d4af81d..d5a745a02acdf051ea1da1b04076d058c24ce093
>  100644
> --- a/gcc/doc/tm.texi.in
> +++ b/gcc/doc/tm.texi.in
> @@ -4164,6 +4164,8 @@ address;  but often a machine-dependent strategy can 
> generate better code.
>  
>  @hook TARGET_VECTORIZE_VEC_PERM_CONST
>  
> +@hook TARGET_VECTORIZE_CAN_SPECIAL_DIV_BY_CONST
> +
>  @hook TARGET_VECTORIZE_BUILTIN_VECTORIZED_FUNCTION
>  
>  @hook TARGET_VECTORIZE_BUILTIN_MD_VECTORIZED_FUNCTION
> diff --git a/gcc/explow.cc b/gcc/explow.cc
> index 
> ddb4d6ae3600542f8d2bb5617cdd3933a9fae6c0..568e0eb1a158c696458ae678f5e346bf34ba0036
>  100644
> --- a/gcc/explow.cc
> +++ b/gcc/explow.cc
> @@ -1037,7 +1037,7 @@ round_push (rtx size)
>   TRUNC_DIV_EXPR.  */
>size = expand_binop (Pmode, add_optab, size, alignm1_rtx,
>  NULL_RTX, 1, OPTAB_LIB_WIDEN);
> -  size = expand_divmod (0, TRUNC_D

RE: [PATCH]middle-end simplify complex if expressions where comparisons are inverse of one another.

2022-09-26 Thread Richard Biener via Gcc-patches
On Fri, 23 Sep 2022, Tamar Christina wrote:

> > -Original Message-
> > From: Richard Biener 
> > Sent: Friday, September 23, 2022 9:10 AM
> > To: Tamar Christina 
> > Cc: Andrew Pinski ; nd ; gcc-
> > patc...@gcc.gnu.org
> > Subject: RE: [PATCH]middle-end simplify complex if expressions where
> > comparisons are inverse of one another.
> > 
> > On Fri, 23 Sep 2022, Tamar Christina wrote:
> > 
> > > Hello,
> > >
> > > > where logical_inverted is somewhat contradicting using
> > > > zero_one_valued instead of truth_valued_p (I think the former might
> > > > not work for vector booleans?).
> > > >
> > > > In the end I'd prefer zero_one_valued_p but avoiding
> > > > inverse_conditions_p would be nice.
> > > >
> > > > Richard.
> > >
> > > It's not pretty but I've made it work and added more tests.
> > >
> > > Bootstrapped Regtested on aarch64-none-linux-gnu, x86_64-pc-linux-gnu
> > > and no issues.
> > >
> > > Ok for master?
> > >
> > > Thanks,
> > > Tamar
> > >
> > > gcc/ChangeLog:
> > >
> > >   * match.pd: Add new rule.
> > >
> > > gcc/testsuite/ChangeLog:
> > >
> > >   * gcc.target/aarch64/if-compare_1.c: New test.
> > >   * gcc.target/aarch64/if-compare_2.c: New test.
> > >
> > > --- inline copy of patch ---
> > >
> > > diff --git a/gcc/match.pd b/gcc/match.pd index
> > >
> > b61ed70e69b881a49177f10f20c1f92712bb8665..39da61bf117a6eb2924fc8a647
> > 3f
> > > b37ddadd60e9 100644
> > > --- a/gcc/match.pd
> > > +++ b/gcc/match.pd
> > > @@ -1903,6 +1903,101 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
> > >   (if (INTEGRAL_TYPE_P (type))
> > >(bit_and @0 @1)))
> > >
> > > +(for cmp (tcc_comparison)
> > > + icmp (inverted_tcc_comparison)
> > > + /* Fold (((a < b) & c) | ((a >= b) & d)) into (a < b ? c : d) & 1.
> > > +*/  (simplify
> > > +  (bit_ior
> > > +   (bit_and:c (convert? zero_one_valued_p@0) @2)
> > > +   (bit_and:c (convert? zero_one_valued_p@1) @3))
> > > +(with {
> > > +  enum tree_code c1
> > > + = (TREE_CODE (@0) == SSA_NAME
> > > +? gimple_assign_rhs_code (SSA_NAME_DEF_STMT (@0)) :
> > TREE_CODE
> > > +(@0));
> > > +
> > > +  enum tree_code c2
> > > + = (TREE_CODE (@1) == SSA_NAME
> > > +? gimple_assign_rhs_code (SSA_NAME_DEF_STMT (@1)) :
> > TREE_CODE (@1));
> > > + }
> > > +(if (INTEGRAL_TYPE_P (type)
> > > +  && c1 == cmp
> > > +  && c2 == icmp
> > 
> > So that doesn't have any advantage over doing
> > 
> >  (simplify
> >   (bit_ior
> >(bit_and:c (convert? (cmp@0 @01 @02)) @2)
> >(bit_and:c (convert? (icmp@1 @11 @12)) @3)) ...
> > 
> > I don't remember if that's what we had before.
> 
> No, the specific problem has always been applying zero_one_valued_p to the 
> right type.
> Before it was much shorter because I was using the tree  helper function to 
> get the inverses.
> 
> But with your suggestion I think I can do zero_one_valued_p on @0 and @1 
> instead..

But with comparisons and INTEGRAL_TYPE_P the value is always zero or one
so I'm confused.

> > 
> > > +  /* The scalar version has to be canonicalized after vectorization
> > > + because it makes unconditional loads conditional ones, which
> > > + means we lose vectorization because the loads may trap.  */
> > > +  && canonicalize_math_after_vectorization_p ())
> > > + (bit_and (cond @0 @2 @3) { build_one_cst (type); }
> > > +
> > > + /* Fold ((-(a < b) & c) | (-(a >= b) & d)) into a < b ? c : d.  */
> > 
> > The comment doesn't match the pattern below?
> 
> The pattern in the comment gets rewritten to this form eventually,
> so I match it instead.  I can update the comment but I thought the above
> made it more clear why these belong together ?

Please mention the canonicalized form in the comment as well.
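
For reference, the two scalar identities named in the quoted comments can be checked directly in plain C++. This is only a sketch of the arithmetic, assuming `int` operands; the function names are hypothetical and nothing here reflects how match.pd itself evaluates the pattern.

```cpp
#include <cassert>

// (((a < b) & c) | ((a >= b) & d)): each comparison is 0 or 1, so the AND
// keeps only bit 0 of c or of d.
inline int masked_form(int a, int b, int c, int d) {
    int lt = a < b;
    int ge = a >= b;
    return (lt & c) | (ge & d);
}

// The folded form: (a < b ? c : d) & 1.
inline int select_and_one(int a, int b, int c, int d) {
    return (a < b ? c : d) & 1;
}

// ((-(a < b) & c) | (-(a >= b) & d)): negating 0/1 gives 0 or all-ones,
// so the AND passes c or d through unchanged.
inline int negated_masked_form(int a, int b, int c, int d) {
    int lt = a < b;
    int ge = a >= b;
    return (-lt & c) | (-ge & d);
}

// The folded form: a < b ? c : d.
inline int plain_select(int a, int b, int c, int d) {
    return a < b ? c : d;
}

// Brute-force check over a small range of operands.
inline void check_folds() {
    for (int a = -2; a <= 2; ++a)
        for (int b = -2; b <= 2; ++b)
            for (int c = -4; c <= 4; ++c)
                for (int d = -4; d <= 4; ++d) {
                    assert(masked_form(a, b, c, d) == select_and_one(a, b, c, d));
                    assert(negated_masked_form(a, b, c, d) == plain_select(a, b, c, d));
                }
}
```
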

> > 
> > > + (simplify
> > > +  (bit_ior
> > > +   (cond zero_one_valued_p@0 @2 zerop)
> > > +   (cond zero_one_valued_p@1 @3 zerop))
> > > +(with {
> > > +  enum tree_code c1
> > > + = (TREE_CODE (@0) == SSA_NAME
> > > +? gimple_assign_rhs_code (SSA_NAME_DEF_STMT (@0)) :
> > TREE_CODE
> > > +(@0));
> > > +
> > > +  enum tree_code c2
> > > + = (TREE_CODE (@1) == SSA_NAME
> > > +? gimple_assign_rhs_code (SSA_NAME_DEF_STMT (@1)) :
> > TREE_CODE (@1));
> > > + }
> > > +(if (INTEGRAL_TYPE_P (type)
> > > +  && c1 == cmp
> > > +  && c2 == icmp
> > > +  /* The scalar version has to be canonicalized after vectorization
> > > + because it makes unconditional loads conditional ones, which
> > > + means we lose vectorization because the loads may trap.  */
> > > +  && canonicalize_math_after_vectorization_p ())
> > > +(cond @0 @2 @3
> > > +
> > > + /* Vector Fold (((a < b) & c) | ((a >= b) & d)) into a < b ? c : d.
> > > +and ((~(a < b) & c) | (~(a >= b) & d)) into a < b ? c : d.  */
> > > +(simplify
> > > +  (bit_ior
> > > +   (bit_and:c (vec_cond:s @0 @4 @5) @2)
> > > +   (bit_and:c (vec_cond:s @1 @4 @5) @3))
> > > +(with {
> > > +  enum tree_code c1
> > > + = (TREE_CODE (@0) == SSA_NAME
> > > +? gimple_assign_rhs_code (S

Re: [PATCH] reassoc: Handle OFFSET_TYPE like POINTER_TYPE in optimize_range_tests_cmp_bitwise [PR107029]

2022-09-26 Thread Richard Biener via Gcc-patches
On Mon, 26 Sep 2022, Jakub Jelinek wrote:

> Hi!
> 
> As the testcase shows, OFFSET_TYPE needs the same treatment as
> POINTER_TYPE/REFERENCE_TYPE, otherwise we fail the same during the
> newly added verification.  OFFSET_TYPE is signed though, so unlike
> POINTER_TYPE/REFERENCE_TYPE it can also trigger with the
> x < 0 && y < 0 && z < 0 to (x | y | z) < 0
> optimization.
> 
> Fixed thusly, bootstrapped/regtested on x86_64-linux and i686-linux,
> ok for trunk?

OK.

> 2022-09-26  Jakub Jelinek  
> 
>   PR tree-optimization/107029
>   * tree-ssa-reassoc.cc (optimize_range_tests_cmp_bitwise): Treat
>   OFFSET_TYPE like POINTER_TYPE, except that OFFSET_TYPE may be
>   signed and so can trigger even the (b % 4) == 3 case.
> 
>   * g++.dg/torture/pr107029.C: New test.
> 
> --- gcc/tree-ssa-reassoc.cc.jj2022-09-17 08:18:16.935880254 +0200
> +++ gcc/tree-ssa-reassoc.cc   2022-09-25 14:46:21.746367580 +0200
> @@ -3608,13 +3608,13 @@ optimize_range_tests_cmp_bitwise (enum t
>   tree type2 = NULL_TREE;
>   bool strict_overflow_p = false;
>   candidates.truncate (0);
> - if (POINTER_TYPE_P (type1))
> + if (POINTER_TYPE_P (type1) || TREE_CODE (type1) == OFFSET_TYPE)
> type1 = pointer_sized_int_node;
>   for (j = i; j; j = chains[j - 1])
> {
>   tree type = TREE_TYPE (ranges[j - 1].exp);
>   strict_overflow_p |= ranges[j - 1].strict_overflow_p;
> - if (POINTER_TYPE_P (type))
> + if (POINTER_TYPE_P (type) || TREE_CODE (type) == OFFSET_TYPE)
> type = pointer_sized_int_node;
>   if ((b % 4) == 3)
> {
> @@ -3646,7 +3646,7 @@ optimize_range_tests_cmp_bitwise (enum t
>   tree type = TREE_TYPE (ranges[j - 1].exp);
>   if (j == k)
> continue;
> - if (POINTER_TYPE_P (type))
> + if (POINTER_TYPE_P (type) || TREE_CODE (type) == OFFSET_TYPE)
> type = pointer_sized_int_node;
>   if ((b % 4) == 3)
> {
> @@ -3677,10 +3677,20 @@ optimize_range_tests_cmp_bitwise (enum t
>   op = r->exp;
>   continue;
> }
> - if (id == l || POINTER_TYPE_P (TREE_TYPE (op)))
> + if (id == l
> + || POINTER_TYPE_P (TREE_TYPE (op))
> + || TREE_CODE (TREE_TYPE (op)) == OFFSET_TYPE)
> {
>   code = (b % 4) == 3 ? BIT_NOT_EXPR : NOP_EXPR;
>   tree type3 = id >= l ? type1 : pointer_sized_int_node;
> + if (code == BIT_NOT_EXPR
> + && TREE_CODE (TREE_TYPE (op)) == OFFSET_TYPE)
> +   {
> + g = gimple_build_assign (make_ssa_name (type3),
> +  NOP_EXPR, op);
> + gimple_seq_add_stmt_without_update (&seq, g);
> + op = gimple_assign_lhs (g);
> +   }
>   g = gimple_build_assign (make_ssa_name (type3), code, op);
>   gimple_seq_add_stmt_without_update (&seq, g);
>   op = gimple_assign_lhs (g);
> @@ -3688,6 +3698,7 @@ optimize_range_tests_cmp_bitwise (enum t
>   tree type = TREE_TYPE (r->exp);
>   tree exp = r->exp;
>   if (POINTER_TYPE_P (type)
> + || TREE_CODE (type) == OFFSET_TYPE
>   || (id >= l && !useless_type_conversion_p (type1, type)))
> {
>   tree type3 = id >= l ? type1 : pointer_sized_int_node;
> @@ -3705,7 +3716,7 @@ optimize_range_tests_cmp_bitwise (enum t
>   op = gimple_assign_lhs (g);
> }
>   type1 = TREE_TYPE (ranges[k - 1].exp);
> - if (POINTER_TYPE_P (type1))
> + if (POINTER_TYPE_P (type1) || TREE_CODE (type1) == OFFSET_TYPE)
> {
>   gimple *g
> = gimple_build_assign (make_ssa_name (type1), NOP_EXPR, op);
> --- gcc/testsuite/g++.dg/torture/pr107029.C.jj2022-09-25 
> 14:49:18.427954682 +0200
> +++ gcc/testsuite/g++.dg/torture/pr107029.C   2022-09-25 14:49:00.654197418 
> +0200
> @@ -0,0 +1,19 @@
> +// PR tree-optimization/107029
> +// { dg-do compile }
> +
> +struct S { long long a; int b; };
> +long long S::*a;
> +int S::*b;
> +struct A { void foo (bool, bool); void bar (); int c; };
> +
> +void
> +A::foo (bool a, bool b)
> +{
> +  c = a || b;
> +}
> +
> +void
> +A::bar()
> +{
> +  foo (a, b);
> +}
> 
>   Jakub
> 
> 

-- 
Richard Biener 
SUSE Software Solutions Germany GmbH, Frankenstrasse 146, 90461 Nuernberg,
Germany; GF: Ivo Totev, Andrew Myers, Andrew McDonald, Boudien Moerman;
HRB 36809 (AG Nuernberg)


Re: [PATCH 1/2]middle-end: RFC: On expansion of conditional branches, give hint if argument is a truth type to backend

2022-09-26 Thread Tamar Christina via Gcc-patches
> > This pattern occurs more than 120,000 times in SPECCPU 2017 and so is quite 
> > common.

> How does this help a target?

So the idea is to communicate that only the bottom bit of the value is relevant 
and not the entire value itself.

> Why does RTL nonzerop bits not recover this information and the desired 
> optimization done later during combine for example?

I'm not sure it works here. We (AArch64) don't have QImode integer registers, 
so our apcs says that the top bits of the 32-bit registers it's passed in are 
undefined.

We have to zero extend the value first if we really want it as an 8-bit value. 
So the problem is that if you e.g. pass a boolean as an argument of a function, I 
don't think nonzero bits will return anything useful.

> Why's a SImode compare not OK if there's no QImode compare?

The mode then becomes irrelevant because we're telling the backend that only a 
single bit matters. And we have instructions to test and branch on the value of 
a single bit. See 
https://gcc.gnu.org/pipermail/gcc-patches/2022-September/602090.html for the 
testcases
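
To make the single-bit argument concrete, here is a small sketch (not taken from the patch or its testcases). It models a 32-bit register whose low byte holds a bool and whose upper bits are arbitrary, as described above, and shows that testing bit 0 agrees with zero-extending and comparing. The helper names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Model of a 32-bit register carrying a bool in its low byte; the upper
// 24 bits stand in for the "undefined" part and may hold anything.
inline std::uint32_t make_reg(bool value, std::uint32_t junk) {
    return (junk << 8) | (value ? 1u : 0u);
}

// Zero-extend the low byte first, then compare against zero.
inline bool test_by_extend(std::uint32_t reg) {
    return static_cast<std::uint8_t>(reg) != 0;
}

// Test only bit 0, which suffices when the low byte is known to be 0 or 1.
inline bool test_by_bit0(std::uint32_t reg) {
    return (reg & 1u) != 0;
}

// Both tests must agree regardless of what the upper bits contain.
inline void check_bit_test() {
    const std::uint32_t junk_values[] = {0u, 1u, 0xABCDEFu, 0xFFFFFFu};
    for (std::uint32_t junk : junk_values)
        for (int v = 0; v <= 1; ++v) {
            std::uint32_t reg = make_reg(v != 0, junk);
            assert(test_by_extend(reg) == (v != 0));
            assert(test_by_bit0(reg) == (v != 0));
        }
}
```

The second form needs no extension step, which is the property a test-bit-and-branch instruction can exploit.
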

> We have undocumented addcc, negcc, etc. patterns, should we have a andcc 
> pattern for this indicating support for andcc + jump as opposed to cmpcc + 
> jump?

This could work yeah. I didn't know these existed.

> So - what's the target and what's a testcase?

See https://gcc.gnu.org/pipermail/gcc-patches/2022-September/602090.html :)

Thanks,
Tamar


From: Richard Biener 
Sent: Monday, September 26, 2022 11:35 AM
To: Tamar Christina 
Cc: gcc-patches@gcc.gnu.org ; nd ; 
jeffreya...@gmail.com ; Richard Sandiford 

Subject: Re: [PATCH 1/2]middle-end: RFC: On expansion of conditional branches, 
give hint if argument is a truth type to backend

On Fri, 23 Sep 2022, Tamar Christina wrote:

> Hi All,
>
> This is an RFC to figure out how to deal with targets that don't have native
> comparisons against QImode values.
>
> Booleans, at least in C99 and higher are 0-1 valued.  This means that we only
> really need to test a single bit.  However in RTL we no longer have this
> information available and just have an SImode value (due to the promotion of
> QImode to SImode).
>
> This RFC fixes it by emitting an explicit & 1 during the expansion of the
> conditional branch.
>
> However it's unlikely that we want to do this unconditionally.  Most targets
> I've tested seem to have harmless code changes, like x86 changes from testb to
> andl $1.
>
> So I have two questions:
>
> 1. Should I limit this behind a target macro? Or should I just leave it for 
> all
>targets and deal with the fallout.
> 2. How can I tell whether the C99 0-1 valued bools are being used or the older
>0, non-0 variant?
>
> Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.
> However there are some benign codegen changes on x86, testb changed to andl 
> $1.
>
> This pattern occurs more than 120,000 times in SPECCPU 2017 and so is quite 
> common.

How does this help a target?  Why does RTL nonzerop bits not recover this
information and the desired optimization done later during combine
for example?  Why's a SImode compare not OK if there's no QImode compare?
We have undocumented addcc, negcc, etc. patterns, should we have a
andcc pattern for this indicating support for andcc + jump as opposed
to cmpcc + jump?

So - what's the target and what's a testcase?

Thanks,
Richard.

> Thanks,
> Tamar
>
> gcc/ChangeLog:
>
>* tree.h (tree_zero_one_valued_p): New.
>* dojump.cc (do_jump): Add & 1 if truth type.
>
> --- inline copy of patch --
> diff --git a/gcc/dojump.cc b/gcc/dojump.cc
> index 
> 2af0cd1aca3b6af13d5d8799094ee93f18022296..8eaf1be49cd12298e61c6946ae79ca9de6197864
>  100644
> --- a/gcc/dojump.cc
> +++ b/gcc/dojump.cc
> @@ -605,7 +605,17 @@ do_jump (tree exp, rtx_code_label *if_false_label,
>/* Fall through and generate the normal code.  */
>  default:
>  normal:
> -  temp = expand_normal (exp);
> +  tree cmp = exp;
> +  /* If the expression is a truth type then explicitly generate an & 1
> +  to indicate to the target that it's a zero-one values type.  This
> +  allows the target to further optimize the comparison should it
> +  choose to.  */
> +  if (tree_zero_one_valued_p (exp))
> + {
> +   type = TREE_TYPE (exp);
> +   cmp = build2 (BIT_AND_EXPR, type, exp, build_int_cstu (type, 1));
> + }
> +  temp = expand_normal (cmp);
>do_pending_stack_adjust ();
>/* The RTL optimizers prefer comparisons against pseudos.  */
>if (GET_CODE (temp) == SUBREG)
> diff --git a/gcc/tree.h b/gcc/tree.h
> index 
> 8f8a9660c9e0605eb516de194640b8c1b531b798..be3d2dee82f692e81082cf21c878c10f9fe9e1f1
>  100644
> --- a/gcc/tree.h
> +++ b/gcc/tree.h
> @@ -4690,6 +4690,7 @@ extern tree signed_or_unsigned_type_for (int, tree);
>  extern tree signed_type_for (tree);
>  extern tree unsigned_type_for (tree);
>  extern bool is_truth_type_for (tree, tree);
> +e

Re: [PATCH]middle-end fix floating out of constants in conditionals

2022-09-26 Thread Eric Botcazou via Gcc-patches
> I think the better fix would be to only consider TREE_CONSTANT (arg)
> if it wasn't constant initially.  Because clearly "simplify" intends
> to be "constant" here.  In fact I wonder why we test !TREE_CONSTANT (arg)
> at all, we don't simplify 'arg' ...
> 
> Eric added this test (previously we'd just always done the folding),
> but I think not enough?

Before my change we had always done the folding *only* for TREE_CONSTANT (arg)
and my change allowed it for some cases of !TREE_CONSTANT (arg), but I did not
want to touch the !TREE_CONSTANT (arg) case at all:

* fold-const.c (fold_binary_op_with_conditional_arg): Do not restrict
the folding to constants.  Remove redundant final conversion.

Tamar's issue appears to be for TREE_CONSTANT (arg) so orthogonal to mine.

-- 
Eric Botcazou




RE: [PATCH]middle-end Add optimized float addsub without needing VEC_PERM_EXPR.

2022-09-26 Thread Richard Biener via Gcc-patches
On Fri, 23 Sep 2022, Tamar Christina wrote:

> > -Original Message-
> > From: Gcc-patches  > bounces+tamar.christina=arm@gcc.gnu.org> On Behalf Of Tamar
> > Christina via Gcc-patches
> > Sent: Friday, September 23, 2022 9:14 AM
> > To: Richard Biener 
> > Cc: Richard Sandiford ; nd ;
> > Tamar Christina via Gcc-patches ;
> > juzhe.zh...@rivai.ai
> > Subject: RE: [PATCH]middle-end Add optimized float addsub without
> > needing VEC_PERM_EXPR.
> > 
> > > -Original Message-
> > > From: Richard Biener 
> > > Sent: Friday, September 23, 2022 8:54 AM
> > > To: Tamar Christina 
> > > Cc: Richard Sandiford ; Tamar Christina via
> > > Gcc-patches ; nd ;
> > > juzhe.zh...@rivai.ai
> > > Subject: RE: [PATCH]middle-end Add optimized float addsub without
> > > needing VEC_PERM_EXPR.
> > >
> > > On Fri, 23 Sep 2022, Tamar Christina wrote:
> > >
> > > > Hi,
> > > >
> > > > Attached is the respun version of the patch,
> > > >
> > > > > >>
> > > > > >> Wouldn't a target need to re-check if lanes are NaN or denormal
> > > > > >> if after a SFmode lane operation a DFmode lane operation follows?
> > > > > >> IIRC that is what usually makes punning "integer" vectors as FP
> > > > > >> vectors
> > > costly.
> > > >
> > > > I don't believe this is a problem, due to NANs not being a single
> > > > value and according to the standard the sign bit doesn't change the
> > > meaning of a NAN.
> > > >
> > > > That's why specifically for negates generally no check is performed
> > > > and it's Assumed that if a value is a NaN going in, it's a NaN
> > > > coming out, and this Optimization doesn't change that.  Also under
> > > > fast-math we don't guarantee a stable representation for NaN (or zeros,
> > etc) afaik.
> > > >
> > > > So if that is still a concern I could add && !HONORS_NAN () to the
> > > constraints.
> > > >
> > > > Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.
> > > >
> > > > Ok for master?
> > > >
> > > > Thanks,
> > > > Tamar
> > > >
> > > > gcc/ChangeLog:
> > > >
> > > > * match.pd: Add fneg/fadd rule.
> > > >
> > > > gcc/testsuite/ChangeLog:
> > > >
> > > > * gcc.target/aarch64/simd/addsub_1.c: New test.
> > > > * gcc.target/aarch64/sve/addsub_1.c: New test.
> > > >
> > > > --- inline version of patch ---
> > > >
> > > > diff --git a/gcc/match.pd b/gcc/match.pd index
> > > >
> > >
> > 1bb936fc4010f98f24bb97671350e8432c55b347..2617d56091dfbd41ae49f980e
> > > e0a
> > > > f3757f5ec1cf 100644
> > > > --- a/gcc/match.pd
> > > > +++ b/gcc/match.pd
> > > > @@ -7916,6 +7916,59 @@ and,
> > > >(simplify (reduc (op @0 VECTOR_CST@1))
> > > >  (op (reduc:type @0) (reduc:type @1
> > > >
> > > > +/* Simplify vector floating point operations of alternating sub/add 
> > > > pairs
> > > > +   into using an fneg of a wider element type followed by a normal add.
> > > > +   under IEEE 754 the fneg of the wider type will negate every even 
> > > > entry
> > > > +   and when doing an add we get a sub of the even and add of every odd
> > > > +   elements.  */
> > > > +(simplify
> > > > + (vec_perm (plus:c @0 @1) (minus @0 @1) VECTOR_CST@2)  (if
> > > > +(!VECTOR_INTEGER_TYPE_P (type) && !BYTES_BIG_ENDIAN)
> > >
> > > shouldn't this be FLOAT_WORDS_BIG_ENDIAN instead?
> > >
> > > I'm still concerned what
> > >
> > >  (neg:V2DF (subreg:V2DF (reg:V4SF) 0))
> > >
> > > means for architectures like RISC-V.  Can one "reformat" FP values in
> > > vector registers so that two floats overlap a double (and then back)?
> > >
> > > I suppose you rely on target_can_change_mode_class to tell you that.
> > 
> > Indeed, the documentation says:
> > 
> > "This hook returns true if it is possible to bitcast values held in 
> > registers of
> > class rclass from mode from to mode to and if doing so preserves the low-
> > order bits that are common to both modes. The result is only meaningful if
> > rclass has registers that can hold both from and to."
> > 
> > This implies to me that if the bitcast shouldn't be possible the hook should
> > reject it.
> > Of course you always have cases where something is possible, but perhaps not cheap to
> > do.
> > 
> > The specific implementation for RISC-V seem to imply to me that they
> > disallow any FP conversions. So seems to be ok.

Ok, I see.
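
As a sketch of the bit-level effect being discussed (an illustration only, assuming 32-bit IEEE 754 `float` and modelling the wider-lane fneg as a flip of bit 63 of a 64-bit lane), the following shows which of two packed floats changes sign. The helper names are hypothetical, and the result depends on byte order, which is why the check detects it at run time.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

static_assert(sizeof(float) == 4, "sketch assumes 32-bit float");

// Flip the sign bit of a 64-bit lane that holds two packed floats.
// This models an fneg applied to the wider element type.
inline void negate_as_wide_lane(float (&pair)[2]) {
    std::uint64_t bits;
    std::memcpy(&bits, pair, sizeof bits);
    bits ^= std::uint64_t{1} << 63;  // sign bit of the 64-bit lane
    std::memcpy(pair, &bits, sizeof bits);
}

// Run-time byte-order probe, to keep the sketch independent of C++20.
inline bool is_little_endian() {
    std::uint32_t one = 1;
    unsigned char first;
    std::memcpy(&first, &one, 1);
    return first == 1;
}

// Only one of the two floats changes sign; which one depends on byte order.
inline void check_wide_negate() {
    float pair[2] = {1.5f, 2.5f};
    negate_as_wide_lane(pair);
    if (is_little_endian()) {
        assert(pair[0] == 1.5f && pair[1] == -2.5f);
    } else {
        assert(pair[0] == -1.5f && pair[1] == 2.5f);
    }
}
```

Adding the result to the other operand then yields an add in the untouched lane and a subtract in the negated one, which is the alternating pattern the rule targets.
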

> > >
> > >
> > > > +  (with
> > > > +   {
> > > > + /* Build a vector of integers from the tree mask.  */
> > > > + vec_perm_builder builder;
> > > > + if (!tree_to_vec_perm_builder (&builder, @2))
> > > > +   return NULL_TREE;
> > > > +
> > > > + /* Create a vec_perm_indices for the integer vector.  */
> > > > + poly_uint64 nelts = TYPE_VECTOR_SUBPARTS (type);
> > > > + vec_perm_indices sel (builder, 2, nelts);
> > > > +   }
> > > > +   (if (sel.series_p (0, 2, 0, 2))
> > > > +(with
> > > > + {
> > > > +   machine_mode vec_mode = TYPE_MODE (type);
> > > > +   auto elem_mode = GET_MODE_INNER (vec_mode);
> > > > +   auto nunits = exact_div (GET_MODE_NUNITS (vec_mode), 2)

Re: [PATCH]middle-end fix floating out of constants in conditionals

2022-09-26 Thread Eric Botcazou via Gcc-patches
> Before my change we had always done the folding *only* for TREE_CONSTANT
> (arg) and my change allowed it for some cases of !TREE_CONSTANT (arg), but
> I did not want to touch the !TREE_CONSTANT (arg) case at all:

...to touch the TREE_CONSTANT (arg) case at all...

-- 
Eric Botcazou





Re: [Patch] OpenACC: Fix reduction tree-sharing issue [PR106982]

2022-09-26 Thread Thomas Schwinge
Hi!

On 2022-09-26T11:34:48+0200, Richard Biener via Gcc-patches 
 wrote:
> On Mon, Sep 26, 2022 at 11:27 AM Tobias Burnus  
> wrote:
>> On 26.09.22 10:32, Richard Biener wrote:
>>> On Fri, Sep 23, 2022 at 5:25 PM Tobias Burnus  
>>> wrote:
>>
>>> This fixes a tree-sharing ICE.

Thanks for looking into this!

>> looks like v1/v2/v3 are now unshared twice

> I prefer v2a - the unshare_exprs at the sinks where sharing isn't OK.
>
> That variant is OK,

ACK from me, too.


Regards
 Thomas
-
Siemens Electronic Design Automation GmbH; Address: Arnulfstraße 201, 80634 
München; limited liability company; Managing Directors: Thomas 
Heurung, Frank Thürauf; Registered office: München; Registration court: 
München, HRB 106955


Re: [PATCH 1/2]middle-end: RFC: On expansion of conditional branches, give hint if argument is a truth type to backend

2022-09-26 Thread Richard Biener via Gcc-patches
On Mon, 26 Sep 2022, Tamar Christina wrote:

> > > This pattern occurs more than 120,000 times in SPECCPU 2017 and so is 
> > > quite common.
> 
> > How does this help a target?
> 
> So the idea is to communicate that only the bottom bit of the value is 
> relevant and not the entire value itself.
> 
> > Why does RTL nonzerop bits not recover this information and the desired 
> > optimization done later during combine for example?
> 
> I'm not sure it works here. We (AArch64) don't have QImode integer registers, 
> so our apcs says that the top bits of the 32-bit registers it's passed in are 
> undefined.
> 
> We have to zero extend the value first if we really want it as an 8-bit 
> value. So the problem is that if you e.g. pass a boolean as an argument of a 
> function, I don't think nonzero bits will return anything useful.
> 
> > Why's a SImode compare not OK if there's no QImode compare?
> 
> The mode then becomes irrelevant because we're telling the backend that only 
> a single bit matters. And we have instructions to test and branch on the 
> value of a single bit. See 
> https://gcc.gnu.org/pipermail/gcc-patches/2022-September/602090.html for the 
> testcases

Maybe the target could use (subreg:SI (reg:BI ...)) as argument.  Heh.

> > We have undocumented addcc, negcc, etc. patterns, should we have a andcc 
> > pattern for this indicating support for andcc + jump as opposed to cmpcc + 
> > jump?
> 
> This could work yeah. I didn't know these existed.

Ah, so they are conditional add, not add setting CC, so andcc wouldn't
be appropriate.

So I'm not sure how we'd handle such situation - maybe looking at
REG_DECL and recognizing a _Bool PARM_DECL is OK?

Richard.

> > So - what's the target and what's a testcase?
> 
> See https://gcc.gnu.org/pipermail/gcc-patches/2022-September/602090.html :)
> 
> Thanks,
> Tamar
> 
> 
> From: Richard Biener 
> Sent: Monday, September 26, 2022 11:35 AM
> To: Tamar Christina 
> Cc: gcc-patches@gcc.gnu.org ; nd ; 
> jeffreya...@gmail.com ; Richard Sandiford 
> 
> Subject: Re: [PATCH 1/2]middle-end: RFC: On expansion of conditional 
> branches, give hint if argument is a truth type to backend
> 
> On Fri, 23 Sep 2022, Tamar Christina wrote:
> 
> > Hi All,
> >
> > This is an RFC to figure out how to deal with targets that don't have native
> > comparisons against QImode values.
> >
> > Booleans, at least in C99 and higher are 0-1 valued.  This means that we 
> > only
> > really need to test a single bit.  However in RTL we no longer have this
> > information available and just have an SImode value (due to the promotion of
> > QImode to SImode).
> >
> > This RFC fixes it by emitting an explicit & 1 during the expansion of the
> > conditional branch.
> >
> > However it's unlikely that we want to do this unconditionally.  Most targets
> > I've tested seem to have harmless code changes, like x86 changes from testb 
> > to
> > andl, $1.
> >
> > So I have two questions:
> >
> > 1. Should I limit this behind a target macro? Or should I just leave it for 
> > all
> >targets and deal with the fallout.
> > 2. How can I tell whether the C99 0-1 values bools are being used or the 
> > older
> >0, non-0 variant?
> >
> > Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.
> > However there are some benign codegen changes on x86, testb changed to andl 
> > $1.
> >
> > This pattern occurs more than 120,000 times in SPECCPU 2017 and so is quite 
> > common.
> 
> How does this help a target?  Why does RTL nonzerop bits not recover this
> information and the desired optimization done later during combine
> for example?  Why's a SImode compare not OK if there's no QImode compare?
> We have undocumented addcc, negcc, etc. patterns, should we have a
> andcc pattern for this indicating support for andcc + jump as opposed
> to cmpcc + jump?
> 
> So - what's the target and what's a testcase?
> 
> Thanks,
> Richard.
> 
> > Thanks,
> > Tamar
> >
> > gcc/ChangeLog:
> >
> >* tree.h (tree_zero_one_valued_p): New.
> >* dojump.cc (do_jump): Add & 1 if truth type.
> >
> > --- inline copy of patch --
> > diff --git a/gcc/dojump.cc b/gcc/dojump.cc
> > index 
> > 2af0cd1aca3b6af13d5d8799094ee93f18022296..8eaf1be49cd12298e61c6946ae79ca9de6197864
> >  100644
> > --- a/gcc/dojump.cc
> > +++ b/gcc/dojump.cc
> > @@ -605,7 +605,17 @@ do_jump (tree exp, rtx_code_label *if_false_label,
> >/* Fall through and generate the normal code.  */
> >  default:
> >  normal:
> > -  temp = expand_normal (exp);
> > +  tree cmp = exp;
> > +  /* If the expression is a truth type then explicitly generate an & 1
> > +  to indicate to the target that it's a zero-one values type.  This
> > +  allows the target to further optimize the comparison should it
> > +  choose to.  */
> > +  if (tree_zero_one_valued_p (exp))
> > + {
> > +   type = TREE_TYPE (exp);
> > +   cmp = build2 (BIT_AND_EXPR, type, exp, build_int_cstu (ty

Re: [PATCH] Avoid depending on destructor order

2022-09-26 Thread Claudiu Zissulescu Ianculescu via Gcc-patches
Hi Thomas,

This change prohibits compiling of ARC backend:

> +  gcc_assert (in_shutdown || ob);

in_shutdown is only defined when ATOMIC_FDE_FAST_PATH is defined,
while gcc_assert is outside of any ifdef. Please can you revisit this
line and change it accordingly.

Thanks,
Claudiu


Re: [PATCH 1/2]middle-end: RFC: On expansion of conditional branches, give hint if argument is a truth type to backend

2022-09-26 Thread Tamar Christina via Gcc-patches
> Maybe the target could use (subreg:SI (reg:BI ...)) as argument. Heh.

But then I'd still need to change the expansion code. I suppose this could 
prevent the issue with changes to code on other targets.

> > > We have undocumented addcc, negcc, etc. patterns, should we have an andcc 
> > > pattern for this indicating support for andcc + jump as opposed to cmpcc + 
> > > jump?
> >
> > This could work yeah. I didn't know these existed.

> Ah, so they are conditional add, not add setting CC, so andcc wouldn't
> be appropriate.

> So I'm not sure how we'd handle such situation - maybe looking at
> REG_DECL and recognizing a _Bool PARM_DECL is OK?

I have a slight suspicion that Richard Sandiford would likely reject this 
though.. The additional AND seemed less hacky as it's just communicating range.

I still need to also figure out which representation of bool is being used, 
because only the 0-1 variant works. Is there a way to check that?

Thanks,
Tamar.



Re: [PATCH] Avoid depending on destructor order

2022-09-26 Thread Thomas Neumann via Gcc-patches

Hi Claudiu,

> This change prohibits compiling of ARC backend:
>
>> +  gcc_assert (in_shutdown || ob);
>
> in_shutdown is only defined when ATOMIC_FDE_FAST_PATH is defined,
> while gcc_assert is outside of any ifdef. Please can you revisit this
> line and change it accordingly.

I have a patch ready, I am waiting for someone to approve my patch:

https://gcc.gnu.org/pipermail/gcc-patches/2022-September/602130.html

Best

Thomas


Re: [PATCH] Avoid depending on destructor order

2022-09-26 Thread Claudiu Zissulescu Ianculescu via Gcc-patches
Thanks, I haven't observed it.

Waiting for it,
Claudiu

On Mon, Sep 26, 2022 at 2:49 PM Thomas Neumann  wrote:
>
> Hi Claudiu,
>
> > This change prohibits compiling of ARC backend:
> >
> >> +  gcc_assert (in_shutdown || ob);
> >
> > in_shutdown is only defined when ATOMIC_FDE_FAST_PATH is defined,
> > while gcc_assert is outside of any ifdef. Please can you revisit this
> > line and change it accordingly.
>
> I have a patch ready, I am waiting for someone to approve my patch:
>
> https://gcc.gnu.org/pipermail/gcc-patches/2022-September/602130.html
>
> Best
>
> Thomas


[PATCH] [PR107009] Set ranges from unreachable edges for all known ranges.

2022-09-26 Thread Aldy Hernandez via Gcc-patches
In the conversion of DOM+evrp to DOM+ranger, we missed that evrp was
exporting ranges for unreachable edges for all SSA names for which we
have ranges.  Instead, we have only been exporting ranges for the
SSA name in the final conditional to the BB involving the unreachable
edge.

This patch adjusts DOM to iterate over the exports, similarly
to what evrp was doing.

Note that I also noticed that we don't calculate the nonzero bit mask
for op1, when 0 = op1 & MASK.  This isn't needed for this PR,
since maybe_set_nonzero_bits() is chasing the definition and
parsing the bitwise and on its own.  However, I'll be adding the
functionality for completeness sake, plus we could probably drop the
maybe_set_nonzero_bits legacy call entirely.
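
As a rough illustration of the nonzero-bits fact that the new test scans
for, here is a standalone C sketch. It is not GCC code, and both helper
names are made up for this example. The point is only that once
n % 8 == 0 is known, the low three bits of n must be zero, so the mask of
possibly-nonzero bits ends in ...fff8.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not part of GCC: mask of the bits that may be
   nonzero in a size_t value known to be a multiple of ALIGN, where
   ALIGN is a power of two.  */
size_t
nonzero_mask_for_multiple_of (size_t align)
{
  assert (align != 0 && (align & (align - 1)) == 0);
  return ~(align - 1);
}

/* Return 1 if N has no bits set outside MASK, i.e. N is consistent
   with the recorded nonzero-bits mask.  */
int
fits_nonzero_mask (size_t n, size_t mask)
{
  return (n & ~mask) == 0;
}
```

With align == 8 the low 16 bits of the mask are 0xfff8, which is the
pattern the dg-final line in pr107009.c looks for in the dump.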

Tested and benchmarked on x86-64 Linux.

I'm going to push this as soon as a final round of testing is done, as I
shuffled a few things (cleanups) at the last minute.

PR tree-optimization/107009

gcc/ChangeLog:

* tree-ssa-dom.cc
(dom_opt_dom_walker::set_global_ranges_from_unreachable_edges):
Iterate over exports.

gcc/testsuite/ChangeLog:

* gcc.dg/tree-ssa/pr107009.c: New test.
---
 gcc/testsuite/gcc.dg/tree-ssa/pr107009.c | 15 ++
 gcc/tree-ssa-dom.cc  | 35 
 2 files changed, 33 insertions(+), 17 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/pr107009.c

diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr107009.c 
b/gcc/testsuite/gcc.dg/tree-ssa/pr107009.c
new file mode 100644
index 000..5010aed1723
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr107009.c
@@ -0,0 +1,15 @@
+// { dg-do compile }
+// { dg-options "-O2 -fdump-tree-dom2-alias" }
+
+typedef __SIZE_TYPE__ size_t;
+
+void saxpy(size_t n)
+{
+  if (n == 0 || n % 8 != 0)
+__builtin_unreachable();
+
+  extern void foobar (size_t n);
+  foobar (n);
+}
+
+// { dg-final { scan-tree-dump "NONZERO.*fff8" "dom2" } }
diff --git a/gcc/tree-ssa-dom.cc b/gcc/tree-ssa-dom.cc
index 513e0c88254..84bef798f52 100644
--- a/gcc/tree-ssa-dom.cc
+++ b/gcc/tree-ssa-dom.cc
@@ -1227,29 +1227,30 @@ void
 dom_opt_dom_walker::set_global_ranges_from_unreachable_edges (basic_block bb)
 {
   edge pred_e = single_pred_edge_ignoring_loop_edges (bb, false);
-
   if (!pred_e)
 return;
 
   gimple *stmt = last_stmt (pred_e->src);
+  if (!stmt
+  || gimple_code (stmt) != GIMPLE_COND
+  || !assert_unreachable_fallthru_edge_p (pred_e))
+return;
+
   tree name;
-  if (stmt
-  && gimple_code (stmt) == GIMPLE_COND
-  && (name = gimple_cond_lhs (stmt))
-  && TREE_CODE (name) == SSA_NAME
-  && assert_unreachable_fallthru_edge_p (pred_e)
-  && all_uses_feed_or_dominated_by_stmt (name, stmt))
-{
-  Value_Range r (TREE_TYPE (name));
+  gori_compute &gori = m_ranger->gori ();
+  FOR_EACH_GORI_EXPORT_NAME (gori, pred_e->src, name)
+if (all_uses_feed_or_dominated_by_stmt (name, stmt))
+  {
+   Value_Range r (TREE_TYPE (name));
 
-  if (m_ranger->range_on_edge (r, pred_e, name)
- && !r.varying_p ()
- && !r.undefined_p ())
-   {
- set_range_info (name, r);
- maybe_set_nonzero_bits (pred_e, name);
-   }
-}
+   if (m_ranger->range_on_edge (r, pred_e, name)
+   && !r.varying_p ()
+   && !r.undefined_p ())
+ {
+   set_range_info (name, r);
+   maybe_set_nonzero_bits (pred_e, name);
+ }
+  }
 }
 
 /* Record any equivalences created by the incoming edge to BB into
-- 
2.37.1



Re: [PATCH 1/2]middle-end: RFC: On expansion of conditional branches, give hint if argument is a truth type to backend

2022-09-26 Thread Richard Biener via Gcc-patches
On Mon, 26 Sep 2022, Tamar Christina wrote:

> > Maybe the target could use (subreg:SI (reg:BI ...)) as argument. Heh.
> 
> But then I'd still need to change the expansion code. I suppose this could 
> prevent the issue with changes to code on other targets.
> 
> > > > We have undocumented addcc, negcc, etc. patterns, should we have an andcc 
> > > > pattern for this indicating support for andcc + jump as opposed to cmpcc 
> > > > + jump?
> > >
> > > This could work yeah. I didn't know these existed.
> 
> > Ah, so they are conditional add, not add setting CC, so andcc wouldn't
> > be appropriate.
> 
> > So I'm not sure how we'd handle such situation - maybe looking at
> > REG_DECL and recognizing a _Bool PARM_DECL is OK?
> 
> I have a slight suspicion that Richard Sandiford would likely reject this 
> though.. The additional AND seemed less hacky as it's just communicating 
> range.
> 
> I still need to also figure out which representation of bool is being used, 
> because only the 0-1 variant works. Is there a way to check that?

So another option would be, in case you have (subreg:SI (reg:QI)),
if we expand

 if (b != 0)

expand that to

 !((b & 255) == 0)

basically invert the comparison and then leverage the paradoxical subreg
to specify a narrower immediate to AND with?  Just hoping that arm
can do 255 as immediate and still efficiently handle this?

Wouldn't this transform be possible in combine with the appropriate
backend pattern and combine synthesizing the and for paradoxical subregs?
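
A small standalone C model of why the masked form matters when the upper
bits of the register are unspecified. This is only a sketch under that
assumption; the function names are invented for illustration and nothing
here is GCC or backend code.

```c
#include <assert.h>
#include <stdint.h>

/* Model a 32-bit register whose low 8 bits hold the bool and whose
   upper 24 bits hold whatever the caller left there.  */
uint32_t
reg_with_garbage (uint8_t value, uint32_t garbage)
{
  return (garbage << 8) | value;
}

/* Plain full-register test: b != 0.  */
int
test_full_register (uint32_t reg)
{
  return reg != 0;
}

/* Masked test: !((b & 255) == 0).  */
int
test_low_byte (uint32_t reg)
{
  return !((reg & 255) == 0);
}

/* Usage: a false bool with garbage above it is seen as false only by
   the masked test.  */
void
demo (void)
{
  uint32_t r = reg_with_garbage (0, 0xABCDEF);
  assert (test_low_byte (r) == 0);
  assert (test_full_register (r) == 1);
}
```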

Richard.


Re: [PATCH 1/2]middle-end: RFC: On expansion of conditional branches, give hint if argument is a truth type to backend

2022-09-26 Thread Richard Biener via Gcc-patches
On Mon, 26 Sep 2022, Richard Biener wrote:

> On Mon, 26 Sep 2022, Tamar Christina wrote:
> 
> > > Maybe the target could use (subreg:SI (reg:BI ...)) as argument. Heh.
> > 
> > But then I'd still need to change the expansion code. I suppose this could 
> > prevent the issue with changes to code on other targets.
> > 
> > > > > We have undocumented addcc, negcc, etc. patterns, should we have 
> > > > > an andcc pattern for this indicating support for andcc + jump as 
> > > > > opposed to cmpcc + jump?
> > > >
> > > > This could work yeah. I didn't know these existed.
> > 
> > > Ah, so they are conditional add, not add setting CC, so andcc wouldn't
> > > be appropriate.
> > 
> > > So I'm not sure how we'd handle such situation - maybe looking at
> > > REG_DECL and recognizing a _Bool PARM_DECL is OK?
> > 
> > I have a slight suspicion that Richard Sandiford would likely reject this 
> > though.. The additional AND seemed less hacky as it's just communicating 
> > range.
> > 
> > I still need to also figure out which representation of bool is being used, 
> > because only the 0-1 variant works. Is there a way to check that?
> 
> So another option would be, in case you have (subreg:SI (reg:QI)),
> if we expand
> 
>  if (b != 0)
> 
> expand that to
> 
>  !((b & 255) == 0)
> 
> basically invert the comparison and the leverage the paradoxical subreg
> to specify a narrower immediate to AND with?  Just hoping that arm
> can do 255 as immediate and still efficiently handle this?
> 
> Wouldn't this transform be possible in combine with the appropriate
> backend pattern and combine synthesizing the and for paradoxical subregs?

Looking at what we produce on aarch64 it seems 'bool' is using
an SImode register but your characterization that the upper 24 bits
have undefined content suggests that is a wrong representation?
If the ABI doesn't say anything about the upper bits we should
reflect that somehow?

Richard.


Re: [PATCH] Avoid depending on destructor order

2022-09-26 Thread Iain Sandoe via Gcc-patches



> On 26 Sep 2022, at 12:49, Thomas Neumann via Gcc-patches 
>  wrote:
> 
> Hi Claudiu,
> 
>> This change prohibits compiling of ARC backend:
>>> +  gcc_assert (in_shutdown || ob);
>> in_shutdown is only defined when ATOMIC_FDE_FAST_PATH is defined,
>> while gcc_assert is outside of any ifdef. Please can you revisit this
>> line and change it accordingly.
> 
> I have a patch ready, I am waiting for someone to approve my patch:
> 
> https://gcc.gnu.org/pipermail/gcc-patches/2022-September/602130.html

You might also want to include Rainer’s patch,

AFAIR patches to fix bootstrap are allowed to proceed as an exception to
the usual rules,

Iain



[committed] libstdc++: Add #if around non-C++03 code in std::bitset [PR107037]

2022-09-26 Thread Jonathan Wakely via Gcc-patches
Tested x86_64-linux, pushed to trunk.

-- >8 --

libstdc++-v3/ChangeLog:

PR libstdc++/107037
* include/std/bitset (_Base_bitset::_M_do_reset): Use
preprocessor conditional around non-C++03 code.
* testsuite/20_util/bitset/107037.cc: New test.
---
 libstdc++-v3/include/std/bitset | 5 +++--
 libstdc++-v3/testsuite/20_util/bitset/107037.cc | 7 +++
 2 files changed, 10 insertions(+), 2 deletions(-)
 create mode 100644 libstdc++-v3/testsuite/20_util/bitset/107037.cc

diff --git a/libstdc++-v3/include/std/bitset b/libstdc++-v3/include/std/bitset
index 6dbc58c6429..1a551cf9785 100644
--- a/libstdc++-v3/include/std/bitset
+++ b/libstdc++-v3/include/std/bitset
@@ -182,13 +182,14 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
   _GLIBCXX14_CONSTEXPR void
   _M_do_reset() _GLIBCXX_NOEXCEPT
   {
+#if __cplusplus >= 201402L
if (__builtin_is_constant_evaluated())
  {
for (_WordT& __w : _M_w)
  __w = 0;
return;
  }
-
+#endif
__builtin_memset(_M_w, 0, _Nw * sizeof(_WordT));
   }
 
@@ -1680,7 +1681,7 @@ _GLIBCXX_END_NAMESPACE_VERSION
 
 #endif // C++11
 
-#ifdef _GLIBCXX_DEBUG && _GLIBCXX_HOSTED
+#if defined _GLIBCXX_DEBUG && _GLIBCXX_HOSTED
 # include <debug/bitset>
 #endif
 
diff --git a/libstdc++-v3/testsuite/20_util/bitset/107037.cc 
b/libstdc++-v3/testsuite/20_util/bitset/107037.cc
new file mode 100644
index 000..b4560dd3775
--- /dev/null
+++ b/libstdc++-v3/testsuite/20_util/bitset/107037.cc
@@ -0,0 +1,7 @@
+// { dg-options "-std=c++03" }
+// { dg-do compile }
+// PR libstdc++/107037 bitset::_M_do_reset fails for strict -std=c++03 mode
+#include <bitset>
+template class std::bitset<0>;
+template class std::bitset<1>;
+template class std::bitset<100>;
-- 
2.37.3



Re: [PATCH] Avoid depending on destructor order

2022-09-26 Thread Thomas Neumann via Gcc-patches

Hi Iain,

> You might also want to include Rainer’s patch,
>
> AFAIR patches to fix bootstrap are allowed to proceed as an exception to
> the usual rules,

I was not aware of that. I have pushed the patch below now (including 
Rainer's change), I will update the code if requested.


Best

Thomas

fix assert in __deregister_frame_info_bases

When using the atomic fast path deregistering can fail during
program shutdown if the lookup structures are already destroyed.
The assert in __deregister_frame_info_bases takes that into
account. The non-fast-path case, however, is not aware of
program shutdown, which caused a compiler error on such platforms.
We fix that by introducing a constant for in_shutdown in
non-fast-path builds.
We also drop the destructor priority, as it is not supported on
all platforms and we no longer rely upon the priority anyway.

libgcc/ChangeLog:
* unwind-dw2-fde.c: Introduce a constant for in_shutdown
for the non-fast-path case. Drop destructor priority.

diff --git a/libgcc/unwind-dw2-fde.c b/libgcc/unwind-dw2-fde.c
index d237179f4ea..3c0cc654ec0 100644
--- a/libgcc/unwind-dw2-fde.c
+++ b/libgcc/unwind-dw2-fde.c
@@ -51,7 +51,7 @@ static struct btree registered_frames;
 static bool in_shutdown;

 static void
-release_registered_frames (void) __attribute__ ((destructor (110)));
+release_registered_frames (void) __attribute__ ((destructor));
 static void
 release_registered_frames (void)
 {
@@ -67,6 +67,8 @@ static void
 init_object (struct object *ob);

 #else
+/* Without fast path frame deregistration must always succeed.  */
+static const int in_shutdown = 0;

 /* The unseen_objects list contains objects that have been registered
but not yet categorized in any way.  The seen_objects list has had
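
The shape of the fix can be sketched in isolation as follows. This is a
simplified stand-in, not the libgcc source: FAST_PATH plays the role of
ATOMIC_FDE_FAST_PATH, plain assert replaces gcc_assert, and
deregister_checked is a hypothetical function.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#ifdef FAST_PATH
/* Fast path: set by a destructor once the lookup structures are gone.  */
static bool in_shutdown;
#else
/* Without the fast path, deregistration must always succeed, so the
   flag is a compile-time constant and the assert below still builds.  */
static const int in_shutdown = 0;
#endif

/* Hypothetical stand-in for the tail of the deregistration routine:
   a missing object is tolerated only during shutdown.  */
void *
deregister_checked (void *ob)
{
  assert (in_shutdown || ob);
  return ob;
}
```

Because in_shutdown exists in both configurations, the assert needs no
#ifdef of its own.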


RE: [PATCH 1/2]middle-end: RFC: On expansion of conditional branches, give hint if argument is a truth type to backend

2022-09-26 Thread Tamar Christina via Gcc-patches
> -Original Message-
> From: Richard Biener 
> Sent: Monday, September 26, 2022 1:43 PM
> To: Tamar Christina 
> Cc: gcc-patches@gcc.gnu.org; nd ; jeffreya...@gmail.com;
> Richard Sandiford 
> Subject: Re: [PATCH 1/2]middle-end: RFC: On expansion of conditional
> branches, give hint if argument is a truth type to backend
> 
> On Mon, 26 Sep 2022, Richard Biener wrote:
> 
> > On Mon, 26 Sep 2022, Tamar Christina wrote:
> >
> > > > Maybe the target could use (subreg:SI (reg:BI ...)) as argument. Heh.
> > >
> > > But then I'd still need to change the expansion code. I suppose this could
> prevent the issue with changes to code on other targets.
> > >
> > > > > > We have undocumented addcc, negcc, etc. patterns, should we
> have an andcc pattern for this indicating support for andcc + jump as
> opposed to cmpcc + jump?
> > > > >
> > > > > This could work yeah. I didn't know these existed.
> > >
> > > > Ah, so they are conditional add, not add setting CC, so andcc
> > > > wouldn't be appropriate.
> > >
> > > > So I'm not sure how we'd handle such situation - maybe looking at
> > > > REG_DECL and recognizing a _Bool PARM_DECL is OK?
> > >
> > > I have a slight suspicion that Richard Sandiford would likely reject this
> though.. The additional AND seemed less hacky as it's just communicating
> range.
> > >
> > > I still need to also figure out which representation of bool is being 
> > > used,
> because only the 0-1 variant works. Is there a way to check that?
> >
> > So another option would be, in case you have (subreg:SI (reg:QI)), if
> > we expand
> >
> >  if (b != 0)
> >
> > expand that to
> >
> >  !((b & 255) == 0)
> >
> > basically invert the comparison and the leverage the paradoxical
> > subreg to specify a narrower immediate to AND with?  Just hoping that
> > arm can do 255 as immediate and still efficiently handle this?

We can and already do, and don't need that representation to do so.
The problem is, handling 255 is already inefficient. It requires us to use an 
additional instruction to test the value, whereas we have a fused 
test-single-bit-and-branch instruction.

> >
> > Wouldn't this transform be possible in combine with the appropriate
> > backend pattern and combine synthesizing the and for paradoxical
> subregs?

Not unless we have enough range information in RTL to know that whatever value 
has been fed into the cbranch has a range of 1 bit. A range of 8 bits we 
already have, and it isn't useful.

The idea was to transform what we currently have:

tst w0, 255
bne .L4
ret

i.e. test the bottom 8 bits, into

tbnz    w0, #0, .L4
ret

i.e. test only bit 0 and branch based on that bit. We cannot do this when all 
we know is that the range is 8 bits.
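
To make the range argument concrete, here is a standalone C sketch (the
function names are invented; the comments map them loosely onto the two
instruction sequences above). It checks that the bit-0 test agrees with
the 8-bit test only while the value is known to be 0 or 1.

```c
#include <assert.h>
#include <stdint.h>

/* Models "tst w0, 255; bne": branch if any of the low 8 bits is set.  */
int
branch_on_low_byte (uint32_t w0)
{
  return (w0 & 255) != 0;
}

/* Models "tbnz w0, #0": branch if bit 0 is set.  */
int
branch_on_bit0 (uint32_t w0)
{
  return (w0 & 1) != 0;
}

/* Return 1 if both tests agree for every value in [0, LIMIT].  */
int
tests_agree_up_to (unsigned limit)
{
  assert (limit <= 255);
  for (unsigned v = 0; v <= limit; v++)
    if (branch_on_low_byte (v) != branch_on_bit0 (v))
      return 0;
  return 1;
}
```

The two tests agree over the range [0, 1] but already differ at 2, which
is why an 8-bit range alone does not justify the tbnz form.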

> 
> Looking at what we produce on aarch64 it seems 'bool' is using an SImode
> register but your characterization that the upper 24 bits have undefined
> content suggests that is a wrong representation?
> If the ABI doesn't say anything about the upper bits we should reflect that
> somehow?

It does. And no, "bool" is using QImode. The expansion of

extern void h ();

void g1(bool x)
{
  if (__builtin_expect (x, 0))
h ();
}

Shows that the argument x is passed as a QI mode, but like many RISC targets 
(and even i386) we promote the argument during expansion:

(insn 2 4 3 2 (set (reg/v:SI 92 [ x ])
(zero_extend:SI (reg:QI 0 x0 [ x ]))) "/app/example.cpp":4:1 -1
 (nil))

But the value is passed as QImode.

We use this fact to know that the range is 8 bits in the cbranch instruction.  
If no operation was done that requires a bigger
range then combine will push the zero extend into the cbranch and we have 
various patterns to handle different forms of this.

For instance:

void g1(bool *x)
{
  if (__builtin_expect (*x, 0))
h ();
}

Because of the load of x we generate:

ldrb    w0, [x0]
cbnz    w0, .L7
ret

because we know the top bits are defined to 0 in this case and can just test 
the entire register.

The reason for this promotion for us and many other backends is one of 
efficiency. If we don't promote to something
we have native instructions for we would have to promote and demote the value 
at *every* instruction in RTL.

This causes significant noise in the RTL.  So we can't do anything different 
here.  I have plans to try to fix this, but not in GCC 13.

But even then it won't help with this case, because we explicitly need to know 
that the range is a single bit. Not 8 bits.

Regards,
Tamar

> 
> Richard.


Re: [PATCH] c++ modules: ICE with class NTTP argument [PR100616]

2022-09-26 Thread Nathan Sidwell via Gcc-patches

On 9/23/22 09:32, Patrick Palka wrote:

> Judging by the two commits that introduced/modified this part of
> maybe_register_incomplete_var, r196852 and r214333, ISTM the code
> is really only concerned with constexpr static data members (whose
> initializer may contain a pointer-to-member for a currently open class).
> So maybe we ought to restrict the branch like so, which effectively
> disables this part of maybe_register_incomplete_var during stream-in, and
> guarantees that outermost_open_class doesn't return NULL if the branch is
> taken?


I think the problem is that we're streaming these VAR_DECLs as regular 
VAR_DECLS, when we should be handling them as a new kind of object 
fished out from the template they're instantiating. (I'm guessing 
that'll just be a new tag, a type and an initializer?)


Then on stream-in we can handle them in the same way as a non-modules 
compilation handles such redeclarations.  I.e. how does:


template<auto> struct C { };
struct A { };
C<A{}> c1; // #1
C<A{}> c2; // #2

work.  Presumably at some point #2's A{} gets unified such that we find 
the instantiation that occurred at #1?


I notice the template arg for C is a var decl mangled as 
_ZTAXtl1AEE, which is a 'template parameter object for A{}'.  I see 
that's a special mangler 'mangle_template_parm_object', called from 
get_template_parm_object.  Perhaps these VAR_DECLs need an additional 
in-tree flag that the streamer can check for?


nathan
--
Nathan Sidwell



VN, len_store and endianness

2022-09-26 Thread Robin Dapp via Gcc-patches
Hi,

I'm locally testing a branch that enables vll/vstl for partial vector
usage i.e. len_load and len_store on s390.  I see a FAIL in
testsuite/gfortran.dg/power_3.f90.
Since r13-1777-gbd9837bc3ca134 we also perform VN for masked/len stores
and things go wrong there.  The problem seems to be that we evaluate a
vector constant {-1, 1, -1, 1} loaded with length 11 + 1(bias) = 12 as
{1, -1, 1} instead of {-1, 1, -1}.

I found it a bit difficult to navigate through the logic due to several
sizes, offsets, lengths and "amounts" :)  From what I can tell the
culprit code is (guarded by BYTES_BIG_ENDIAN)

   if (TREE_CODE (pd.rhs) != CONSTRUCTOR)
 {
 q = (this_buffer + len
  - (ROUND_UP (size - amnt, BITS_PER_UNIT)
 / BITS_PER_UNIT));
 }

where, with pd.rhs = { 255, 255, 255, 255, 0, 0, 0, 1, 255, 255, 255,
255, 0, 0, 0, 1 }, len = 16 bytes, size = 96 bits, we read after the
first 32 bits.  What is supposed to happen here?  It looks like going
backwards (when size grows), but actually size shrinks for my example
with each successive element via pd.offset 0, -32 and -64.
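
For reference, the pointer arithmetic in that snippet can be worked
through with the numbers above. The following is a standalone C sketch,
not the GCC source: big_endian_copy_offset is a hypothetical helper, the
ROUND_UP macro is a local stand-in, and amnt = 0 is an assumption, since
its value is not given here.

```c
#include <assert.h>
#include <stddef.h>

#define BITS_PER_UNIT 8
/* Local stand-in: round X up to a multiple of ALIGN.  */
#define ROUND_UP(x, align) (((x) + (align) - 1) / (align) * (align))

/* Byte offset of q from this_buffer for a buffer of LEN bytes, an
   access of SIZE bits and a shift of AMNT bits.  */
size_t
big_endian_copy_offset (size_t len, size_t size, size_t amnt)
{
  assert (amnt <= size);
  size_t bytes = ROUND_UP (size - amnt, BITS_PER_UNIT) / BITS_PER_UNIT;
  assert (bytes <= len);
  return len - bytes;
}
```

Under these assumptions, len = 16 and size = 96 give an offset of 4
bytes, i.e. the read starts 32 bits into the buffer, which would be
consistent with seeing {1, -1, 1} rather than {-1, 1, -1}.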

When skipping the block with && TREE_CODE (pd.rhs) != VECTOR_CST the
test and various others succeed but I didn't pursue testing further and
figured I'd rather ask here for more insight.

Regards
 Robin


[PATCH] c++ modules: variable template partial spec fixes [PR107033]

2022-09-26 Thread Patrick Palka via Gcc-patches
In r13-2775-g32d8123cd6ce87 I overlooked that we need to adjust the
call to add_mergeable_specialization in the MK_partial case to correctly
handle variable template partial specializations (it currently assumes
we're always dealing with one for a class template).  This fixes an ICE
when converting the testcase from that commit to use an importable
header instead of a named module.

Tested on x86_64-pc-linux-gnu, does this look OK for trunk?

PR c++/107033

gcc/cp/ChangeLog:

* module.cc (trees_in::decl_value): In the MK_partial case
for a variable template partial specialization, pass decl_p=true
to add_mergeable_specialization and set spec to the VAR_DECL
not the TEMPLATE_DECL.
* pt.cc (add_mergeable_specialization): For a variable template
partial specialization, set the TREE_TYPE of the new
DECL_TEMPLATE_SPECIALIZATIONS node to the TREE_TYPE of the
VAR_DECL not the VAR_DECL itself.

gcc/testsuite/ChangeLog:

* g++.dg/modules/partial-2.cc, g++.dg/modules/partial-2.h: New
files, factored out from ...
* g++.dg/modules/partial-2_a.C, g++.dg/modules/partial-2_b.C: ...
here.
* g++.dg/modules/partial-2_c.H: New test.
* g++.dg/modules/partial-2_d.C: New test.
---
 gcc/cp/module.cc   | 17 ++
 gcc/cp/pt.cc   |  2 +-
 gcc/testsuite/g++.dg/modules/partial-2.cc  | 17 ++
 gcc/testsuite/g++.dg/modules/partial-2.h   | 38 +
 gcc/testsuite/g++.dg/modules/partial-2_a.C | 39 +-
 gcc/testsuite/g++.dg/modules/partial-2_b.C | 18 +-
 gcc/testsuite/g++.dg/modules/partial-2_c.H |  5 +++
 gcc/testsuite/g++.dg/modules/partial-2_d.C |  8 +
 8 files changed, 82 insertions(+), 62 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/modules/partial-2.cc
 create mode 100644 gcc/testsuite/g++.dg/modules/partial-2.h
 create mode 100644 gcc/testsuite/g++.dg/modules/partial-2_c.H
 create mode 100644 gcc/testsuite/g++.dg/modules/partial-2_d.C

diff --git a/gcc/cp/module.cc b/gcc/cp/module.cc
index f23832cb56a..7496df5e843 100644
--- a/gcc/cp/module.cc
+++ b/gcc/cp/module.cc
@@ -8185,13 +8185,18 @@ trees_in::decl_value ()
/* Set the TEMPLATE_DECL's type.  */
TREE_TYPE (decl) = TREE_TYPE (inner);
 
-  if (mk & MK_template_mask
- || mk == MK_partial)
+  /* Add to specialization tables now that constraints etc are
+added.  */
+  if (mk == MK_partial)
{
- /* Add to specialization tables now that constraints etc are
-added.  */
- bool is_type = mk == MK_partial || !(mk & MK_tmpl_decl_mask);
-
+ bool is_type = TREE_CODE (inner) == TYPE_DECL;
+ spec.spec = is_type ? type : inner;
+ add_mergeable_specialization (!is_type, false,
+   &spec, decl, spec_flags);
+   }
+  else if (mk & MK_template_mask)
+   {
+ bool is_type = !(mk & MK_tmpl_decl_mask);
  spec.spec = is_type ? type : mk & MK_tmpl_tmpl_mask ? inner : decl;
  add_mergeable_specialization (!is_type,
!is_type && mk & MK_tmpl_alias_mask,
diff --git a/gcc/cp/pt.cc b/gcc/cp/pt.cc
index db4e808adec..1f088fe281e 100644
--- a/gcc/cp/pt.cc
+++ b/gcc/cp/pt.cc
@@ -31010,7 +31010,7 @@ add_mergeable_specialization (bool decl_p, bool 
alias_p, spec_entry *elt,
   /* A partial specialization.  */
   tree cons = tree_cons (elt->args, decl,
 DECL_TEMPLATE_SPECIALIZATIONS (elt->tmpl));
-  TREE_TYPE (cons) = elt->spec;
+  TREE_TYPE (cons) = decl_p ? TREE_TYPE (elt->spec) : elt->spec;
   DECL_TEMPLATE_SPECIALIZATIONS (elt->tmpl) = cons;
 }
 }
diff --git a/gcc/testsuite/g++.dg/modules/partial-2.cc 
b/gcc/testsuite/g++.dg/modules/partial-2.cc
new file mode 100644
index 000..1316bf5e1c5
--- /dev/null
+++ b/gcc/testsuite/g++.dg/modules/partial-2.cc
@@ -0,0 +1,17 @@
+static_assert(is_reference_v);
+static_assert(is_reference_v);
+static_assert(!is_reference_v);
+
+static_assert(A::is_reference_v);
+static_assert(A::is_reference_v);
+static_assert(!A::is_reference_v);
+
+#if __cpp_concepts
+static_assert(concepts::is_reference_v);
+static_assert(concepts::is_reference_v);
+static_assert(!concepts::is_reference_v);
+
+static_assert(concepts::A::is_reference_v);
+static_assert(concepts::A::is_reference_v);
+static_assert(!concepts::A::is_reference_v);
+#endif
diff --git a/gcc/testsuite/g++.dg/modules/partial-2.h 
b/gcc/testsuite/g++.dg/modules/partial-2.h
new file mode 100644
index 000..afcfce791b3
--- /dev/null
+++ b/gcc/testsuite/g++.dg/modules/partial-2.h
@@ -0,0 +1,38 @@
+template constexpr bool is_reference_v = false;
+template constexpr bool is_reference_v = true;
+template constexpr bool is_reference_v = true;
+
+struct A {
+  template static constexpr bool is_reference_v = false;
+};
+
+templat

Re: [PATCH] c++ modules: ICE with class NTTP argument [PR100616]

2022-09-26 Thread Nathan Sidwell via Gcc-patches

On 9/26/22 10:08, Nathan Sidwell wrote:

On 9/23/22 09:32, Patrick Palka wrote:


Judging by the two commits that introduced/modified this part of
maybe_register_incomplete_var, r196852 and r214333, ISTM the code
is really only concerned with constexpr static data members (whose
initializer may contain a pointer-to-member for a currently open class).
So maybe we ought to restrict the branch like so, which effectively
disables this part of maybe_register_incomplete_var during stream-in, and
guarantees that outermost_open_class doesn't return NULL if the branch is
taken?


I think the problem is that we're streaming these VAR_DECLs as regular 
VAR_DECLS, when we should be handling them as a new kind of object 
fished out from the template they're instantiating. (I'm guessing 
that'll just be a new tag, a type and an initializer?)


Then on stream-in we can handle them in the same way as a non-modules 
compilation handles such redeclarations.  I.e. how does:


template<auto> struct C { };
struct A { };
C<A{}> c1; // #1
C<A{}> c2; // #2

work?  Presumably at some point #2's A{} gets unified such that we find 
the instantiation that occurred at #1?


I notice the template arg for C is a var decl mangled as 
_ZTAXtl1AEE, which is a 'template parameter object for A{}'.  I see 
that's a special mangler 'mangle_template_parm_object', called from 
get_template_parm_object.  Perhaps these VAR_DECLs need an additional 
in-tree flag that the streamer can check for?


I wonder if we're setting the module attachment for these variables 
sanely? They should be attached to the global module.  My guess is the 
pushdecl_top_level_and_finish call in get_template_parm_object is not 
doing what is needed (as well as the other issues).



--
Nathan Sidwell



Re: [PATCH] c++ modules: variable template partial spec fixes [PR107033]

2022-09-26 Thread Nathan Sidwell via Gcc-patches

On 9/26/22 10:36, Patrick Palka wrote:

In r13-2775-g32d8123cd6ce87 I overlooked that we need to adjust the
call to add_mergeable_specialization in the MK_partial case to correctly
handle variable template partial specializations (it currently assumes
we're always dealing with one for a class template).  This fixes an ICE
when converting the testcase from that commit to use an importable
header instead of a named module.
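
For readers unfamiliar with the construct being fixed, here is a minimal sketch of a variable template with partial specializations. It is modelled on the is_reference_v name used in the new tests, but the template arguments shown are an assumption (the archived diff has lost them), and the sketch namespace is purely illustrative.

```cpp
#include <cassert>

namespace sketch {

// Primary variable template: by default a type is not a reference.
template<class T> constexpr bool is_reference_v = false;

// Partial specializations of the variable template.  These are the
// entities that the MK_partial path has to merge on stream-in; unlike
// class template partial specializations they are VAR_DECLs, not types.
template<class T> constexpr bool is_reference_v<T&> = true;
template<class T> constexpr bool is_reference_v<T&&> = true;

} // namespace sketch
```

Each partial specialization here is a variable rather than a class, which is why the patch passes decl_p=true for this case.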



looks good, thanks



Tested on x86_64-pc-linux-gnu, does this look OK for trunk?

PR c++/107033

gcc/cp/ChangeLog:

* module.cc (trees_in::decl_value): In the MK_partial case
for a variable template partial specialization, pass decl_p=true
to add_mergeable_specialization and set spec to the VAR_DECL
not the TEMPLATE_DECL.
* pt.cc (add_mergeable_specialization): For a variable template
partial specialization, set the TREE_TYPE of the new
DECL_TEMPLATE_SPECIALIZATIONS node to the TREE_TYPE of the
VAR_DECL not the VAR_DECL itself.

gcc/testsuite/ChangeLog:

* g++.dg/modules/partial-2.cc, g++.dg/modules/partial-2.h: New
files, factored out from ...
* g++.dg/modules/partial-2_a.C, g++.dg/modules/partial-2_b.C: ...
here.
* g++.dg/modules/partial-2_c.H: New test.
* g++.dg/modules/partial-2_d.C: New test.
---
  gcc/cp/module.cc   | 17 ++
  gcc/cp/pt.cc   |  2 +-
  gcc/testsuite/g++.dg/modules/partial-2.cc  | 17 ++
  gcc/testsuite/g++.dg/modules/partial-2.h   | 38 +
  gcc/testsuite/g++.dg/modules/partial-2_a.C | 39 +-
  gcc/testsuite/g++.dg/modules/partial-2_b.C | 18 +-
  gcc/testsuite/g++.dg/modules/partial-2_c.H |  5 +++
  gcc/testsuite/g++.dg/modules/partial-2_d.C |  8 +
  8 files changed, 82 insertions(+), 62 deletions(-)
  create mode 100644 gcc/testsuite/g++.dg/modules/partial-2.cc
  create mode 100644 gcc/testsuite/g++.dg/modules/partial-2.h
  create mode 100644 gcc/testsuite/g++.dg/modules/partial-2_c.H
  create mode 100644 gcc/testsuite/g++.dg/modules/partial-2_d.C

diff --git a/gcc/cp/module.cc b/gcc/cp/module.cc
index f23832cb56a..7496df5e843 100644
--- a/gcc/cp/module.cc
+++ b/gcc/cp/module.cc
@@ -8185,13 +8185,18 @@ trees_in::decl_value ()
/* Set the TEMPLATE_DECL's type.  */
TREE_TYPE (decl) = TREE_TYPE (inner);
  
-  if (mk & MK_template_mask
- || mk == MK_partial)
+  /* Add to specialization tables now that constraints etc are
+added.  */
+  if (mk == MK_partial)
{
- /* Add to specialization tables now that constraints etc are
-added.  */
- bool is_type = mk == MK_partial || !(mk & MK_tmpl_decl_mask);
-
+ bool is_type = TREE_CODE (inner) == TYPE_DECL;
+ spec.spec = is_type ? type : inner;
+ add_mergeable_specialization (!is_type, false,
+   &spec, decl, spec_flags);
+   }
+  else if (mk & MK_template_mask)
+   {
+ bool is_type = !(mk & MK_tmpl_decl_mask);
  spec.spec = is_type ? type : mk & MK_tmpl_tmpl_mask ? inner : decl;
  add_mergeable_specialization (!is_type,
!is_type && mk & MK_tmpl_alias_mask,
diff --git a/gcc/cp/pt.cc b/gcc/cp/pt.cc
index db4e808adec..1f088fe281e 100644
--- a/gcc/cp/pt.cc
+++ b/gcc/cp/pt.cc
@@ -31010,7 +31010,7 @@ add_mergeable_specialization (bool decl_p, bool 
alias_p, spec_entry *elt,
/* A partial specialization.  */
tree cons = tree_cons (elt->args, decl,
 DECL_TEMPLATE_SPECIALIZATIONS (elt->tmpl));
-  TREE_TYPE (cons) = elt->spec;
+  TREE_TYPE (cons) = decl_p ? TREE_TYPE (elt->spec) : elt->spec;
DECL_TEMPLATE_SPECIALIZATIONS (elt->tmpl) = cons;
  }
  }
diff --git a/gcc/testsuite/g++.dg/modules/partial-2.cc 
b/gcc/testsuite/g++.dg/modules/partial-2.cc
new file mode 100644
index 000..1316bf5e1c5
--- /dev/null
+++ b/gcc/testsuite/g++.dg/modules/partial-2.cc
@@ -0,0 +1,17 @@
+static_assert(is_reference_v);
+static_assert(is_reference_v);
+static_assert(!is_reference_v);
+
+static_assert(A::is_reference_v);
+static_assert(A::is_reference_v);
+static_assert(!A::is_reference_v);
+
+#if __cpp_concepts
+static_assert(concepts::is_reference_v);
+static_assert(concepts::is_reference_v);
+static_assert(!concepts::is_reference_v);
+
+static_assert(concepts::A::is_reference_v);
+static_assert(concepts::A::is_reference_v);
+static_assert(!concepts::A::is_reference_v);
+#endif
diff --git a/gcc/testsuite/g++.dg/modules/partial-2.h 
b/gcc/testsuite/g++.dg/modules/partial-2.h
new file mode 100644
index 000..afcfce791b3
--- /dev/null
+++ b/gcc/testsuite/g++.dg/modules/partial-2.h
@@ -0,0 +1,38 @@
+template constexpr bool is_reference_v = false;
+template constexpr bool is_reference_v = true;
+template constexpr bool is_reference_v = true;
+

Re: [Patch] libgomp/nvptx: Prepare for reverse-offload callback handling

2022-09-26 Thread Tobias Burnus

Hi Alexander,

On 21.09.22 22:06, Alexander Monakov wrote:

It also goes
against the good practice of accelerator programming, which requires queueing
work on the accelerator and letting it run asynchronously with the CPU with high
occupancy.
(I know libgomp still waits for the GPU to finish in each GOMP_offload_run,
but maybe it's better to improve *that* instead of piling on new slowness)


Doesn't OpenMP 'nowait' permit this? (+ 'depend' clause if needed).


On to the patch itself.



And for non-USM code path you're relying on cudaMemcpy observing device-side
atomics in the right order.
Atomics aside, CUDA pinned memory would be a natural choice for such a tiny
structure. Did you rule it out for some reason?


I did use pinned memory (cuMemAllocHost) – but somehow it did escape me
that:

"All host memory allocated in all contexts using cuMemAllocHost() and
cuMemHostAlloc() is always directly accessible from all contexts on all
devices that support unified addressing."

I have now updated (but using cuMemHostAlloc instead, using a flag in
the hope that this choice is a tad faster).


+++ b/libgomp/config/nvptx/target.c
...
+#define GOMP_REV_OFFLOAD_VAR __gomp_rev_offload_var

Shouldn't this be in a header (needs to be in sync with the plugin).

I have now created one.

+
+#if (__SIZEOF_SHORT__ != 2 \
+ || __SIZEOF_SIZE_T__ != 8 \
+ || __SIZEOF_POINTER__ != 8)
+#error "Data-type conversion required for rev_offload"
+#endif

Huh? This is not a requirement that is new for reverse offload, it has always
been like that for offloading (all ABI rules regarding type sizes, struct
layout, bitfield layout, endianness must match).


In theory, compiling with "-m32 -foffload-options=-m64" or "-m32
-foffload-options=-m32" or "-m64 -foffload-options=-m32" is supported.
In practice, -m64 everywhere is required. I just want to make sure that
for this code the sizes are fine because, here, I am sure it breaks. For
other parts, I think the 64bit assumption is coded in but I am not
completely sure that's really the case.
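
As an illustration only, the same size requirement can be written as an ordinary compile-time predicate. The function names below are hypothetical and not part of libgomp; the three sizes are the ones tested by the #error check quoted above.

```cpp
#include <cassert>
#include <cstddef>

namespace sketch {

// Hypothetical helper: true when the type sizes match what the
// reverse-offload data exchange in the patch assumes (16-bit short,
// 64-bit size_t, 64-bit pointers).
constexpr bool rev_offload_sizes_ok(std::size_t short_size,
                                    std::size_t size_t_size,
                                    std::size_t pointer_size)
{
  return short_size == 2 && size_t_size == 8 && pointer_size == 8;
}

// The same predicate applied to the types of the current compilation.
constexpr bool this_target_sizes_ok()
{
  return rev_offload_sizes_ok(sizeof(short), sizeof(std::size_t),
                              sizeof(void *));
}

} // namespace sketch
```

A static_assert on this_target_sizes_ok() would reject a mismatched configuration such as -m32 in the same way the preprocessor check does.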


+  if (device != GOMP_DEVICE_HOST_FALLBACK
+  || fn == NULL
+  || GOMP_REV_OFFLOAD_VAR == NULL)
+return;

Shouldn't this be an 'assert' instead?


This tries to mimic what was there before – doing nothing. In any case,
this code path is unspecified or implementation defined (I forgot which
of the two), but a user might still be able to construct such a code.

I leave it to Jakub whether he likes to have an assert, an error/warning
message, or just the return here.


+  __atomic_store_n (&GOMP_REV_OFFLOAD_VAR->dev_num,
+GOMP_ADDITIONAL_ICVS.device_num, __ATOMIC_SEQ_CST);

Looks like all these can be plain stores, you only need ...


+  __atomic_store_n (&GOMP_REV_OFFLOAD_VAR->fn, fn, __ATOMIC_SEQ_CST);

... this to be atomic with 'release' semantics in the usual producer-consumer
pattern.


+  if (ptx_dev->rev_data->fn != 0)

Surely this needs to be an atomic load with 'acquire' semantics in has_usm case?

+rev_data->fn = 0;

Atomic store?


Done so – updated patch attached. Thanks for the comments.
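
The producer-consumer pattern Alexander describes can be sketched in portable C++ as below. This is not the libgomp code: rev_slot, publish and try_consume are made-up names, the addrs field is a hypothetical payload, and std::atomic stands in for the __atomic builtins used in the patch.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

namespace sketch {

// Hypothetical stand-in for the shared reverse-offload structure.
struct rev_slot {
  int dev_num = 0;                    // payload, written with a plain store
  std::uint64_t addrs = 0;            // payload, written with a plain store
  std::atomic<std::uint64_t> fn{0};   // 0 means "no request pending"
};

struct request {
  int dev_num;
  std::uint64_t addrs;
  std::uint64_t fn;
};

// Producer: fill in the payload with plain stores, then publish it with
// one release store to fn.
inline void publish(rev_slot &slot, int dev_num, std::uint64_t addrs,
                    std::uint64_t fn)
{
  slot.dev_num = dev_num;
  slot.addrs = addrs;
  slot.fn.store(fn, std::memory_order_release);
}

// Consumer: an acquire load of fn.  If it is nonzero, the payload written
// before the release store is guaranteed to be visible.  The slot is then
// handed back with another atomic store.
inline bool try_consume(rev_slot &slot, request &out)
{
  const std::uint64_t fn = slot.fn.load(std::memory_order_acquire);
  if (fn == 0)
    return false;
  out = request{slot.dev_num, slot.addrs, fn};
  slot.fn.store(0, std::memory_order_release);
  return true;
}

} // namespace sketch
```

Only the accesses to fn need to be atomic; the ordering of the other fields follows from the release/acquire pair.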

Tobias
-
libgomp/nvptx: Prepare for reverse-offload callback handling

This patch adds a stub 'gomp_target_rev' in the host's target.c, which will
later handle the reverse offload.
For nvptx, it adds support for forwarding the offload gomp_target_ext call
to the host by setting values in a struct on the device and querying it on
the host - invoking gomp_target_rev on the result.

include/ChangeLog:

	* cuda/cuda.h (enum CUdevice_attribute): Add
	CU_DEVICE_ATTRIBUTE_UNIFIED_ADDRESSING.
	(cuMemHostAlloc): Add prototype.

libgomp/ChangeLog:

	* config/nvptx/icv-device.c (GOMP_DEVICE_NUM_VAR): Remove
	'static' for this variable.
	* config/nvptx/libgomp-nvptx.h: New file.
	* config/nvptx/target.c: Include it.
	(GOMP_ADDITIONAL_ICVS): Declare extern var.
	(GOMP_REV_OFFLOAD_VAR): Declare var.
	(GOMP_target_ext): Handle reverse offload.
	* libgomp-plugin.h (GOMP_PLUGIN_target_rev): New prototype.
	* libgomp-plugin.c (GOMP_PLUGIN_target_rev): New, call ...
	* target.c (gomp_target_rev): ... this new stub function.
	* libgomp.h (gomp_target_rev): Declare.
	* libgomp.map (GOMP_PLUGIN_1.4): New; add GOMP_PLUGIN_target_rev.
	* plugin/cuda-lib.def (cuMemHostAlloc): Add.
	* plugin/plugin-nvptx.c: Include libgomp-nvptx.h.
	(struct ptx_device): Add rev_data member. 
	(nvptx_open_device): #if 0 unused check; add
	unified address assert check.
	(GOMP_OFFLOAD_get_num_devices): Claim unified address
	support.
	(GOMP_OFFLOAD_load_image): Free rev_fn_table if no
	offload functions exist. Make offload var available
	on host and device.
	(rev_off_dev_to_host_cpy, rev_off_host_to_dev_cpy): New.
	(GOMP_OFFLOAD_run): Handle reverse offl

Update my email address and DCO entry in MAINTAINERS file

2022-09-26 Thread Jeff Law


Committed to the trunk.


commit 1b5432b401934962affe32cd7e42e864224e8062
Author: Jeff Law 
Date:   Mon Sep 26 09:14:55 2022 -0600

Update my address and DCO entry in MAINTAINERS file

/
* MAINTAINERS: Update my email address and DCO entry.

diff --git a/MAINTAINERS b/MAINTAINERS
index f63de226609..11fa8bc6dbd 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -30,7 +30,7 @@ Richard Biener

 Richard Earnshaw   
 Jakub Jelinek  
 Richard Kenner 
-Jeff Law   
+Jeff Law   
 Michael Meissner   
 Jason Merrill  
 David S. Miller
@@ -725,6 +725,7 @@ Matthias Kretz  

 Tim Lange  
 Jeff Law   
 Jeff Law   
+Jeff Law   
 Immad Mir  
 Gaius Mulley   
 Siddhesh Poyarekar 


Update for gcc steering committee page

2022-09-26 Thread Jeff Law

Updates my affiliation on the web pages.


Committed to the trunk.


Jeff
commit 57e71fb18e8fa397336266f105a22f45f0fa7704
Author: Jeff Law 
Date:   Mon Sep 26 09:19:36 2022 -0600

Update my affiliation on the steering committee page.

diff --git a/htdocs/steering.html b/htdocs/steering.html
index 28ca29fe..21d2f715 100644
--- a/htdocs/steering.html
+++ b/htdocs/steering.html
@@ -31,7 +31,7 @@ place to reach them is the gcc mailing 
list.
 
 David Edelsohn (IBM)
 Kaveh R. Ghazi
-Jeffrey A. Law (Tachyum)
+Jeffrey A. Law (Ventana Micro Systems)
 Marc Lehmann (nethype GmbH)
 Jason Merrill (Red Hat)
 David Miller (Red Hat)


[PATCH] c++: Instantiate less when evaluating __is_convertible

2022-09-26 Thread Marek Polacek via Gcc-patches
Jon reported that evaluating __is_convertible in this test leads to
instantiating char_traits::eq, which is invalid (because we
are trying to call a member function on a char) and so we fail to
compile the test.  __is_convertible doesn't and shouldn't need to
instantiate so much, so let's limit it with a cp_unevaluated guard.
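
The patch only shows the declaration `cp_unevaluated u;`. As a rough illustration of the scope-guard idiom such a declaration suggests, here is a self-contained sketch; the names unevaluated_depth and unevaluated_guard are hypothetical and this is not GCC's actual definition.

```cpp
#include <cassert>

namespace sketch {

// Hypothetical global standing in for the front end's notion of
// "we are inside an unevaluated operand".
int unevaluated_depth = 0;

// RAII guard: entering the scope marks the context as unevaluated,
// leaving it (on any return path) restores the previous state.
struct unevaluated_guard {
  unevaluated_guard() { ++unevaluated_depth; }
  ~unevaluated_guard() { --unevaluated_depth; }
  unevaluated_guard(const unevaluated_guard &) = delete;
  unevaluated_guard &operator=(const unevaluated_guard &) = delete;
};

inline bool in_unevaluated_context() { return unevaluated_depth > 0; }

// Stand-in for a query such as is_convertible: everything done while the
// guard is alive sees the unevaluated state.
inline bool query_sees_unevaluated()
{
  unevaluated_guard u;
  return in_unevaluated_context();
}

} // namespace sketch
```

The point of the idiom is that the early returns in a function like is_convertible need no explicit cleanup.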

Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?

PR c++/106784

gcc/cp/ChangeLog:

* method.cc (is_convertible): Use cp_unevaluated.

gcc/testsuite/ChangeLog:

* g++.dg/ext/is_convertible3.C: New test.
---
 gcc/cp/method.cc   |  1 +
 gcc/testsuite/g++.dg/ext/is_convertible3.C | 66 ++
 2 files changed, 67 insertions(+)
 create mode 100644 gcc/testsuite/g++.dg/ext/is_convertible3.C

diff --git a/gcc/cp/method.cc b/gcc/cp/method.cc
index c35a59fe56c..45f70f5d3f3 100644
--- a/gcc/cp/method.cc
+++ b/gcc/cp/method.cc
@@ -2246,6 +2246,7 @@ is_convertible (tree from, tree to)
 {
   if (VOID_TYPE_P (from) && VOID_TYPE_P (to))
 return true;
+  cp_unevaluated u;
   tree expr = build_stub_object (from);
   expr = perform_implicit_conversion (to, expr, tf_none);
   if (expr == error_mark_node)
diff --git a/gcc/testsuite/g++.dg/ext/is_convertible3.C 
b/gcc/testsuite/g++.dg/ext/is_convertible3.C
new file mode 100644
index 000..c817dc6f146
--- /dev/null
+++ b/gcc/testsuite/g++.dg/ext/is_convertible3.C
@@ -0,0 +1,66 @@
+// PR c++/106784
+// { dg-do compile { target c++11 } }
+// Make sure we don't reject this at runtime by trying to instantiate
+// char_traits::eq(CharT, CharT) while evaluating __is_convertible.
+
+template
+struct bool_constant { static constexpr bool value = B; };
+using true_type = bool_constant;
+using false_type = bool_constant;
+
+template struct is_void : false_type { };
+template<> struct is_void : true_type { };
+
+template T&& declval();
+
+template struct enable_if { };
+template<> struct enable_if { using type = void; };
+template using enable_if_t = typename enable_if::type;
+
+template
+  struct is_convertible
+  : public bool_constant<__is_convertible(_From, _To)>
+  { };
+
+template
+struct char_traits
+{
+  static unsigned long length(const char* s) { eq(*s, *s); return 0; }
+
+  static void eq(CharT l, CharT r) noexcept { l.f(r); }
+};
+
+template
+struct basic_string_view
+{
+  using traits_type = char_traits;
+
+  constexpr basic_string_view() = default;
+  constexpr basic_string_view(const basic_string_view&) = default;
+
+  constexpr
+  basic_string_view(const CharT* __str) noexcept
+  : _M_len{traits_type::length(__str)}
+  { }
+
+  unsigned long _M_len = 0;
+};
+
+template
+struct basic_string
+{
+  template
+enable_if_t>::value
+&& !is_convertible::value>
+replace(int, T) { }
+
+  void replace(unsigned long, const char*) { }
+
+  void replace(const char* s) { replace(1, s); }
+};
+
+int main()
+{
+  basic_string s;
+  s.replace("");
+}

base-commit: 2460f7cdef7ef9c971de79271afc0db73687a272
-- 
2.37.3



Re: [PATCH] Teach vectorizer to deal with bitfield accesses (was: [RFC] Teach vectorizer to deal with bitfield reads)

2022-09-26 Thread Andre Vieira (lists) via Gcc-patches


On 08/09/2022 12:51, Richard Biener wrote:


> I'm curious, why the push to redundant_ssa_names?  That could use
> a comment ...

I purposefully left a #if 0 #else #endif in there so you can see the 
two options. The reason I used redundant_ssa_names is that ifcvt 
seems to use it as a container for all pairs of (old, new) SSA names 
to replace later, so I just piggybacked on that. I don't know whether 
there's a specific reason it does the replacement at the end; maybe some 
ordering issue? Either way, both adding it to redundant_ssa_names and 
doing the replacement inline work for the bitfield lowering (or at least 
work in my testing).

> Note I fear we will have endianness issues when translating
> bit-field accesses to BIT_FIELD_REF/INSERT and then to shifts.  Rules
> for memory and register operations do not match up (IIRC, I repeatedly
> run into issues here myself).  The testcases all look like they
> won't catch this - I think an example would be sth like
> struct X { unsigned a : 23; unsigned b : 9; }, can you see to do
> testing on a big-endian target?

I've done some testing and you were right, it did fall apart on 
big-endian. I fixed it by changing the way we compute the 'shift' value 
and added two extra testcases each for read and write.


> Sorry for the delay in reviewing.

No worries, apologies myself for the delay in reworking this, had a nice 
little week holiday in between :)


I'll write the ChangeLogs once the patch has stabilized.

Thanks,
Andre
diff --git a/gcc/testsuite/gcc.dg/vect/vect-bitfield-read-1.c 
b/gcc/testsuite/gcc.dg/vect/vect-bitfield-read-1.c
new file mode 100644
index 
..01cf34fb44484ca926ca5de99eef76dd99b69e92
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/vect-bitfield-read-1.c
@@ -0,0 +1,40 @@
+/* { dg-require-effective-target vect_int } */
+
+#include 
+#include "tree-vect.h"
+
+extern void abort(void);
+
+struct s { int i : 31; };
+
+#define ELT0 {0}
+#define ELT1 {1}
+#define ELT2 {2}
+#define ELT3 {3}
+#define N 32
+#define RES 48
+struct s A[N]
+  = { ELT0, ELT1, ELT2, ELT3, ELT0, ELT1, ELT2, ELT3,
+  ELT0, ELT1, ELT2, ELT3, ELT0, ELT1, ELT2, ELT3,
+  ELT0, ELT1, ELT2, ELT3, ELT0, ELT1, ELT2, ELT3,
+  ELT0, ELT1, ELT2, ELT3, ELT0, ELT1, ELT2, ELT3};
+
+int __attribute__ ((noipa))
+f(struct s *ptr, unsigned n) {
+int res = 0;
+for (int i = 0; i < n; ++i)
+  res += ptr[i].i;
+return res;
+}
+
+int main (void)
+{
+  check_vect ();
+
+  if (f(&A[0], N) != RES)
+abort ();
+
+  return 0;
+}
+
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" } } */
diff --git a/gcc/testsuite/gcc.dg/vect/vect-bitfield-read-2.c 
b/gcc/testsuite/gcc.dg/vect/vect-bitfield-read-2.c
new file mode 100644
index 
..1a4a1579c1478b9407ad21b19e8fbdca9f674b42
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/vect-bitfield-read-2.c
@@ -0,0 +1,43 @@
+/* { dg-require-effective-target vect_int } */
+
+#include 
+#include "tree-vect.h"
+
+extern void abort(void);
+
+struct s {
+unsigned i : 31;
+char a : 4;
+};
+
+#define N 32
+#define ELT0 {0x7FFFUL, 0}
+#define ELT1 {0x7FFFUL, 1}
+#define ELT2 {0x7FFFUL, 2}
+#define ELT3 {0x7FFFUL, 3}
+#define RES 48
+struct s A[N]
+  = { ELT0, ELT1, ELT2, ELT3, ELT0, ELT1, ELT2, ELT3,
+  ELT0, ELT1, ELT2, ELT3, ELT0, ELT1, ELT2, ELT3,
+  ELT0, ELT1, ELT2, ELT3, ELT0, ELT1, ELT2, ELT3,
+  ELT0, ELT1, ELT2, ELT3, ELT0, ELT1, ELT2, ELT3};
+
+int __attribute__ ((noipa))
+f(struct s *ptr, unsigned n) {
+int res = 0;
+for (int i = 0; i < n; ++i)
+  res += ptr[i].a;
+return res;
+}
+
+int main (void)
+{
+  check_vect ();
+
+  if (f(&A[0], N) != RES)
+abort ();
+
+  return 0;
+}
+
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" } } */
diff --git a/gcc/testsuite/gcc.dg/vect/vect-bitfield-read-3.c 
b/gcc/testsuite/gcc.dg/vect/vect-bitfield-read-3.c
new file mode 100644
index 
..216611a29fd8bbfbafdbdb79d790e520f44ba672
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/vect-bitfield-read-3.c
@@ -0,0 +1,43 @@
+/* { dg-require-effective-target vect_int } */
+
+#include 
+#include "tree-vect.h"
+#include 
+
+extern void abort(void);
+
+typedef struct {
+int  c;
+int  b;
+bool a : 1;
+} struct_t;
+
+#define N 16
+#define ELT_F { 0x, 0x, 0 }
+#define ELT_T { 0x, 0x, 1 }
+
+struct_t vect_false[N] = { ELT_F, ELT_F, ELT_F, ELT_F, ELT_F, ELT_F, ELT_F, 
ELT_F,
+  ELT_F, ELT_F, ELT_F, ELT_F, ELT_F, ELT_F, ELT_F, 
ELT_F  };
+struct_t vect_true[N]  = { ELT_F, ELT_F, ELT_T, ELT_F, ELT_F, ELT_F, ELT_F, 
ELT_F,
+  ELT_F, ELT_F, ELT_T, ELT_F, ELT_F, ELT_F, ELT_F, 
ELT_F  };
+int main (void)
+{
+  unsigned ret = 0;
+  for (unsigned i = 0; i < N; i++)
+  {
+  ret |= vect_false[i].a;
+  }
+  if (ret)
+abort ();
+
+  for (unsigned i = 0; i < N; i++)
+  {
+  ret |= vect_true[i

Re: [PATCH 1/2]middle-end Fold BIT_FIELD_REF and Shifts into BIT_FIELD_REFs alone

2022-09-26 Thread Andrew Pinski via Gcc-patches
On Sun, Sep 25, 2022 at 9:56 PM Tamar Christina  wrote:
>
> > -Original Message-
> > From: Andrew Pinski 
> > Sent: Saturday, September 24, 2022 8:57 PM
> > To: Tamar Christina 
> > Cc: gcc-patches@gcc.gnu.org; nd ; rguent...@suse.de
> > Subject: Re: [PATCH 1/2]middle-end Fold BIT_FIELD_REF and Shifts into
> > BIT_FIELD_REFs alone
> >
> > On Fri, Sep 23, 2022 at 4:43 AM Tamar Christina via Gcc-patches wrote:
> > >
> > > Hi All,
> > >
> > > This adds a match.pd rule that can fold right shifts and
> > > bit_field_refs of integers into just a bit_field_ref by adjusting the
> > > offset and the size of the extract and adds an extend to the previous 
> > > size.
> > >
> > > Concretely turns:
> > >
> > > #include <arm_neon.h>
> > >
> > > unsigned int foor (uint32x4_t x)
> > > {
> > > return x[1] >> 16;
> > > }
> > >
> > > which used to generate:
> > >
> > >   _1 = BIT_FIELD_REF ;
> > >   _3 = _1 >> 16;
> > >
> > > into
> > >
> > >   _4 = BIT_FIELD_REF ;
> > >   _2 = (unsigned int) _4;
> > >
> > > I currently limit the rewrite to only doing it if the resulting
> > > extract is in a mode the target supports. i.e. it won't rewrite it to
> > > extract say 13-bits because I worry that for targets that won't have a
> > > bitfield extract instruction this may be a de-optimization.
> >
> > It is only a de-optimization for the following case:
> > * vector extraction
> >
> > All other cases should be handled correctly in the middle-end when
> > expanding to RTL because they need to be handled for bit-fields anyways.
> > Plus SIGN_EXTRACT and ZERO_EXTRACT would be used in the integer case
> > for the RTL.
> > Getting SIGN_EXTRACT/ZERO_EXTRACT early on in the RTL is better than
> > waiting until combine really.
> >
>
> Fair enough, I've dropped the constraint.

Well the constraint should be done still for VECTOR_TYPE I think.
Attached is what I had done for left shift for integer types.
Note the BYTES_BIG_ENDIAN part which you missed for the right shift case.

Thanks,
Andrew Pinski

>
> >
> > >
> > > Bootstrapped Regtested on aarch64-none-linux-gnu, x86_64-pc-linux-gnu
> > > and no issues.
> > >
> > > Testcase are added in patch 2/2.
> > >
> > > Ok for master?
> > >
> > > Thanks,
> > > Tamar
> > >
> > > gcc/ChangeLog:
> > >
> > > * match.pd: Add bitfield and shift folding.
> > >
> > > --- inline copy of patch --
> > > diff --git a/gcc/match.pd b/gcc/match.pd index
> > >
> > 1d407414bee278c64c00d425d9f025c1c58d853d..b225d36dc758f1581502c8d03
> > 761
> > > 544bfd499c01 100644
> > > --- a/gcc/match.pd
> > > +++ b/gcc/match.pd
> > > @@ -7245,6 +7245,23 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
> > >&& ANY_INTEGRAL_TYPE_P (type) && ANY_INTEGRAL_TYPE_P
> > (TREE_TYPE(@0)))
> > >(IFN_REDUC_PLUS_WIDEN @0)))
> > >
> > > +/* Canonicalize BIT_FIELD_REFS and shifts to BIT_FIELD_REFS.  */ (for
> > > +shift (rshift)
> > > + op (plus)
> > > + (simplify
> > > +  (shift (BIT_FIELD_REF @0 @1 @2) integer_pow2p@3)
> > > +  (if (INTEGRAL_TYPE_P (type))
> > > +   (with { /* Can't use wide-int here as the precision differs between
> > > + @1 and @3.  */
> > > +  unsigned HOST_WIDE_INT size = tree_to_uhwi (@1);
> > > +  unsigned HOST_WIDE_INT shiftc = tree_to_uhwi (@3);
> > > +  unsigned HOST_WIDE_INT newsize = size - shiftc;
> > > +  tree nsize = wide_int_to_tree (bitsizetype, newsize);
> > > +  tree ntype
> > > += build_nonstandard_integer_type (newsize, 1); }
> >
> > Maybe use `build_nonstandard_integer_type (newsize, /* unsignedp = */
> > true);` or better yet `build_nonstandard_integer_type (newsize,
> > UNSIGNED);`
>
> Ah, will do,
> Tamar.
>
> >
> > I had started to convert some of the unsignedp into enum signop but I never
> > finished or submitted the patch.
> >
> > Thanks,
> > Andrew Pinski
> >
> >
> > > +(if (ntype)
> > > + (convert:type (BIT_FIELD_REF:ntype @0 { nsize; } (op @2
> > > + @3
> > > +
> > >  (simplify
> > >   (BIT_FIELD_REF (BIT_FIELD_REF @0 @1 @2) @3 @4)
> > >   (BIT_FIELD_REF @0 @3 { const_binop (PLUS_EXPR, bitsizetype, @2, @4);
> > > }))
> > >
> > >
> > >
> > >
> > > --
From ed7c08c4d565bd4418cf2dce3bbfecc18fdd42a2 Mon Sep 17 00:00:00 2001
From: Andrew Pinski 
Date: Wed, 25 Dec 2019 01:20:13 +
Subject: [PATCH] Add simplification of shift of a bit_field.

We can simplify a shift of a bit_field_ref to
a shift of an and (note sometimes the shift can
be removed).

Change-Id: I1a9f3fc87889ecd7cf569272405b6ee7dd5f8d7b
Signed-off-by: Andrew Pinski 
---

diff --git a/gcc/match.pd b/gcc/match.pd
index cb981ec..e4f6d47 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -6071,6 +6071,34 @@
 (cmp (bit_and @0 { wide_int_to_tree (type1, mask); })
  { wide_int_to_tree (type1, cst); })
 
+/* lshift> -> shift(bit_and(@0, mask)) */
+(simplify
+ (lshift (convert (BIT_FIELD_REF@bit @0 @bitsize @bitpos)) INTEGER_CST@1)
+ (if (INTEGRAL_TYPE_P (type)
+  && INTEGRAL_TYPE_P (TREE_TYPE (@0))
+  && 

Re: [PATCH] c++: Instantiate less when evaluating __is_convertible

2022-09-26 Thread Patrick Palka via Gcc-patches
On Mon, 26 Sep 2022, Marek Polacek via Gcc-patches wrote:

> Jon reported that evaluating __is_convertible in this test leads to
> instantiating char_traits::eq, which is invalid (because we
> are trying to call a member function on a char) and so we fail to
> compile the test.  __is_convertible doesn't and shouldn't need to
> instantiate so much, so let's limit it with a cp_unevaluated guard.
> 
> Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?
> 
>   PR c++/106784
> 
> gcc/cp/ChangeLog:
> 
>   * method.cc (is_convertible): Use cp_unevaluated.

I think is_nothrow_convertible would need cp_unevaluated too (or maybe we
should define is_nothrow_convertible in terms of is_convertible).

And the testcase can probably be minimized to something like:

  struct A;
  struct B { template<class T> B(const T&) noexcept { T::nonexistent; } };

  static_assert(__is_convertible(const A&, B));
  static_assert(__is_nothrow_convertible(const A&, B));
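
The suggestion concerns the compiler-internal functions, but the same layering can be sketched at library level: the nothrow check is only attempted once plain convertibility has been established. This is an analogy with hypothetical names (accept, Quiet, Loud), not the code in method.cc, and it does not handle void.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

namespace sketch {

// Hypothetical helper, only ever used in an unevaluated operand.
template<class To> void accept(To) noexcept;

// Primary template: not convertible at all, so not nothrow-convertible.
template<class From, class To,
         bool = std::is_convertible<From, To>::value>
struct is_nothrow_convertible : std::false_type { };

// Convertible: additionally require that the conversion cannot throw.
template<class From, class To>
struct is_nothrow_convertible<From, To, true>
  : std::integral_constant<bool,
                           noexcept(accept<To>(std::declval<From>()))>
{ };

struct Quiet { Quiet(int) noexcept { } };
struct Loud { Loud(int) { } };

} // namespace sketch
```

Defining one predicate in terms of the other means any guard added to the convertibility check automatically covers the nothrow variant as well.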

> 
> gcc/testsuite/ChangeLog:
> 
>   * g++.dg/ext/is_convertible3.C: New test.
> ---
>  gcc/cp/method.cc   |  1 +
>  gcc/testsuite/g++.dg/ext/is_convertible3.C | 66 ++
>  2 files changed, 67 insertions(+)
>  create mode 100644 gcc/testsuite/g++.dg/ext/is_convertible3.C
> 
> diff --git a/gcc/cp/method.cc b/gcc/cp/method.cc
> index c35a59fe56c..45f70f5d3f3 100644
> --- a/gcc/cp/method.cc
> +++ b/gcc/cp/method.cc
> @@ -2246,6 +2246,7 @@ is_convertible (tree from, tree to)
>  {
>if (VOID_TYPE_P (from) && VOID_TYPE_P (to))
>  return true;
> +  cp_unevaluated u;
>tree expr = build_stub_object (from);
>expr = perform_implicit_conversion (to, expr, tf_none);
>if (expr == error_mark_node)
> diff --git a/gcc/testsuite/g++.dg/ext/is_convertible3.C 
> b/gcc/testsuite/g++.dg/ext/is_convertible3.C
> new file mode 100644
> index 000..c817dc6f146
> --- /dev/null
> +++ b/gcc/testsuite/g++.dg/ext/is_convertible3.C
> @@ -0,0 +1,66 @@
> +// PR c++/106784
> +// { dg-do compile { target c++11 } }
> +// Make sure we don't reject this at runtime by trying to instantiate
> +// char_traits::eq(CharT, CharT) while evaluating __is_convertible.
> +
> +template
> +struct bool_constant { static constexpr bool value = B; };
> +using true_type = bool_constant;
> +using false_type = bool_constant;
> +
> +template struct is_void : false_type { };
> +template<> struct is_void : true_type { };
> +
> +template T&& declval();
> +
> +template struct enable_if { };
> +template<> struct enable_if { using type = void; };
> +template using enable_if_t = typename enable_if::type;
> +
> +template
> +  struct is_convertible
> +  : public bool_constant<__is_convertible(_From, _To)>
> +  { };
> +
> +template
> +struct char_traits
> +{
> +  static unsigned long length(const char* s) { eq(*s, *s); return 0; }
> +
> +  static void eq(CharT l, CharT r) noexcept { l.f(r); }
> +};
> +
> +template
> +struct basic_string_view
> +{
> +  using traits_type = char_traits;
> +
> +  constexpr basic_string_view() = default;
> +  constexpr basic_string_view(const basic_string_view&) = default;
> +
> +  constexpr
> +  basic_string_view(const CharT* __str) noexcept
> +  : _M_len{traits_type::length(__str)}
> +  { }
> +
> +  unsigned long _M_len = 0;
> +};
> +
> +template
> +struct basic_string
> +{
> +  template
> +enable_if_t>::value
> +&& !is_convertible::value>
> +replace(int, T) { }
> +
> +  void replace(unsigned long, const char*) { }
> +
> +  void replace(const char* s) { replace(1, s); }
> +};
> +
> +int main()
> +{
> +  basic_string s;
> +  s.replace("");
> +}
> 
> base-commit: 2460f7cdef7ef9c971de79271afc0db73687a272
> -- 
> 2.37.3
> 
> 



Re: [PATCH] c++: Instantiate less when evaluating __is_convertible

2022-09-26 Thread Jonathan Wakely via Gcc-patches
On Mon, 26 Sept 2022 at 16:23, Marek Polacek wrote:
>
> Jon reported that evaluating __is_convertible in this test leads to
> instantiating char_traits::eq, which is invalid (because we
> are trying to call a member function on a char)

N.B. the original code wasn't trying to do something dumb like call
a member function on a char; it was using basic_string_view<X>,
where X is a class type without an operator==, and so
char_traits<X>::eq was invalid. I changed it to just use
basic_string_view<char> and changed char_traits::eq to do something
different that is invalid for char.

I can provide a less silly test case if you like, but I don't think it
matters for the purposes of the testsuite.



Re: [PATCH] c++: Instantiate less when evaluating __is_convertible

2022-09-26 Thread Jason Merrill via Gcc-patches

On 9/26/22 11:51, Patrick Palka wrote:

On Mon, 26 Sep 2022, Marek Polacek via Gcc-patches wrote:


Jon reported that evaluating __is_convertible in this test leads to
instantiating char_traits<char>::eq, which is invalid (because we
are trying to call a member function on a char) and so we fail to
compile the test.  __is_convertible doesn't and shouldn't need to
instantiate so much, so let's limit it with a cp_unevaluated guard.

Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?

PR c++/106784

gcc/cp/ChangeLog:

* method.cc (is_convertible): Use cp_unevaluated.


I think is_nothrow_convertible would need cp_unevaluated too (or maybe we
should define is_nothrow_convertible in terms of is_convertible).


Agreed.
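
As an illustration of defining the nothrow check in terms of the plain one, here is a library-level sketch. It is an assumption-laden analogy, not the code in method.cc: `sink` and `nothrow_convertible_sketch` are hypothetical names, and the cv void case that the real built-ins special-case is deliberately left out.

```cpp
#include <type_traits>
#include <utility>

// Declared only; it is used inside noexcept(), which never evaluates
// its operand.  (Hypothetical helper name.)
template<typename To> void sink(To) noexcept;

// Primary template: no implicit conversion at all, so it cannot be a
// nothrow conversion either.
template<typename From, typename To,
         bool = std::is_convertible<From, To>::value>
struct nothrow_convertible_sketch : std::false_type { };

// The conversion exists: additionally ask whether it can throw.
template<typename From, typename To>
struct nothrow_convertible_sketch<From, To, true>
  : std::integral_constant<bool, noexcept(sink<To>(std::declval<From>()))>
{ };

struct Quiet { Quiet(int) noexcept; };
struct Loud  { Loud(int); };

static_assert(nothrow_convertible_sketch<int, Quiet>::value,
              "converting constructor is noexcept");
static_assert(!nothrow_convertible_sketch<int, Loud>::value,
              "converting constructor may throw");
static_assert(!nothrow_convertible_sketch<void*, int>::value,
              "no implicit conversion exists");
```

Both answers come from one convertibility test, so a fix applied to that test (such as the unevaluated guard discussed here) would cover the nothrow variant as well.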


And the testcase can probably be minimized to something like:

   struct A;
   struct B { template<typename T> B(const T&) noexcept { T::nonexistent; } };

   static_assert(__is_convertible(const A&, B));
   static_assert(__is_nothrow_convertible(const A&, B));



gcc/testsuite/ChangeLog:

* g++.dg/ext/is_convertible3.C: New test.
---
  gcc/cp/method.cc   |  1 +
  gcc/testsuite/g++.dg/ext/is_convertible3.C | 66 ++
  2 files changed, 67 insertions(+)
  create mode 100644 gcc/testsuite/g++.dg/ext/is_convertible3.C

diff --git a/gcc/cp/method.cc b/gcc/cp/method.cc
index c35a59fe56c..45f70f5d3f3 100644
--- a/gcc/cp/method.cc
+++ b/gcc/cp/method.cc
@@ -2246,6 +2246,7 @@ is_convertible (tree from, tree to)
  {
if (VOID_TYPE_P (from) && VOID_TYPE_P (to))
  return true;
+  cp_unevaluated u;
tree expr = build_stub_object (from);
expr = perform_implicit_conversion (to, expr, tf_none);
if (expr == error_mark_node)
diff --git a/gcc/testsuite/g++.dg/ext/is_convertible3.C 
b/gcc/testsuite/g++.dg/ext/is_convertible3.C
new file mode 100644
index 000..c817dc6f146
--- /dev/null
+++ b/gcc/testsuite/g++.dg/ext/is_convertible3.C
@@ -0,0 +1,66 @@
+// PR c++/106784
+// { dg-do compile { target c++11 } }
+// Make sure we don't reject this at runtime by trying to instantiate
+// char_traits::eq(CharT, CharT) while evaluating __is_convertible.
+
+template<bool B>
+struct bool_constant { static constexpr bool value = B; };
+using true_type = bool_constant<true>;
+using false_type = bool_constant<false>;
+
+template<typename T> struct is_void : false_type { };
+template<> struct is_void<void> : true_type { };
+
+template<typename T> T&& declval();
+
+template<bool> struct enable_if { };
+template<> struct enable_if<true> { using type = void; };
+template<bool B> using enable_if_t = typename enable_if<B>::type;
+
+template<typename _From, typename _To>
+  struct is_convertible
+  : public bool_constant<__is_convertible(_From, _To)>
+  { };
+
+template<typename CharT>
+struct char_traits
+{
+  static unsigned long length(const char* s) { eq(*s, *s); return 0; }
+
+  static void eq(CharT l, CharT r) noexcept { l.f(r); }
+};
+
+template<typename CharT>
+struct basic_string_view
+{
+  using traits_type = char_traits<CharT>;
+
+  constexpr basic_string_view() = default;
+  constexpr basic_string_view(const basic_string_view&) = default;
+
+  constexpr
+  basic_string_view(const CharT* __str) noexcept
+  : _M_len{traits_type::length(__str)}
+  { }
+
+  unsigned long _M_len = 0;
+};
+
+template<typename CharT>
+struct basic_string
+{
+  template<typename T>
+enable_if_t<is_convertible<const T&, basic_string_view<CharT>>::value
+&& !is_convertible<const T&, const CharT*>::value>
+replace(int, T) { }
+
+  void replace(unsigned long, const char*) { }
+
+  void replace(const char* s) { replace(1, s); }
+};
+
+int main()
+{
+  basic_string<char> s;
+  s.replace("");
+}

base-commit: 2460f7cdef7ef9c971de79271afc0db73687a272
--
2.37.3








[PATCH v2] c++: Instantiate less when evaluating __is_convertible

2022-09-26 Thread Marek Polacek via Gcc-patches
On Mon, Sep 26, 2022 at 11:51:30AM -0400, Patrick Palka wrote:
> On Mon, 26 Sep 2022, Marek Polacek via Gcc-patches wrote:
> 
> > Jon reported that evaluating __is_convertible in this test leads to
> > instantiating char_traits<char>::eq, which is invalid (because we
> > are trying to call a member function on a char) and so we fail to
> > compile the test.  __is_convertible doesn't and shouldn't need to
> > instantiate so much, so let's limit it with a cp_unevaluated guard.
> > 
> > Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?
> > 
> > PR c++/106784
> > 
> > gcc/cp/ChangeLog:
> > 
> > * method.cc (is_convertible): Use cp_unevaluated.
> 
> I think is_nothrow_convertible would need cp_unevaluated too (or maybe we
> should define is_nothrow_convertible in terms of is_convertible).

/facepalm, that's what I get by not using a single implementation for both!
 
> And the testcase can probably be minimized to something like:
> 
>   struct A;
>   struct B { template<typename T> B(const T&) noexcept { T::nonexistent; } };
> 
>   static_assert(__is_convertible(const A&, B));
>   static_assert(__is_nothrow_convertible(const A&, B));

Adjusted, thanks.
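
For anyone who wants to try the reduced case without the built-in, a rough library-level analogue is sketched below. It is an assumption, not the committed test: it goes through std::is_convertible rather than __is_convertible, and A is made a complete (empty) class here to stay clear of any completeness preconditions in the standard trait.

```cpp
#include <type_traits>

// A has no member named `nonexistent`.
struct A { };

struct B
{
  // The body would be ill-formed if it were ever instantiated with T = A.
  // The explicit noexcept means the body is not needed to determine the
  // exception specification either.
  template<typename T> B(const T&) noexcept { T::nonexistent; }
};

// Only the declaration of the constructor template takes part in the
// convertibility check, so this is expected to compile.
static_assert(std::is_convertible<const A&, B>::value,
              "conversion exists; the constructor body is never instantiated");
```

Writing `B b = A{};` in evaluated code would instantiate the constructor body and be rejected, which is the contrast the reduced test is built on.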

Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?

-- >8 --
Jon reported that evaluating __is_convertible in a test led to
instantiating something ill-formed and so we failed to compile the test.
__is_convertible doesn't and shouldn't need to instantiate so much, so
let's limit it with a cp_unevaluated guard.  Use a helper function to
implement both built-ins.

PR c++/106784

gcc/cp/ChangeLog:

* method.cc (is_convertible_helper): New.
(is_convertible): Use it.
(is_nothrow_convertible): Likewise.

gcc/testsuite/ChangeLog:

* g++.dg/ext/is_convertible3.C: New test.
* g++.dg/ext/is_nothrow_convertible3.C: New test.
---
 gcc/cp/method.cc  | 23 ---
 gcc/testsuite/g++.dg/ext/is_convertible3.C|  9 
 .../g++.dg/ext/is_nothrow_convertible3.C  |  9 
 3 files changed, 33 insertions(+), 8 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/ext/is_convertible3.C
 create mode 100644 gcc/testsuite/g++.dg/ext/is_nothrow_convertible3.C

diff --git a/gcc/cp/method.cc b/gcc/cp/method.cc
index c35a59fe56c..9f917f13134 100644
--- a/gcc/cp/method.cc
+++ b/gcc/cp/method.cc
@@ -2236,6 +2236,19 @@ ref_xes_from_temporary (tree to, tree from, bool 
direct_init_p)
   return ref_conv_binds_directly (to, val, direct_init_p).is_false ();
 }
 
+/* Worker for is_{,nothrow_}convertible.  Attempt to perform an implicit
+   conversion from FROM to TO and return the result.  */
+
+static tree
+is_convertible_helper (tree from, tree to)
+{
+  if (VOID_TYPE_P (from) && VOID_TYPE_P (to))
+return integer_one_node;
+  cp_unevaluated u;
+  tree expr = build_stub_object (from);
+  return perform_implicit_conversion (to, expr, tf_none);
+}
+
 /* Return true if FROM can be converted to TO using implicit conversions,
or both FROM and TO are possibly cv-qualified void.  NB: This doesn't
implement the "Access checks are performed as if from a context unrelated
@@ -2244,10 +2257,7 @@ ref_xes_from_temporary (tree to, tree from, bool 
direct_init_p)
 bool
 is_convertible (tree from, tree to)
 {
-  if (VOID_TYPE_P (from) && VOID_TYPE_P (to))
-return true;
-  tree expr = build_stub_object (from);
-  expr = perform_implicit_conversion (to, expr, tf_none);
+  tree expr = is_convertible_helper (from, to);
   if (expr == error_mark_node)
 return false;
   return !!expr;
@@ -2258,10 +2268,7 @@ is_convertible (tree from, tree to)
 bool
 is_nothrow_convertible (tree from, tree to)
 {
-  if (VOID_TYPE_P (from) && VOID_TYPE_P (to))
-return true;
-  tree expr = build_stub_object (from);
-  expr = perform_implicit_conversion (to, expr, tf_none);
+  tree expr = is_convertible_helper (from, to);
   if (expr == NULL_TREE || expr == error_mark_node)
 return false;
   return expr_noexcept_p (expr, tf_none);
diff --git a/gcc/testsuite/g++.dg/ext/is_convertible3.C 
b/gcc/testsuite/g++.dg/ext/is_convertible3.C
new file mode 100644
index 000..7a986d075c2
--- /dev/null
+++ b/gcc/testsuite/g++.dg/ext/is_convertible3.C
@@ -0,0 +1,9 @@
+// PR c++/106784
+// { dg-do compile { target c++11 } }
+// Make sure we don't reject this at runtime by trying to instantiate
+// something that would be ill-formed.
+
+struct A;
+struct B { template<typename T> B(const T&) noexcept { T::nonexistent; } };
+
+static_assert(__is_convertible(const A&, B), "");
diff --git a/gcc/testsuite/g++.dg/ext/is_nothrow_convertible3.C 
b/gcc/testsuite/g++.dg/ext/is_nothrow_convertible3.C
new file mode 100644
index 000..05b1e1d9ad9
--- /dev/null
+++ b/gcc/testsuite/g++.dg/ext/is_nothrow_convertible3.C
@@ -0,0 +1,9 @@
+// PR c++/106784
+// { dg-do compile { target c++11 } }
+// Make sure we don't reject this at runtime by trying to instantiate
+// something that would be ill-formed.
+
+struct A;
+struct B { templ

Re: [PATCH] c++: Instantiate less when evaluating __is_convertible

2022-09-26 Thread Marek Polacek via Gcc-patches
On Mon, Sep 26, 2022 at 05:02:36PM +0100, Jonathan Wakely wrote:
> On Mon, 26 Sept 2022 at 16:23, Marek Polacek wrote:
> >
> > Jon reported that evaluating __is_convertible in this test leads to
> > instantiating char_traits<char>::eq, which is invalid (because we
> > are trying to call a member function on a char)
> 
> N.B. the original code wasn't trying to do something dumb like call
> a member function on a char, but it was using basic_string_view<X>
> where X is a class type without an operator== and so
> char_traits<X>::eq was invalid. I changed it to just use
> basic_string_view<char> and changed char_traits<char>::eq to do something
> different, that was invalid for char.

Ack.

> I can provide a less silly test case if you like, but I don't think it
> matters for the purposes of the testsuite.

I think no need to, I'm going to use Patrick's short test.

Thanks,

Marek



Re: [PATCH] c++: Don't quote nothrow in diagnostic

2022-09-26 Thread Jason Merrill via Gcc-patches

On 9/26/22 03:50, Richard Biener wrote:

On Fri, Sep 23, 2022 at 8:41 PM Marek Polacek via Gcc-patches
 wrote:


In 
Jason noticed that we quote "nothrow" in diagnostics even though it's
not a keyword in C++.  Just removing the quotes didn't work because
then -Wformat-diag complains, so this patch replaces it with "no-throw".

Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?


That doesn't look like an improvement to me.  Can we quote 'nothrow()' instead?


nothrow() is a syntax error; the C++11 keyword is 'noexcept'. 
std::nothrow is a dummy placement argument used to indicate that a 
new-expression should return null rather than throw on failure.


But bizarrely, the library traits use the word "nothrow".  Marek's patch 
clarifies that we are not trying to refer to anything in the language.



I'd rather leave it alone than changing it to no-throw.  Why does -Wformat-diag
complain?  If we shouldn't quote nothrow that should be adjusted?


I think -Wformat-diag complains because "nothrow" is an attribute; it 
also includes some other attribute names in the list of "keywords".


I would also be fine with just removing the quotes and removing nothrow 
from c_keywords.


Jason



[PATCH] openmp: Add OpenMP assume, assumes and begin/end assumes support

2022-09-26 Thread Jakub Jelinek via Gcc-patches
Hi!

The following patch implements OpenMP 5.1
#pragma omp assume
#pragma omp assumes
and
#pragma omp begin assumes
#pragma omp end assumes
directive support for C and C++.  Currently it doesn't remember
anything from the assumption clauses for later, so is mainly
to support the directives and diagnose errors in their use.
If the recently posted C++23 [[assume (cond)]]; support makes it
in, the intent is that this can be easily adjusted at least for
the #pragma omp assume directive with holds clause(s) to use
the same infrastructure.  Now, C++23 portable assumptions are slightly
different from OpenMP 5.1 assumptions' holds clause in that C++23
assumption holds just where it appears, while OpenMP 5.1 assumptions
hold everywhere in the scope of the directive.  For assumes
directive which can appear at file or namespace scope it is the whole
TU and everything that functions from there call at runtime, for
begin assumes/end assumes pair all the functions in between those
directives and everything they call and for assume directive the
associated (currently structured) block.  I have no idea how to
represent such holds to be usable for optimizers, except to
make
#pragma omp assume holds (cond)
block;
expand essentially to
[[assume (cond)]];
block;
or
[[assume (cond)]];
block;
[[assume (cond)]];
for now.  Except for holds clause, the other assumptions are
OpenMP related, I'd say we should brainstorm where it would be
useful to optimize based on such information (I guess e.g. in target
regions it easily could) and only when we come up with something
like that think about how to propagate the assumptions to the optimizers.

Will bootstrap/regtest this tonight and commit if it passes the testing.

2022-09-26  Jakub Jelinek  

gcc/c-family/
* c-pragma.h (enum pragma_kind): Add PRAGMA_OMP_ASSUME,
PRAGMA_OMP_ASSUMES and PRAGMA_OMP_BEGIN.  Rename
PRAGMA_OMP_END_DECLARE_TARGET to PRAGMA_OMP_END.
* c-pragma.cc (omp_pragmas): Add assumes and begin.
For end rename PRAGMA_OMP_END_DECLARE_TARGET to PRAGMA_OMP_END.
(omp_pragmas_simd): Add assume.
* c-common.h (c_omp_directives): Declare.
* c-omp.cc (omp_directives): Rename to ...
(c_omp_directives): ... this.  No longer static.  Uncomment
assume, assumes, begin assumes and end assumes entries.
In end declare target entry rename PRAGMA_OMP_END_DECLARE_TARGET
to PRAGMA_OMP_END.
(c_omp_categorize_directive): Adjust for omp_directives to
c_omp_directives renaming.
gcc/c/
* c-lang.h (current_omp_begin_assumes): Declare.
* c-parser.cc: Include bitmap.h.
(c_parser_omp_end_declare_target): Rename to ...
(c_parser_omp_end): ... this.  Handle also end assumes.
(c_parser_omp_begin, c_parser_omp_assumption_clauses,
c_parser_omp_assumes, c_parser_omp_assume): New functions.
(c_parser_translation_unit): Also diagnose #pragma omp begin assumes
without corresponding #pragma omp end assumes.
(c_parser_pragma): Use %s in may only be used at file scope
diagnostics to decrease number of translatable messages.  Handle
PRAGMA_OMP_BEGIN and PRAGMA_OMP_ASSUMES.  Handle PRAGMA_OMP_END
rather than PRAGMA_OMP_END_DECLARE_TARGET and call c_parser_omp_end
for it rather than c_parser_omp_end_declare_target.
(c_parser_omp_construct): Handle PRAGMA_OMP_ASSUME.
* c-decl.cc (current_omp_begin_assumes): Define.
gcc/cp/
* cp-tree.h (struct omp_begin_assumes_data): New type.
(struct saved_scope): Add omp_begin_assumes member.
* parser.cc: Include bitmap.h.
(cp_parser_omp_assumption_clauses, cp_parser_omp_assume,
cp_parser_omp_assumes, cp_parser_omp_begin): New functions.
(cp_parser_omp_end_declare_target): Rename to ...
(cp_parser_omp_end): ... this.  Handle also end assumes.
(cp_parser_omp_construct): Handle PRAGMA_OMP_ASSUME.
(cp_parser_pragma): Handle PRAGMA_OMP_ASSUME, PRAGMA_OMP_ASSUMES
and PRAGMA_OMP_BEGIN.  Handle PRAGMA_OMP_END rather than
PRAGMA_OMP_END_DECLARE_TARGET and call cp_parser_omp_end
for it rather than cp_parser_omp_end_declare_target.
* pt.cc (apply_late_template_attributes): Also temporarily clear
omp_begin_assumes.
* semantics.cc (finish_translation_unit): Also diagnose
#pragma omp begin assumes without corresponding
#pragma omp end assumes.
gcc/testsuite/
* c-c++-common/gomp/assume-1.c: New test.
* c-c++-common/gomp/assume-2.c: New test.
* c-c++-common/gomp/assume-3.c: New test.
* c-c++-common/gomp/assumes-1.c: New test.
* c-c++-common/gomp/assumes-2.c: New test.
* c-c++-common/gomp/assumes-3.c: New test.
* c-c++-common/gomp/assumes-4.c: New test.
* c-c++-common/gomp/begin-assumes-1.c: New test.
* c-c++-common/gomp/begin-assumes-2.c: New test.
* c-

Re: [PATCH v2] c++: Instantiate less when evaluating __is_convertible

2022-09-26 Thread Jason Merrill via Gcc-patches

On 9/26/22 12:25, Marek Polacek wrote:

On Mon, Sep 26, 2022 at 11:51:30AM -0400, Patrick Palka wrote:

On Mon, 26 Sep 2022, Marek Polacek via Gcc-patches wrote:


Jon reported that evaluating __is_convertible in this test leads to
instantiating char_traits<char>::eq, which is invalid (because we
are trying to call a member function on a char) and so we fail to
compile the test.  __is_convertible doesn't and shouldn't need to
instantiate so much, so let's limit it with a cp_unevaluated guard.

Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?

PR c++/106784

gcc/cp/ChangeLog:

* method.cc (is_convertible): Use cp_unevaluated.


I think is_nothrow_convertible would need cp_unevaluated too (or maybe we
should define is_nothrow_convertible in terms of is_convertible).


/facepalm, that's what I get by not using a single implementation for both!
  

And the testcase can probably be minimized to something like:

   struct A;
   struct B { template<typename T> B(const T&) noexcept { T::nonexistent; } };

   static_assert(__is_convertible(const A&, B));
   static_assert(__is_nothrow_convertible(const A&, B));


Adjusted, thanks.

Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?


OK.


-- >8 --
Jon reported that evaluating __is_convertible in a test led to
instantiating something ill-formed and so we failed to compile the test.
__is_convertible doesn't and shouldn't need to instantiate so much, so
let's limit it with a cp_unevaluated guard.  Use a helper function to
implement both built-ins.

PR c++/106784

gcc/cp/ChangeLog:

* method.cc (is_convertible_helper): New.
(is_convertible): Use it.
(is_nothrow_convertible): Likewise.

gcc/testsuite/ChangeLog:

* g++.dg/ext/is_convertible3.C: New test.
* g++.dg/ext/is_nothrow_convertible3.C: New test.
---
  gcc/cp/method.cc  | 23 ---
  gcc/testsuite/g++.dg/ext/is_convertible3.C|  9 
  .../g++.dg/ext/is_nothrow_convertible3.C  |  9 
  3 files changed, 33 insertions(+), 8 deletions(-)
  create mode 100644 gcc/testsuite/g++.dg/ext/is_convertible3.C
  create mode 100644 gcc/testsuite/g++.dg/ext/is_nothrow_convertible3.C

diff --git a/gcc/cp/method.cc b/gcc/cp/method.cc
index c35a59fe56c..9f917f13134 100644
--- a/gcc/cp/method.cc
+++ b/gcc/cp/method.cc
@@ -2236,6 +2236,19 @@ ref_xes_from_temporary (tree to, tree from, bool 
direct_init_p)
return ref_conv_binds_directly (to, val, direct_init_p).is_false ();
  }
  
+/* Worker for is_{,nothrow_}convertible.  Attempt to perform an implicit
+   conversion from FROM to TO and return the result.  */
+
+static tree
+is_convertible_helper (tree from, tree to)
+{
+  if (VOID_TYPE_P (from) && VOID_TYPE_P (to))
+return integer_one_node;
+  cp_unevaluated u;
+  tree expr = build_stub_object (from);
+  return perform_implicit_conversion (to, expr, tf_none);
+}
+
  /* Return true if FROM can be converted to TO using implicit conversions,
 or both FROM and TO are possibly cv-qualified void.  NB: This doesn't
 implement the "Access checks are performed as if from a context unrelated
@@ -2244,10 +2257,7 @@ ref_xes_from_temporary (tree to, tree from, bool 
direct_init_p)
  bool
  is_convertible (tree from, tree to)
  {
-  if (VOID_TYPE_P (from) && VOID_TYPE_P (to))
-return true;
-  tree expr = build_stub_object (from);
-  expr = perform_implicit_conversion (to, expr, tf_none);
+  tree expr = is_convertible_helper (from, to);
if (expr == error_mark_node)
  return false;
return !!expr;
@@ -2258,10 +2268,7 @@ is_convertible (tree from, tree to)
  bool
  is_nothrow_convertible (tree from, tree to)
  {
-  if (VOID_TYPE_P (from) && VOID_TYPE_P (to))
-return true;
-  tree expr = build_stub_object (from);
-  expr = perform_implicit_conversion (to, expr, tf_none);
+  tree expr = is_convertible_helper (from, to);
if (expr == NULL_TREE || expr == error_mark_node)
  return false;
return expr_noexcept_p (expr, tf_none);
diff --git a/gcc/testsuite/g++.dg/ext/is_convertible3.C 
b/gcc/testsuite/g++.dg/ext/is_convertible3.C
new file mode 100644
index 000..7a986d075c2
--- /dev/null
+++ b/gcc/testsuite/g++.dg/ext/is_convertible3.C
@@ -0,0 +1,9 @@
+// PR c++/106784
+// { dg-do compile { target c++11 } }
+// Make sure we don't reject this at runtime by trying to instantiate
+// something that would be ill-formed.
+
+struct A;
+struct B { template<typename T> B(const T&) noexcept { T::nonexistent; } };
+
+static_assert(__is_convertible(const A&, B), "");
diff --git a/gcc/testsuite/g++.dg/ext/is_nothrow_convertible3.C 
b/gcc/testsuite/g++.dg/ext/is_nothrow_convertible3.C
new file mode 100644
index 000..05b1e1d9ad9
--- /dev/null
+++ b/gcc/testsuite/g++.dg/ext/is_nothrow_convertible3.C
@@ -0,0 +1,9 @@
+// PR c++/106784
+// { dg-do compile { target c++11 } }
+// Make sure we don't reject this at runtime by trying to instantiate
+// something that would be ill-formed.
+
+struct A;
+

Re: [PATCH] c++: P2513R4, char8_t Compatibility and Portability Fix [PR106656]

2022-09-26 Thread Jason Merrill via Gcc-patches

On 9/23/22 21:16, Marek Polacek wrote:

P0482R6, which added char8_t, didn't allow

   const char arr[] = u8"howdy";

because it said "Declarations of arrays of char may currently be initialized
with UTF-8 string literals. Under this proposal, such initializations would
become ill-formed."  This caused too many issues, so P2513R4 alleviates some
of those compatibility problems.  In particular, "Arrays of char or unsigned
char may now be initialized with a UTF-8 string literal."  This restriction
has been lifted for initialization only, not implicit conversions.  Also,
my reading is that 'signed char' was excluded from the allowable conversions.

This is supposed to be treated as a DR in C++20.

Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?


OK.
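
A small sketch of what the relaxed rule permits, for readers who want to experiment. It is guarded by the feature-test macro value this patch sets (202207L); the fallback branch and the names `greeting` and `raw` are assumptions added so the snippet also builds with compilers or language modes that predate P2513.

```cpp
// With P2513 (or with char8_t disabled, where u8"" has type const char[N]),
// arrays of char and unsigned char may be initialized from a UTF-8 literal.
#if !defined(__cpp_char8_t) || __cpp_char8_t >= 202207L
constexpr char          greeting[] = u8"howdy";
constexpr unsigned char raw[]      = u8"howdy";
#else
// char8_t is enabled but P2513 is not implemented: fall back to an
// ordinary literal so the sketch still compiles.
constexpr char          greeting[] = "howdy";
constexpr unsigned char raw[]      = "howdy";
#endif

// Five code units plus the terminating null in either branch.
static_assert(sizeof(greeting) == 6, "array size includes the terminator");
static_assert(sizeof(raw) == 6, "array size includes the terminator");
```

Per the description above, the relaxation covers array initialization only; `const char *p = u8"";` and arrays of signed char remain ill-formed when char8_t is enabled.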


PR c++/106656

gcc/c-family/ChangeLog:

* c-cppbuiltin.cc (c_cpp_builtins): Update value of __cpp_char8_t
for C++20.

gcc/cp/ChangeLog:

* typeck2.cc (array_string_literal_compatible_p): Allow
initializing arrays of char or unsigned char by a UTF-8 string literal.

gcc/testsuite/ChangeLog:

* g++.dg/cpp23/feat-cxx2b.C: Adjust.
* g++.dg/cpp2a/feat-cxx2a.C: Likewise.
* g++.dg/ext/char8_t-feature-test-macro-2.C: Likewise.
* g++.dg/ext/char8_t-init-2.C: Likewise.
* g++.dg/cpp2a/char8_t3.C: New test.
* g++.dg/cpp2a/char8_t4.C: New test.
---
  gcc/c-family/c-cppbuiltin.cc  |  2 +-
  gcc/cp/typeck2.cc |  9 +
  gcc/testsuite/g++.dg/cpp23/feat-cxx2b.C   |  4 +-
  gcc/testsuite/g++.dg/cpp2a/char8_t3.C | 37 +++
  gcc/testsuite/g++.dg/cpp2a/char8_t4.C | 17 +
  gcc/testsuite/g++.dg/cpp2a/feat-cxx2a.C   |  4 +-
  .../g++.dg/ext/char8_t-feature-test-macro-2.C |  4 +-
  gcc/testsuite/g++.dg/ext/char8_t-init-2.C |  4 +-
  8 files changed, 72 insertions(+), 9 deletions(-)
  create mode 100644 gcc/testsuite/g++.dg/cpp2a/char8_t3.C
  create mode 100644 gcc/testsuite/g++.dg/cpp2a/char8_t4.C

diff --git a/gcc/c-family/c-cppbuiltin.cc b/gcc/c-family/c-cppbuiltin.cc
index a1557eb23d5..b709f845c81 100644
--- a/gcc/c-family/c-cppbuiltin.cc
+++ b/gcc/c-family/c-cppbuiltin.cc
@@ -1112,7 +1112,7 @@ c_cpp_builtins (cpp_reader *pfile)
if (flag_threadsafe_statics)
cpp_define (pfile, "__cpp_threadsafe_static_init=200806L");
if (flag_char8_t)
-cpp_define (pfile, "__cpp_char8_t=201811L");
+   cpp_define (pfile, "__cpp_char8_t=202207L");
  #ifndef THREAD_MODEL_SPEC
/* Targets that define THREAD_MODEL_SPEC need to define
 __STDCPP_THREADS__ in their config/XXX/XXX-c.c themselves.  */
diff --git a/gcc/cp/typeck2.cc b/gcc/cp/typeck2.cc
index 75fd0e2a9bf..739097a9734 100644
--- a/gcc/cp/typeck2.cc
+++ b/gcc/cp/typeck2.cc
@@ -1118,6 +1118,15 @@ array_string_literal_compatible_p (tree type, tree init)
if (ordinary_char_type_p (to_char_type)
&& ordinary_char_type_p (from_char_type))
  return true;
+
+  /* P2513 (C++20/C++23): "an array of char or unsigned char may
+ be initialized by a UTF-8 string literal, or by such a string
+ literal enclosed in braces."  */
+  if (from_char_type == char8_type_node
+  && (to_char_type == char_type_node
+ || to_char_type == unsigned_char_type_node))
+return true;
+
return false;
  }
  
diff --git a/gcc/testsuite/g++.dg/cpp23/feat-cxx2b.C b/gcc/testsuite/g++.dg/cpp23/feat-cxx2b.C
index d3e40724085..0537e1d24b5 100644
--- a/gcc/testsuite/g++.dg/cpp23/feat-cxx2b.C
+++ b/gcc/testsuite/g++.dg/cpp23/feat-cxx2b.C
@@ -504,8 +504,8 @@
  
  #ifndef __cpp_char8_t
  #  error "__cpp_char8_t"
-#elif __cpp_char8_t != 201811
-#  error "__cpp_char8_t != 201811"
+#elif __cpp_char8_t != 202207
+#  error "__cpp_char8_t != 202207"
  #endif
  
  #ifndef __cpp_designated_initializers

diff --git a/gcc/testsuite/g++.dg/cpp2a/char8_t3.C 
b/gcc/testsuite/g++.dg/cpp2a/char8_t3.C
new file mode 100644
index 000..071a718c4d0
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp2a/char8_t3.C
@@ -0,0 +1,37 @@
+// PR c++/106656 - P2513 - char8_t Compatibility and Portability Fixes
+// { dg-do compile { target c++20 } }
+
+const char *p1 = u8""; // { dg-error "invalid conversion" }
+const unsigned char *p2 = u8""; // { dg-error "invalid conversion" }
+const signed char *p3 = u8""; // { dg-error "invalid conversion" }
+const char *p4 = { u8"" }; // { dg-error "invalid conversion" }
+const unsigned char *p5 = { u8"" }; // { dg-error "invalid conversion" }
+const signed char *p6 = { u8"" }; // { dg-error "invalid conversion" }
+const char *p7 = static_cast<const char *>(u8""); // { dg-error "invalid" }
+const char a1[] = u8"text";
+const unsigned char a2[] = u8"";
+const signed char a3[] = u8""; // { dg-error "cannot initialize array" }
+const char a4[] = { u8"text" };
+const unsigned char a5[] = { u8"" };
+const signed char a6[] = { u8"" }; // { dg-error "cannot initialize array" }
+
+const char *
+resource_id ()
+{
+  stat

[COMMITTED] Optimize [0 = x & MASK] in range-ops.

2022-09-26 Thread Aldy Hernandez via Gcc-patches
For [0 = x & MASK], we can determine that x is ~MASK.  This is
something we're picking up in DOM thanks to maybe_set_nonzero_bits,
but is something we should handle natively.

This is a good example of how much easier to maintain the range-ops
entries are versus the ad-hoc pattern matching stuff we had to do
before.  For the curious, compare the changes to range-op here,
versus maybe_set_nonzero_bits.

I'm leaving the call to maybe_set_nonzero_bits until I can properly
audit it to make sure we're catching it all in range-ops.  It won't
hurt, since both set_range_info() and set_nonzero_bits() are
intersect operations, so we'll never lose information if we do both.
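
The arithmetic behind "x is ~MASK" can be spelled out with plain integers. This is only an illustrative sketch: it does not use GCC's irange API, and the helper names are hypothetical. If x & MASK is zero, every bit set in MASK must be clear in x, so the bits of x that may be nonzero are a subset of ~MASK.

```cpp
#include <cstdint>

// Hypothetical helper: given that (x & mask) == 0, return the set of bits
// that may still be nonzero in x.
constexpr std::uint32_t possible_bits_when_and_is_zero(std::uint32_t mask)
{
  return ~mask;
}

// Checks the claim for one concrete x: either the premise (x & mask) == 0
// does not hold, or x has no bits outside the computed set.
constexpr bool claim_holds(std::uint32_t x, std::uint32_t mask)
{
  return (x & mask) != 0
         || (x & ~possible_bits_when_and_is_zero(mask)) == 0;
}

// Same mask as the selftest in the patch: 7 leaves every bit but the low three.
static_assert(possible_bits_when_and_is_zero(7u) == 0xFFFFFFF8u,
              "low three bits are known to be zero");
static_assert(claim_holds(8u, 7u), "8 & 7 == 0 and 8 fits in ~7");
static_assert(claim_holds(0u, 7u), "zero trivially satisfies the claim");
```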

Tested on x86-64 Linux.

PR tree-optimization/107009

gcc/ChangeLog:

* range-op.cc (operator_bitwise_and::op1_range): Optimize 0 = x & MASK.
(range_op_bitwise_and_tests): New test.
---
 gcc/range-op.cc | 18 ++
 1 file changed, 18 insertions(+)

diff --git a/gcc/range-op.cc b/gcc/range-op.cc
index 072ebd32109..fc930f4d613 100644
--- a/gcc/range-op.cc
+++ b/gcc/range-op.cc
@@ -2951,6 +2951,15 @@ operator_bitwise_and::op1_range (irange &r, tree type,
 }
   if (r.undefined_p ())
 set_nonzero_range_from_mask (r, type, lhs);
+
+  // For 0 = op1 & MASK, op1 is ~MASK.
+  if (lhs.zero_p () && op2.singleton_p ())
+{
+  wide_int nz = wi::bit_not (op2.get_nonzero_bits ());
+  int_range<2> tmp (type);
+  tmp.set_nonzero_bits (nz);
+  r.intersect (tmp);
+}
   return true;
 }
 
@@ -4612,6 +4621,15 @@ range_op_bitwise_and_tests ()
   op_bitwise_and.op1_range (res, integer_type_node, i1, i2);
   ASSERT_TRUE (res == int_range<1> (integer_type_node));
 
+  // For 0 = x & MASK, x is ~MASK.
+  {
+int_range<2> zero (integer_zero_node, integer_zero_node);
+int_range<2> mask = int_range<2> (INT (7), INT (7));
+op_bitwise_and.op1_range (res, integer_type_node, zero, mask);
+wide_int inv = wi::shwi (~7U, TYPE_PRECISION (integer_type_node));
+ASSERT_TRUE (res.get_nonzero_bits () == inv);
+  }
+
   // (NONZERO | X) is nonzero.
   i1.set_nonzero (integer_type_node);
   i2.set_varying (integer_type_node);
-- 
2.37.1



Re: [PATCH v2] c++: Implement C++23 P2266R1, Simpler implicit move [PR101165]

2022-09-26 Thread Jason Merrill via Gcc-patches

On 9/20/22 14:19, Marek Polacek wrote:

On Tue, Sep 06, 2022 at 10:38:12PM -0400, Jason Merrill wrote:

On 9/3/22 12:42, Marek Polacek wrote:

This patch implements https://wg21.link/p2266, which, once again,
changes the implicit move rules.  Here's a brief summary of various
changes in this area:

r125211: Introduced moving from certain lvalues when returning them
r171071: CWG 1148, enable move from value parameter on return
r212099: CWG 1579, it's OK to call a converting ctor taking an rvalue
r251035: CWG 1579, do maybe-rvalue overload resolution twice
r11-2411: Avoid calling const copy ctor on implicit move
r11-2412: C++20 implicit move changes, remove the fallback overload
resolution, allow move on throw of parameters and implicit
  move of rvalue references

P2266 enables the implicit move for functions that return references.  This
was a one-line change: check TYPE_REF_P.  That is, we will now perform
a move in

X&& foo (X&& x) {
  return x;
}
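
A guarded sketch of the same pattern, for trying it across language modes. The guard assumes that a compiler defining the __cpp_implicit_move feature-test macro implements the P2266 rule; `pass_through` is a hypothetical name, and the std::move fallback is only there so the snippet builds in older modes.

```cpp
#include <type_traits>
#include <utility>

struct X { };

#if defined(__cpp_implicit_move)
// P2266: x is move-eligible, so the returned id-expression is treated as
// an xvalue and can bind to the X&& return type.
X&& pass_through(X&& x) { return x; }
#else
// Before P2266, x is an lvalue here and the move has to be spelled out.
X&& pass_through(X&& x) { return std::move(x); }
#endif

static_assert(std::is_same<decltype(pass_through(std::declval<X>())), X&&>::value,
              "the function returns an rvalue reference in both branches");
```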

P2266 also removes the fallback overload resolution, but this was
resolved by r11-2412: we only do convert_for_initialization with
LOOKUP_PREFER_RVALUE in C++17 and older.


I wonder if we want to extend the current C++20 handling to the older modes
for GCC 13?  Not in this patch, but as a followup.


Yes, I think that would be very nice if we removed that code.
  

P2266 also says that a returned move-eligible id-expression is always an
xvalue.  This required some further short, but nontrivial changes,
especially when it comes to deduction, because we have to pay attention
to whether we have auto, auto&& (which is like T&&), or decltype(auto)
with (un)parenthesized argument.  In C++23,

decltype(auto) f(int&& x) { return (x); }
auto&& f(int x) { return x; }

both should deduce to 'int&&' but

decltype(auto) f(int x) { return x; }

should deduce to 'int'.  A cornucopia of tests attached.  I've also
verified that we behave like clang++.

xvalue_p seemed to be broken: since the introduction of clk_implicit_rval,
it cannot use '==' when checking for clk_rvalueref.

Since this change breaks code, it's only enabled in C++23.  In
particular, this code will not compile in C++23:

int& g(int&& x) { return x; }


Nice that the C++20 compatibility is so simple!


because x is now treated as an rvalue, and you can't bind a non-const lvalue
reference to an rvalue.

There's one FIXME in elision1.C:five, which we should compile but reject
with "passing 'Mutt' as 'this' argument discards qualifiers".  That
looks bogus to me, I think I'll open a PR for it.


Let's fix that now, I think.


OK, copypasting this bit from the other email so that we can have one
thread:


Can of worms.   The test is

struct Mutt {
operator int*() &&;
};

int* five(Mutt x) {
return x;  // OK since C++20 because P1155
}

'x' should be treated as an rvalue, therefore the operator fn taking
an rvalue ref to Mutt should be used to convert 'x' to int*.  We fail
because we don't treat 'x' as an rvalue because the function doesn't
return a class.  So the patch should be just

--- a/gcc/cp/typeck.cc
+++ b/gcc/cp/typeck.cc
@@ -10875,10 +10875,7 @@ check_return_expr (tree retval, bool *no_warning)
Note that these conditions are similar to, but not as strict as,
   the conditions for the named return value optimization.  */
 bool converted = false;
-  tree moved;
-  /* This is only interesting for class type.  */
-  if (CLASS_TYPE_P (functype)
- && (moved = treat_lvalue_as_rvalue_p (retval, /*return*/true)))
+  if (tree moved = treat_lvalue_as_rvalue_p (retval, /*return*/true))
  {
if (cxx_dialect < cxx20)
  {

which fixes the test, but breaks a lot of middle-end warnings.  For instance
g++.dg/warn/nonnull3.C, where the patch above changes .gimple:

   bool A::foo (struct A * const this, <<< Unknown tree: offset_type >>> p)
   {
-  bool D.2146;
+  bool D.2150;
   
-  D.2146 = p != -1;
-  return D.2146;
+  p.0_1 = p;
+  D.2150 = p.0_1 != -1;
+  return D.2150;
   }

and we no longer get the warning.  I thought maybe I could undo the implicit
rvalue conversion in cp_fold, when it sees implicit_rvalue_p, but that didn't
work.  So currently I'm stuck.  Should we try to figure this out or push aside?



Can you undo the implicit rvalue conversion within check_return_expr,
where we can still refer back to the original expression?


Unfortunately no, one problem is that treat_lvalue_as_rvalue_p modifies
the underlying decl by setting TREE_ADDRESSABLE, which then presumably
breaks warnings.  That is, treat_ can get 'VCE(x)' and produce
'*NLE<(X&) &x>' where 'x' flags have been modified, since we're taking
x's address.


Or avoid the rvalue conversion if the return type is scalar?


I wish :(.  In the 'five' example above, the return type is a pointer,
a scalar, but we have to convert to rvalue.
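
For readers following along, here is a minimal sketch of the 'five' case. It uses the explicit std::move spelling, which is accepted in every dialect from C++11 on and selects the same &&-qualified conversion that P1155 makes implicit in C++20. The static 'slot' and the name 'five_explicit' are illustrative additions, not part of the testcase.

```cpp
#include <utility>

// Illustrative storage so the conversion has something to point at.
static int slot = 5;

struct Mutt {
  // Only callable on an rvalue Mutt, as in the testcase above.
  operator int*() && { return &slot; }
};

// The return type is a pointer (a scalar), yet 'x' still has to be
// treated as an rvalue for the conversion operator to be viable.
// Writing std::move explicitly forces that in any dialect.
int* five_explicit(Mutt x) {
  return std::move(x);
}
```

Calling five_explicit(Mutt{}) yields a pointer to 'slot'. The point of the discussion above is that plain 'return x;' is expected to behave the same way in C++20.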


OK, then when both the return type and the type of the return value are 
scalar

Re: [Patch] libgomp/nvptx: Prepare for reverse-offload callback handling

2022-09-26 Thread Alexander Monakov via Gcc-patches


Hi.

My main concerns remain not addressed:

1) what I said in the opening paragraphs of my previous email;

2) device-issued atomics are not guaranteed to appear atomic to the host
unless using atom.sys and translating for CUDA compute capability 6.0+.

Item 2 is a correctness issue. Item 1 I think is a matter of policy that
is up to you to hash out with Jakub.

On Mon, 26 Sep 2022, Tobias Burnus wrote:

> In theory, compiling with "-m32 -foffload-options=-m64" or "-m32
> -foffload-options=-m32" or "-m64 -foffload-options=-m32" is supported.

I have no words.

Alexander


Re: [PATCH] c++ modules: ICE with class NTTP argument [PR100616]

2022-09-26 Thread Patrick Palka via Gcc-patches
On Mon, 26 Sep 2022, Nathan Sidwell wrote:

> On 9/26/22 10:08, Nathan Sidwell wrote:
> > On 9/23/22 09:32, Patrick Palka wrote:
> > 
> > > Judging by the two commits that introduced/modified this part of
> > > maybe_register_incomplete_var, r196852 and r214333, ISTM the code
> > > is really only concerned with constexpr static data members (whose
> > > initializer may contain a pointer-to-member for a currently open class).
> > > So maybe we ought to restrict the branch like so, which effectively
> > > disables this part of maybe_register_incomplete_var during stream-in, and
> > > guarantees that outermost_open_class doesn't return NULL if the branch is
> > > taken?
> > 
> > I think the problem is that we're streaming these VAR_DECLs as regular
> > VAR_DECLS, when we should be handling them as a new kind of object fished
> > out from the template they're instantiating. (I'm guessing that'll just be a
> > new tag, a type and an initializer?)
> > 
> > Then on stream-in we can handle them in the same way as a non-modules
> > compilation handles such redeclarations.  I.e. how does:
> > 
> > > template<auto> struct C { };
> > > struct A { };
> > > C<A{}> c1; // #1
> > > C<A{}> c2; // #2
> > > 
> > > work.  Presumably at some point #2's A{} gets unified such that we find the
> > > instantiation that occurred at #1?

This works because the lookup in get_template_parm_object for #2's A{}
finds and reuses the VAR_DECL created for #1's A{}.

But IIUC this lookup (performed via get_global_binding) isn't
import-aware, which I suppose explains why we don't find the VAR_DECL
from another TU.

> > 
> > I notice the template arg for C is a var decl mangled as _ZTAXtl1AEE,
> > > which is a 'template parameter object for A{}'.  I see that's a special
> > mangler 'mangle_template_parm_object', called from
> > get_template_parm_object.  Perhaps these VAR_DECLs need an additional
> > in-tree flag that the streamer can check for?
> 
> I wonder if we're setting the module attachment for these variables sanely?
> They should be attached to the global module.  My guess is the
> pushdecl_top_level_and_finish call in get_template_parm_object is not doing
> what is needed (as well as the other issues).

This is a bit of a shot in the dark, but the following seems to work:
when pushing the VAR_DECL, we need to call set_originating_module to
attach it to the global module, and when looking it up, we need to do so
in an import-aware way.  Hopefully something like this is sufficient
to properly handle these VAR_DECLs and we don't need to stream them
specially?

-- >8 --

Subject: [PATCH] c++ modules: ICE with class NTTP argument [PR100616]

The function get_template_parm_object returns an artificial mangled
VAR_DECL that's unique to the given class NTTP argument.  To enforce
this uniqueness, we first look up the mangled name of the VAR_DECL from
the global scope via get_global_binding, and only create/push a VAR_DECL
if this lookup fails.

But with modules, we need to do more to enforce uniqueness: the VAR_DECL
needs to be attached to the global module, and the lookup needs to be
import-aware, which get_global_binding currently isn't.

So this patch makes us call set_originating_module from
get_template_parm_object before pushing, and makes get_namespace_binding
use name_lookup::search_qualified which does look into imports.

It turns out this change to get_namespace_binding also fixes PR102576
where we were failing to find an imported std::initializer_list due to
the function not being import-aware.
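
The lookup-then-create scheme described above can be sketched with a toy model. Everything here (ToyDecl, ToyScope, the two tables) is hypothetical and only mirrors the shape of the problem; it is not GCC's data structure or API. The idea is that a lookup which ignores imported bindings creates a second object for a name that already exists.

```cpp
#include <map>
#include <string>

// Hypothetical stand-in for a VAR_DECL; not a GCC type.
struct ToyDecl {
  std::string mangled_name;
  int id;
};

// Hypothetical global scope with separate tables for bindings made in
// this TU and bindings that are only visible through imports.
struct ToyScope {
  std::map<std::string, ToyDecl> local;
  std::map<std::string, ToyDecl> imported;
  int next_id = 0;

  static const ToyDecl* find_in(const std::map<std::string, ToyDecl>& table,
                                const std::string& name) {
    auto it = table.find(name);
    return it == table.end() ? nullptr : &it->second;
  }

  // Return the existing object for NAME, or create one if none is found.
  // With import_aware == false the imported table is never consulted.
  const ToyDecl& get_or_create(const std::string& name, bool import_aware) {
    const ToyDecl* found = find_in(local, name);
    if (!found && import_aware)
      found = find_in(imported, name);
    if (found)
      return *found;
    auto it = local.emplace(name, ToyDecl{name, next_id++}).first;
    return it->second;
  }
};
```

If "_ZTAXtl1AEE" is present only in 'imported', the import-aware call returns that entry, while the non-aware call pushes a fresh duplicate into 'local'. That duplicate is the toy analogue of the uniqueness violation.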

Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK for
trunk?

PR c++/100616
PR c++/102576

gcc/cp/ChangeLog:

* name-lookup.cc (get_namespace_binding): Rewrite in terms of
name_lookup::search_unqualified.
* pt.cc (get_template_parm_object): Call set_originating_module
before pushing the VAR_DECL.

gcc/testsuite/ChangeLog:

* g++.dg/modules/pr100616_a.C: New test.
* g++.dg/modules/pr100616_b.C: New test.
* g++.dg/modules/pr102576_a.H: New test.
* g++.dg/modules/pr102576_b.C: New test.
---
 gcc/cp/name-lookup.cc | 21 ++---
 gcc/cp/pt.cc  |  1 +
 gcc/testsuite/g++.dg/modules/pr100616_a.C |  8 
 gcc/testsuite/g++.dg/modules/pr100616_b.C |  8 
 gcc/testsuite/g++.dg/modules/pr102576_a.H |  5 +
 gcc/testsuite/g++.dg/modules/pr102576_b.C |  9 +
 6 files changed, 37 insertions(+), 15 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/modules/pr100616_a.C
 create mode 100644 gcc/testsuite/g++.dg/modules/pr100616_b.C
 create mode 100644 gcc/testsuite/g++.dg/modules/pr102576_a.H
 create mode 100644 gcc/testsuite/g++.dg/modules/pr102576_b.C

diff --git a/gcc/cp/name-lookup.cc b/gcc/cp/name-lookup.cc
index 69d555ddf1f..6cd73276cf5 100644
--- a/gcc/cp/name-lookup.cc
+++ b/gcc/cp/name-lookup.cc
@@ -5772,9 +5772,7 @@ do_class_using_decl (tree scope, tree name)
 
 
 /* Return the binding for NAME in 

Re: [PATCH] c++ modules: ICE with class NTTP argument [PR100616]

2022-09-26 Thread Patrick Palka via Gcc-patches
On Mon, 26 Sep 2022, Patrick Palka wrote:

> On Mon, 26 Sep 2022, Nathan Sidwell wrote:
> 
> > On 9/26/22 10:08, Nathan Sidwell wrote:
> > > On 9/23/22 09:32, Patrick Palka wrote:
> > > 
> > > > Judging by the two commits that introduced/modified this part of
> > > > maybe_register_incomplete_var, r196852 and r214333, ISTM the code
> > > > is really only concerned with constexpr static data members (whose
> > > > initializer may contain a pointer-to-member for a currently open class).
> > > > So maybe we ought to restrict the branch like so, which effectively
> > > > disables this part of maybe_register_incomplete_var during stream-in, 
> > > > and
> > > > guarantees that outermost_open_class doesn't return NULL if the branch 
> > > > is
> > > > taken?
> > > 
> > > I think the problem is that we're streaming these VAR_DECLs as regular
> > > VAR_DECLS, when we should be handling them as a new kind of object fished
> > > out from the template they're instantiating. (I'm guessing that'll just 
> > > be a
> > > new tag, a type and an initializer?)
> > > 
> > > Then on stream-in we can handle them in the same way as a non-modules
> > > compilation handles such redeclarations.  I.e. how does:
> > > 
> > > > template<auto> struct C { };
> > > > struct A { };
> > > > C<A{}> c1; // #1
> > > > C<A{}> c2; // #2
> > > > 
> > > > work.  Presumably at some point #2's A{} gets unified such that we find 
> > > > the
> > > > instantiation that occurred at #1?
> 
> This works because the lookup in get_template_parm_object for #2's A{}
> finds and reuses the VAR_DECL created for #1's A{}.
> 
> But IIUC this lookup (performed via get_global_binding) isn't
> import-aware, which I suppose explains why we don't find the VAR_DECL
> from another TU.
> 
> > > 
> > > I notice the template arg for C is a var decl mangled as _ZTAXtl1AEE,
> > > > which is a 'template parameter object for A{}'.  I see that's a special
> > > mangler 'mangle_template_parm_object', called from
> > > get_template_parm_object.  Perhaps these VAR_DECLs need an additional
> > > in-tree flag that the streamer can check for?
> > 
> > I wonder if we're setting the module attachment for these variables sanely?
> > They should be attached to the global module.  My guess is the
> > pushdecl_top_level_and_finish call in get_template_parm_object is not doing
> > what is needed (as well as the other issues).
> 
> This is a bit of a shot in the dark, but the following seems to work:
> when pushing the VAR_DECL, we need to call set_originating_module to
> attach it to the global module, and when looking it up, we need to do so
> in an import-aware way.  Hopefully something like this is sufficient
> to properly handle these VAR_DECLs and we don't need to stream them
> specially?

Err, rather than changing the behavior of get_namespace_binding (which
has many unrelated callers), I guess we could just use the already
import-aware lookup_qualified_name instead where appropriate.  WDYT of
the following? (testing in progress)

-- >8 --

Subject: [PATCH] c++ modules: ICE with class NTTP argument [PR100616]

PR c++/100616
PR c++/102576

gcc/cp/ChangeLog:

* pt.cc (get_template_parm_object): Use lookup_qualified_name
instead of get_global_binding.  Call set_originating_module before
pushing the VAR_DECL.
(listify): Use lookup_qualified_name instead of get_global_binding.

gcc/testsuite/ChangeLog:

* g++.dg/modules/pr100616_a.C: New test.
* g++.dg/modules/pr100616_b.C: New test.
* g++.dg/modules/pr102576_a.H: New test.
* g++.dg/modules/pr102576_b.C: New test.
---
 gcc/cp/pt.cc  | 10 ++
 gcc/testsuite/g++.dg/modules/pr100616_a.C |  8 
 gcc/testsuite/g++.dg/modules/pr100616_b.C |  8 
 gcc/testsuite/g++.dg/modules/pr102576_a.H |  5 +
 gcc/testsuite/g++.dg/modules/pr102576_b.C |  9 +
 5 files changed, 36 insertions(+), 4 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/modules/pr100616_a.C
 create mode 100644 gcc/testsuite/g++.dg/modules/pr100616_b.C
 create mode 100644 gcc/testsuite/g++.dg/modules/pr102576_a.H
 create mode 100644 gcc/testsuite/g++.dg/modules/pr102576_b.C

diff --git a/gcc/cp/pt.cc b/gcc/cp/pt.cc
index 1f088fe281e..e030d9db2f6 100644
--- a/gcc/cp/pt.cc
+++ b/gcc/cp/pt.cc
@@ -7284,8 +7284,8 @@ get_template_parm_object (tree expr, tsubst_flags_t 
complain)
   gcc_assert (!TREE_HAS_CONSTRUCTOR (expr));
 
   tree name = mangle_template_parm_object (expr);
-  tree decl = get_global_binding (name);
-  if (decl)
+  tree decl = lookup_qualified_name (global_namespace, name);
+  if (decl != error_mark_node)
 return decl;
 
   tree type = cp_build_qualified_type (TREE_TYPE (expr), TYPE_QUAL_CONST);
@@ -7307,6 +7307,7 @@ get_template_parm_object (tree expr, tsubst_flags_t 
complain)
   hash_map_safe_put (tparm_obj_values, decl, copy);
 }
 
+  set_originating_module (decl);
   pushdecl_top_level_and_finish (decl, expr);
 
   return decl;
@@ -29150,9 +29151,

[PATCH][pushed] docs: add missing dash in option name

2022-09-26 Thread Martin Liška
Pushed as obvious.

Martin

gcc/ChangeLog:

* doc/invoke.texi: Add missing dash for
  Wanalyzer-exposure-through-uninit-copy.
---
 gcc/doc/invoke.texi | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index d2e4abd3484..383d22a4bf4 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -9901,7 +9901,7 @@ security-sensitive value is written to an output file
 
 See @uref{https://cwe.mitre.org/data/definitions/532.html, CWE-532: 
Information Exposure Through Log Files}.
 
-@item Wanalyzer-exposure-through-uninit-copy
+@item -Wanalyzer-exposure-through-uninit-copy
 @opindex Wanalyzer-exposure-through-uninit-copy
 @opindex Wno-analyzer-exposure-through-uninit-copy
 This warning requires both @option{-fanalyzer} and the use of a plugin
-- 
2.37.3



Re: Extend fold_vec_perm to fold VEC_PERM_EXPR in VLA manner

2022-09-26 Thread Prathamesh Kulkarni via Gcc-patches
On Fri, 23 Sept 2022 at 21:33, Richard Sandiford
 wrote:
>
> Prathamesh Kulkarni  writes:
> > On Tue, 20 Sept 2022 at 18:09, Richard Sandiford
> >  wrote:
> >>
> >> Prathamesh Kulkarni  writes:
> >> > On Mon, 12 Sept 2022 at 19:57, Richard Sandiford
> >> >  wrote:
> >> >>
> >> >> Prathamesh Kulkarni  writes:
> >> >> >> The VLA encoding encodes the first N patterns explicitly.  The
> >> >> >> npatterns/nelts_per_pattern values then describe how to extend that
> >> >> >> initial sequence to an arbitrary number of elements.  So when 
> >> >> >> performing
> >> >> >> an operation on (potentially) variable-length vectors, the question 
> >> >> >> is:
> >> >> >>
> >> >> >> * Can we work out an initial sequence and npatterns/nelts_per_pattern
> >> >> >>   pair that will be correct for all elements of the result?
> >> >> >>
> >> >> >> This depends on the operation that we're performing.  E.g. it's
> >> >> >> different for unary operations (vector_builder::new_unary_operation)
> >> >> >> and binary operations (vector_builder::new_binary_operation).  It 
> >> >> >> also
> >> >> >> varies between unary operations and between binary operations, hence
> >> >> >> the allow_stepped_p parameters.
> >> >> >>
> >> >> >> For VEC_PERM_EXPR, I think the key requirement is that:
> >> >> >>
> >> >> >> (R) Each individual selector pattern must always select from the 
> >> >> >> same vector.
> >> >> >>
> >> >> >> Whether this condition is met depends both on the pattern itself and 
> >> >> >> on
> >> >> >> the number of patterns that it's combined with.
> >> >> >>
> >> >> >> E.g. suppose we had the selector pattern:
> >> >> >>
> >> >> >>   { 0, 1, 4, ... }   i.e. 3x - 2 for x > 0
> >> >> >>
> >> >> >> If the arguments and selector are n elements then this pattern on its
> >> >> >> own would select from more than one argument if 3(n-1) - 2 >= n.
> >> >> >> This is clearly true for large enough n.  So if n is variable then
> >> >> >> we cannot represent this.
> >> >> >>
> >> >> >> If the pattern above is one of two patterns, so interleaved as:
> >> >> >>
> >> >> >>  { 0, _, 1, _, 4, _, ... }  o=0
> >> >> >>   or { _, 0, _, 1, _, 4, ... }  o=1
> >> >> >>
> >> >> >> then the pattern would select from more than one argument if
> >> >> >> 3(n/2-1) - 2 + o >= n.  This too would be a problem for variable n.
> >> >> >>
> >> >> >> But if the pattern above is one of four patterns then it selects
> >> >> >> from more than one argument if 3(n/4-1) - 2 + o >= n.  This is not
> >> >> >> true for any valid n or o, so the pattern is OK.
> >> >> >>
> >> >> >> So let's define some ad hoc terminology:
> >> >> >>
> >> >> >> * Px is the number of patterns in x
> >> >> >> * Ex is the number of elements per pattern in x
> >> >> >>
> >> >> >> where x can be:
> >> >> >>
> >> >> >> * 1: first argument
> >> >> >> * 2: second argument
> >> >> >> * s: selector
> >> >> >> * r: result
> >> >> >>
> >> >> >> Then:
> >> >> >>
> >> >> >> (1) The number of elements encoded explicitly for x is Ex*Px
> >> >> >>
> >> >> >> (2) The explicit encoding can be used to produce a sequence of 
> >> >> >> N*Ex*Px
> >> >> >> elements for any integer N.  This extended sequence can be 
> >> >> >> reencoded
> >> >> >> as having N*Px patterns, with Ex staying the same.
> >> >> >>
> >> >> >> (3) If Ex < 3, Ex can be increased by 1 by repeating the final Px 
> >> >> >> elements
> >> >> >> of the explicit encoding.
> >> >> >>
> >> >> >> So let's assume (optimistically) that we can produce the result
> >> >> >> by calculating the first Pr*Er elements and using the Pr,Er encoding
> >> >> >> to imply the rest.  Then:
> >> >> >>
> >> >> >> * (2) means that, when combining multiple input operands with 
> >> >> >> potentially
> >> >> >>   different encodings, we can set the number of patterns in the 
> >> >> >> result
> >> >> >>   to the least common multiple of the number of patterns in the 
> >> >> >> inputs.
> >> >> >>   In this case:
> >> >> >>
> >> >> >>   Pr = least_common_multiple(P1, P2, Ps)
> >> >> >>
> >> >> >>   is a valid number of patterns.
> >> >> >>
> >> >> >> * (3) means that the number of elements per pattern of the result can
> >> >> >>   be the maximum of the number of elements per pattern in the inputs.
> >> >> >>   (Alternatively, we could always use 3.)  In this case:
> >> >> >>
> >> >> >>   Er = max(E1, E2, Es)
> >> >> >>
> >> >> >>   is a valid number of elements per pattern.
> >> >> >>
> >> >> >> So if (R) holds we can compute the result -- for both VLA and VLS -- 
> >> >> >> by
> >> >> >> calculating the first Pr*Er elements of the result and using the
> >> >> >> encoding to derive the rest.  If (R) doesn't hold then we need the
> >> >> >> selector to be constant-length.  We should then fill in the result
> >> >> >> based on:
> >> >> >>
> >> >> >> - Pr == number of elements in the result
> >> >> >> - Er == 1
> >> >> >>
> >> >> >> But this should be the fallback option, even for VLS.
> >> >> >>
> >> >> >> As far as the arguments go: we should reject 
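
The encoding and the selector example quoted above can be sketched as follows. This is a toy model under stated assumptions: ToyEncoding, toy_elt and crosses_inputs are illustrative names, not GCC's vector_builder interface. toy_elt extends an explicit encoding to any element index, and crosses_inputs is the inequality from the message for the pattern 3x - 2, taken literally.

```cpp
#include <cstddef>
#include <vector>

// Toy encoded vector: npatterns interleaved patterns, each described by
// nelts_per_pattern (1, 2 or 3) explicit elements, stored interleaved.
struct ToyEncoding {
  std::size_t npatterns;
  std::size_t nelts_per_pattern;
  std::vector<long> encoded;  // npatterns * nelts_per_pattern values
};

// Element I of the (conceptually unbounded) vector implied by E.
long toy_elt(const ToyEncoding& e, std::size_t i) {
  if (i < e.encoded.size())
    return e.encoded[i];                      // explicitly encoded
  std::size_t pattern = i % e.npatterns;      // which pattern I belongs to
  std::size_t count = i / e.npatterns;        // index within that pattern
  std::size_t final_i = e.encoded.size() - e.npatterns + pattern;
  long final_value = e.encoded[final_i];
  if (e.nelts_per_pattern <= 2)
    return final_value;                       // last value repeats
  long prev = e.encoded[final_i - e.npatterns];
  // Three elements per pattern: continue the series with a fixed step.
  return final_value + static_cast<long>(count - 2) * (final_value - prev);
}

// The condition from the message: with N elements, NPATTERNS patterns and
// offset O, does the pattern { 0, 1, 4, ... } select from more than one
// input vector?
bool crosses_inputs(long n, long npatterns, long o) {
  return 3 * (n / npatterns - 1) - 2 + o >= n;
}
```

With the single pattern { 0, 1, 4 } the extension follows 3x - 2, so element 5 is 13. The inequality holds for one or two patterns once n is large enough, but never for four patterns, which matches the conclusion in the quoted text.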

Re: [PATCH] Ignore debug insns with CONCAT and CONCATN for insn scheduling

2022-09-26 Thread H.J. Lu via Gcc-patches
On Sat, Sep 24, 2022 at 1:37 PM Jeff Law  wrote:
>
>
> On 9/21/22 16:11, H.J. Lu wrote:
> > On Wed, Sep 7, 2022 at 10:03 AM Jeff Law via Gcc-patches
> >  wrote:
> >>
> >>
> >> On 9/2/2022 8:36 AM, H.J. Lu via Gcc-patches wrote:
> >>> CONCAT and CONCATN never appear in the insn chain.  They are only used
> >>> in debug insn.  Ignore debug insns with CONCAT and CONCATN for insn
> >>> scheduling to avoid different insn orders with and without debug insn.
> >>>
> >>> gcc/
> >>>
> >>>PR rtl-optimization/106746
> >>>* sched-deps.cc (sched_analyze_2): Ignore debug insns with CONCAT
> >>>and CONCATN.
> >> Shouldn't we be ignoring everything in a debug insn?   I don't see why
> >> CONCAT/CONCATN are special here.
> > Debug insns are processed by insn scheduling.   I think it is to improve 
> > debug
> > experiences.  It is just that there are no matching usages of CONCAT/CONCATN
> > in non-debug insns.
>
> But from a dependency standpoint ISTM all debug insn can be ignored.  I
> still don't see why concat/concatn should be special here.
>

I tried to ignore everything in a debug insn.  It caused many regressions in
the GCC testsuite.

-- 
H.J.


[PATCH v2] c++: Don't quote nothrow in diagnostic

2022-09-26 Thread Marek Polacek via Gcc-patches
On Mon, Sep 26, 2022 at 12:34:04PM -0400, Jason Merrill wrote:
> On 9/26/22 03:50, Richard Biener wrote:
> > On Fri, Sep 23, 2022 at 8:41 PM Marek Polacek via Gcc-patches
> >  wrote:
> > > 
> > > In 
> > > Jason noticed that we quote "nothrow" in diagnostics even though it's
> > > not a keyword in C++.  Just removing the quotes didn't work because
> > > then -Wformat-diag complains, so this patch replaces it with "no-throw".
> > > 
> > > Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?
> > 
> > That doesn't look like an improvement to me.  Can we quote 'nothrow()' 
> > instead?

Understood.
 
> nothrow() is a syntax error; the C++11 keyword is 'noexcept'. std::nothrow
> is a dummy placement argument used to indicate that a new-expression should
> return null rather than throw on failure.
> 
> But bizarrely, the library traits use the word "nothrow".  Marek's patch
> clarifies that we are not trying to refer to anything in the language.
> 
> > I'd rather leave it alone than changing it to no-throw.  Why does 
> > -Wformat-diag
> > complain?  If we shouldn't quote nothrow that should be adjusted?
> 
> I think -Wformat-diag complains because "nothrow" is an attribute; it also
> includes some other attribute names in the list of "keywords".
> 
> I would also be fine with just removing the quotes and removing nothrow from
> c_keywords.

Like below?   Bootstrapped/regtested on x86_64-pc-linux-gnu.

Note that now I see warnings with my system compiler (gcc-12.2.1).  Can
I commit the c-format.cc hunk to gcc 12 so that eventually even gcc 12
stops warning?
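
To make the distinction drawn above concrete, here is a small sketch of the three things that get called "nothrow" in C++: the 'noexcept' keyword, the std::nothrow placement argument, and the library traits whose names contain "nothrow". The types Quiet and Loud and the function names are illustrative only.

```cpp
#include <new>
#include <type_traits>

struct Quiet { Quiet() noexcept {} };        // constructor promises not to throw
struct Loud  { Loud() noexcept(false) {} };  // constructor may throw

// The library traits use the word "nothrow" even though the language
// keyword is 'noexcept'.
static_assert(std::is_nothrow_default_constructible<Quiet>::value,
              "Quiet is nothrow default constructible");
static_assert(!std::is_nothrow_default_constructible<Loud>::value,
              "Loud is not nothrow default constructible");

// 'noexcept' as a specifier on the function and as an operator below.
void quiet_fn() noexcept {}
static_assert(noexcept(quiet_fn()), "quiet_fn is declared noexcept");

// std::nothrow is a placement argument: on allocation failure this
// new-expression returns a null pointer instead of throwing.
int* try_allocate() {
  return new (std::nothrow) int(42);
}
```

A caller of try_allocate has to check the result for null before dereferencing it, and release it with plain 'delete'.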

-- >8 --
In 
Jason noticed that we quote "nothrow" in diagnostics even though it's
not a keyword in C++.  This patch removes the quotes and also drops
"nothrow" from c_keywords.

gcc/c-family/ChangeLog:

* c-format.cc (c_keywords): Drop nothrow.

gcc/cp/ChangeLog:

* constraint.cc (diagnose_trait_expr): Say "nothrow" without quotes
rather than in quotes.

gcc/testsuite/ChangeLog:

* g++.dg/cpp2a/concepts-traits3.C: Adjust expected diagnostics.
---
 gcc/c-family/c-format.cc  |  3 +--
 gcc/cp/constraint.cc  | 14 +++---
 gcc/testsuite/g++.dg/cpp2a/concepts-traits3.C |  8 
 3 files changed, 12 insertions(+), 13 deletions(-)

diff --git a/gcc/c-family/c-format.cc b/gcc/c-family/c-format.cc
index a6c380bf1c8..a2026591ed1 100644
--- a/gcc/c-family/c-format.cc
+++ b/gcc/c-family/c-format.cc
@@ -2900,7 +2900,7 @@ static const token_t cxx_opers[] =
   };
 
 /* Common C/C++ keywords that are expected to be quoted within the format
-   string.  Keywords like auto, inline, or volatile are exccluded because
+   string.  Keywords like auto, inline, or volatile are excluded because
they are sometimes used in common terms like /auto variables/, /inline
function/, or /volatile access/ where they should not be quoted.  */
 
@@ -2927,7 +2927,6 @@ static const token_t c_keywords[] =
NAME ("noinline", NULL),
NAME ("nonnull", NULL),
NAME ("noreturn", NULL),
-   NAME ("nothrow", NULL),
NAME ("offsetof", NULL),
NAME ("readonly", "read-only"),
NAME ("readwrite", "read-write"),
diff --git a/gcc/cp/constraint.cc b/gcc/cp/constraint.cc
index 5839bfb4b52..266ec581a20 100644
--- a/gcc/cp/constraint.cc
+++ b/gcc/cp/constraint.cc
@@ -3592,13 +3592,13 @@ diagnose_trait_expr (tree expr, tree args)
   switch (TRAIT_EXPR_KIND (expr))
 {
 case CPTK_HAS_NOTHROW_ASSIGN:
-  inform (loc, "  %qT is not %<nothrow%> copy assignable", t1);
+  inform (loc, "  %qT is not nothrow copy assignable", t1);
   break;
 case CPTK_HAS_NOTHROW_CONSTRUCTOR:
-  inform (loc, "  %qT is not %<nothrow%> default constructible", t1);
+  inform (loc, "  %qT is not nothrow default constructible", t1);
   break;
 case CPTK_HAS_NOTHROW_COPY:
-  inform (loc, "  %qT is not %<nothrow%> copy constructible", t1);
+  inform (loc, "  %qT is not nothrow copy constructible", t1);
   break;
 case CPTK_HAS_TRIVIAL_ASSIGN:
   inform (loc, "  %qT is not trivially copy assignable", t1);
@@ -3674,7 +3674,7 @@ diagnose_trait_expr (tree expr, tree args)
   inform (loc, "  %qT is not trivially assignable from %qT", t1, t2);
   break;
 case CPTK_IS_NOTHROW_ASSIGNABLE:
-  inform (loc, "  %qT is not %<nothrow%> assignable from %qT", t1, t2);
+  inform (loc, "  %qT is not nothrow assignable from %qT", t1, t2);
   break;
 case CPTK_IS_CONSTRUCTIBLE:
   if (!t2)
@@ -3690,9 +3690,9 @@ diagnose_trait_expr (tree expr, tree args)
   break;
 case CPTK_IS_NOTHROW_CONSTRUCTIBLE:
   if (!t2)
-   inform (loc, "  %qT is not %<nothrow%> default constructible", t1);
+   inform (loc, "  %qT is not nothrow default constructible", t1);
   else
-   inform (loc, "  %qT is not %<nothrow%> constructible from %qE", t1, t2);
+   inform (loc,

Re: Extend fold_vec_perm to fold VEC_PERM_EXPR in VLA manner

2022-09-26 Thread Richard Sandiford via Gcc-patches
Prathamesh Kulkarni  writes:
> On Fri, 23 Sept 2022 at 21:33, Richard Sandiford
>  wrote:
>>
>> Prathamesh Kulkarni  writes:
>> > On Tue, 20 Sept 2022 at 18:09, Richard Sandiford
>> >  wrote:
>> >>
>> >> Prathamesh Kulkarni  writes:
>> >> > On Mon, 12 Sept 2022 at 19:57, Richard Sandiford
>> >> >  wrote:
>> >> >>
>> >> >> Prathamesh Kulkarni  writes:
>> >> >> >> The VLA encoding encodes the first N patterns explicitly.  The
>> >> >> >> npatterns/nelts_per_pattern values then describe how to extend that
>> >> >> >> initial sequence to an arbitrary number of elements.  So when 
>> >> >> >> performing
>> >> >> >> an operation on (potentially) variable-length vectors, the 
>> >> >> >> question is:
>> >> >> >>
>> >> >> >> * Can we work out an initial sequence and 
>> >> >> >> npatterns/nelts_per_pattern
>> >> >> >>   pair that will be correct for all elements of the result?
>> >> >> >>
>> >> >> >> This depends on the operation that we're performing.  E.g. it's
>> >> >> >> different for unary operations (vector_builder::new_unary_operation)
>> >> >> >> and binary operations (vector_builder::new_binary_operation).  It 
>> >> >> >> also
>> >> >> >> varies between unary operations and between binary operations, hence
>> >> >> >> the allow_stepped_p parameters.
>> >> >> >>
>> >> >> >> For VEC_PERM_EXPR, I think the key requirement is that:
>> >> >> >>
>> >> >> >> (R) Each individual selector pattern must always select from the 
>> >> >> >> same vector.
>> >> >> >>
>> >> >> >> Whether this condition is met depends both on the pattern itself 
>> >> >> >> and on
>> >> >> >> the number of patterns that it's combined with.
>> >> >> >>
>> >> >> >> E.g. suppose we had the selector pattern:
>> >> >> >>
>> >> >> >>   { 0, 1, 4, ... }   i.e. 3x - 2 for x > 0
>> >> >> >>
>> >> >> >> If the arguments and selector are n elements then this pattern on 
>> >> >> >> its
>> >> >> >> own would select from more than one argument if 3(n-1) - 2 >= n.
>> >> >> >> This is clearly true for large enough n.  So if n is variable then
>> >> >> >> we cannot represent this.
>> >> >> >>
>> >> >> >> If the pattern above is one of two patterns, so interleaved as:
>> >> >> >>
>> >> >> >>  { 0, _, 1, _, 4, _, ... }  o=0
>> >> >> >>   or { _, 0, _, 1, _, 4, ... }  o=1
>> >> >> >>
>> >> >> >> then the pattern would select from more than one argument if
>> >> >> >> 3(n/2-1) - 2 + o >= n.  This too would be a problem for variable n.
>> >> >> >>
>> >> >> >> But if the pattern above is one of four patterns then it selects
>> >> >> >> from more than one argument if 3(n/4-1) - 2 + o >= n.  This is not
>> >> >> >> true for any valid n or o, so the pattern is OK.
>> >> >> >>
>> >> >> >> So let's define some ad hoc terminology:
>> >> >> >>
>> >> >> >> * Px is the number of patterns in x
>> >> >> >> * Ex is the number of elements per pattern in x
>> >> >> >>
>> >> >> >> where x can be:
>> >> >> >>
>> >> >> >> * 1: first argument
>> >> >> >> * 2: second argument
>> >> >> >> * s: selector
>> >> >> >> * r: result
>> >> >> >>
>> >> >> >> Then:
>> >> >> >>
>> >> >> >> (1) The number of elements encoded explicitly for x is Ex*Px
>> >> >> >>
>> >> >> >> (2) The explicit encoding can be used to produce a sequence of 
>> >> >> >> N*Ex*Px
>> >> >> >> elements for any integer N.  This extended sequence can be 
>> >> >> >> reencoded
>> >> >> >> as having N*Px patterns, with Ex staying the same.
>> >> >> >>
>> >> >> >> (3) If Ex < 3, Ex can be increased by 1 by repeating the final Px 
>> >> >> >> elements
>> >> >> >> of the explicit encoding.
>> >> >> >>
>> >> >> >> So let's assume (optimistically) that we can produce the result
>> >> >> >> by calculating the first Pr*Er elements and using the Pr,Er encoding
>> >> >> >> to imply the rest.  Then:
>> >> >> >>
>> >> >> >> * (2) means that, when combining multiple input operands with 
>> >> >> >> potentially
>> >> >> >>   different encodings, we can set the number of patterns in the 
>> >> >> >> result
>> >> >> >>   to the least common multiple of the number of patterns in the 
>> >> >> >> inputs.
>> >> >> >>   In this case:
>> >> >> >>
>> >> >> >>   Pr = least_common_multiple(P1, P2, Ps)
>> >> >> >>
>> >> >> >>   is a valid number of patterns.
>> >> >> >>
>> >> >> >> * (3) means that the number of elements per pattern of the result 
>> >> >> >> can
>> >> >> >>   be the maximum of the number of elements per pattern in the 
>> >> >> >> inputs.
>> >> >> >>   (Alternatively, we could always use 3.)  In this case:
>> >> >> >>
>> >> >> >>   Er = max(E1, E2, Es)
>> >> >> >>
>> >> >> >>   is a valid number of elements per pattern.
>> >> >> >>
>> >> >> >> So if (R) holds we can compute the result -- for both VLA and VLS 
>> >> >> >> -- by
>> >> >> >> calculating the first Pr*Er elements of the result and using the
>> >> >> >> encoding to derive the rest.  If (R) doesn't hold then we need the
>> >> >> >> selector to be constant-length.  We should then fill in the result
>> >> >> >> based on:
>> >> >> >>
>> >> >> >> - Pr 
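
The two rules quoted above for choosing the result encoding, Pr = least_common_multiple(P1, P2, Ps) and Er = max(E1, E2, Es), can be sketched as below. ToyShape and the toy_* helpers are hypothetical names for illustration, not part of GCC.

```cpp
#include <algorithm>

// Number of patterns and elements per pattern of one encoded vector.
struct ToyShape {
  unsigned npatterns;
  unsigned nelts_per_pattern;
};

unsigned toy_gcd(unsigned a, unsigned b) {
  while (b != 0) {
    unsigned t = a % b;
    a = b;
    b = t;
  }
  return a;
}

unsigned toy_lcm(unsigned a, unsigned b) {
  return a / toy_gcd(a, b) * b;
}

// Candidate encoding for the result of permuting ARG1 and ARG2 by SEL,
// assuming condition (R) holds for every selector pattern.
ToyShape toy_result_shape(ToyShape arg1, ToyShape arg2, ToyShape sel) {
  ToyShape r;
  // Rule (2): any multiple of an input's pattern count is still valid.
  r.npatterns = toy_lcm(toy_lcm(arg1.npatterns, arg2.npatterns),
                        sel.npatterns);
  // Rule (3): elements per pattern can always be raised up to 3.
  r.nelts_per_pattern = std::max({arg1.nelts_per_pattern,
                                  arg2.nelts_per_pattern,
                                  sel.nelts_per_pattern});
  return r;
}
```

For inputs with shapes (2,1), (4,2) and (1,3) this gives Pr = 4 and Er = 3, so 12 result elements would have to be computed explicitly.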

Re: [PATCH] c++: Implement P1467R9 - Extended floating-point types and standard names compiler part except for bfloat16 [PR106652]

2022-09-26 Thread Jason Merrill via Gcc-patches

On 9/19/22 12:39, Jakub Jelinek wrote:

On Sat, Sep 17, 2022 at 10:58:54AM +0200, Jason Merrill wrote:

I thought it is fairly important because __float128 has been around in GCC
for 19 years already.  To be precise, I think e.g. for x86_64 GCC 3.4
introduced it, but mangling was implemented only in GCC 4.1 (2006), before we
ICEd on those.  Until glibc 2.26 (2017) one had to use libquadmath when
math library functions were needed, but since then one can just use libm.
__float128 is on some targets (e.g. PA) just another name for long double,
not a distinct type.


I think we certainly want to continue to support __float128, what I'm
wondering is how much changing it to mean _Float128 will affect existing
code.  I would guess that a lot of code that just works on __float128 will
continue to work without modification.  Does anyone know of significant
existing uses of __float128?


I know boost uses it in its cstdfloat.hpp and stuff that uses it.


Another thing is the PowerPC __ieee128 and __ibm128 types; I think for the
former we can't make it the same type as _Float128, because e.g. libstdc++
code relies on __ieee128 and __ibm128 being long double type of the other
ABI, so they should mangle as long double of the other ABI.  But in that
case they can't act as distinct types when long double should mangle the
same as they do.  And it would be weird if those types in one
-mabi=*longdouble mode worked as standard floating-point type and in another
as extended floating-point type, rather than just types which are neither
standard nor extended as before.


Absolutely we don't want to mess with __ieee128 and __ibm128.  And I guess
that means that we need to preserve the non-standard type handling for the
alternate long double.

I think we can still change __float128 to be _Float128 on PPC and other
targets where it's currently an alias for long double.

It seems to me that it's a question of what provides the better transition
path for users.  I imagine we'll want to encourage people to replace
__float128 with std::float128_t everywhere.

In the existing model, it's not portable whether

void f(long double) { }
void f(__float128) { }

is an overload or an erroneous redefinition.  In the new model, you can
portably write

void f(long double) { }
void f(std::float128_t) { }

and existing __float128 code will call the second one.  Old code that had
conditional __float128 overloads when it's different from long double will
need to change to have unconditional _Float128 overloads.

If we don't change __float128 to mean _Float128, we require fewer immediate
changes for a library that does try to support all floating-point types, but
it will need changes to support _Float128 and will need to keep around
conditional __float128 overloads indefinitely.


I agree that we should judge on what makes the forward path for users
better.
I just think we serve users better if we keep __float128 as is, it will be
then better consistent with other weird types like __float80 (on x86/ia64,
same representation/mode as long double, but distinct with separate
mangling), __fpreg on ia64 etc.  It is true that whether __float128 is
distinct or same type as other standard or non-standard types right now
differs from target to target (on x86 obviously it is distinct from all
other currently supported types because we didn't have other IEEE quad
type but __float80 was distinct even if we had one, ia64 has distinct
__float128 unless it is HP-UX where it is same type as long double,
on PA it is same as long double if __float128 exists at all, on powerpc
it is a define to __ieee128 right now where __ieee128 is same type as long
double in the -mabi=ieeelongdouble, and distinct type otherwise (but only
with -mvsx, otherwise it isn't supported (on by default for ppc64le but not
the others))).  So, code that wants to overload with __float128 or use
__float128 in template arguments needs to do some ifdefs to find out
what it should do.  But, if we make __float128 the same as _Float128,
I think it will mean people actually need to make the conditions even more
complex, because what __float128 will be and how it will behave will depend
on the compiler version.
I believe libraries with floating point stuff will want to add
std::{float{16,32,64,128},bfloat16}_t overloads eventually, not just
std::float128_t overloads, and it will be better if it uses the standard
f{16,32,64,128} literal suffixes, builtins, library functions etc. rather
than Q suffixes, *q builtins, libquadmath functions etc.
Say on x86 it would be inconsistent if __float80 remained a non-standard
type while __float128 became an extended type.  And when __float128 and
_Float128 are distinct, we can keep the clean DF128_ mangling for the
latter.


Fair enough.  And the other responders on the ABI PR seem to agree with you.


Anyway, here is an updated patch that adds also _Float{32,64,128}x support
with DF{32,64,128}x mangling and demangling and the conv->bad_p + pedwarn
change.  __float12

Re: [EXTERNAL] Re: [PING][PATCH] Add instruction level discriminator support.

2022-09-26 Thread Jason Merrill via Gcc-patches

On 9/8/22 20:45, Eugene Rozenfeld wrote:

Jason,

Thanks for your suggestion. The patch is updated (attached).



@@ -467,12 +471,19 @@ lto_location_cache::apply_location_cache ()
current_loc = set_block (current_loc, loc.block);
  else
current_loc = LOCATION_LOCUS (current_loc);
+ if (loc.discr)
+   current_loc = location_with_discriminator (current_loc, loc.discr);
+   }
+  else if (current_discr != loc.discr)
+   {
+   current_loc = location_with_discriminator (current_loc, loc.discr);
}


We usually don't use { } around a single line.


@@ -1180,6 +1206,7 @@ assign_discriminators (void)
   location_t locus = last ? gimple_location (last) : UNKNOWN_LOCATION;
 
   if (locus == UNKNOWN_LOCATION)

+
continue;


Stray newline.

OK with those tweaks.

Jason



Ping^3: [PATCH] libcpp: Handle extended characters in user-defined literal suffix [PR103902]

2022-09-26 Thread Lewis Hyatt via Gcc-patches
On Wed, Jun 15, 2022 at 03:06:16PM -0400, Lewis Hyatt wrote:
> On Tue, Jun 14, 2022 at 05:26:49PM -0400, Lewis Hyatt wrote:
> > Hello-
> > 
> > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103902
> > 
> > The attached patch resolves PR preprocessor/103902 as described in the patch
> > message inline below. bootstrap + regtest all languages was successful on
> > x86-64 Linux, with no new failures:
> > 
> > FAIL 103 103
> > PASS 542338 542371
> > UNSUPPORTED 15247 15250
> > UNTESTED 136 136
> > XFAIL 4166 4166
> > XPASS 17 17
> > 
> > Please let me know if it looks OK?
> > 
> > A few questions I have:
> > 
> > - A difference introduced with this patch is that after lexing something
> > like `operator ""_abc', then `_abc' is added to the identifier hash map,
> > whereas previously it was not. I feel like this must be OK because with the
> > optional space as in `operator "" _abc', it would be added with or without
> > the patch.
> > 
> > - The behavior of `#pragma GCC poison' is not consistent (including prior to
> >   my patch). I tried to make it more so but there is still one thing I want
> >   to ask about. Leaving aside extended characters for now, the inconsistency
> >   is that currently the poison is only checked when the suffix appears as a
> >   standalone token.
> > 
> >   #pragma GCC poison _X
> >   bool operator ""_X (unsigned long long);   //accepted before the patch,
> >                                              //rejected after it
> >   bool operator "" _X (unsigned long long);  //rejected either before or after
> >   const char * operator ""_X (const char *, unsigned long); //accepted before,
> >                                                             //rejected after
> >   const char * operator "" _X (const char *, unsigned long); //rejected either
> > 
> >   const char * s = ""_X; //accepted before the patch, rejected after it
> >   const bool b = 1_X; //accepted before or after
> > 
> > I feel like after the patch, the behavior is the expected behavior for all
> > cases but the last one. Here, we allow the poisoned identifier because it's
> > not lexed as an identifier, it's lexed as part of a pp-number. Does it seem
> > OK like this or does it need to be addressed?
> 
> Sorry, that version actually did not handle the case of -Wc++11-compat in
> c++98 mode correctly. This updated version fixes that and adds the missing
> test coverage for that, if you could please review this one instead?
> 
> By the way, the pipermail archive seems to permanently mangle UTF-8 in inline
> attachments. I attached the patch also gzipped to address that for the
> archive, since the new testcases do use non-ASCII characters.
> 
> Thanks for taking a look!

Hello-

May I please ping this patch again? Joseph suggested that it would be best if
a C++ maintainer has a look at it. This is one of just a few places left where
we don't handle UTF-8 properly in libcpp; it would be really nice to get them
fixed up if there is time to review this patch. Thanks!

https://gcc.gnu.org/pipermail/gcc-patches/2022-June/596704.html

I re-attached it here as it required some trivial rebasing on top of recently
pushed changes. As before, I also attached the gzipped version so that the
UTF-8 testcases show up OK in the online archive, in case that's still an
issue. Thanks for taking a look!

-Lewis
[PATCH] libcpp: Handle extended characters in user-defined literal suffix [PR103902]

The PR complains that we do not handle UTF-8 in the suffix for a user-defined
literal, such as:

bool operator ""_π (unsigned long long);

In fact we don't handle any extended identifier characters there, whether
UTF-8, UCNs, or the $ sign. We do handle it fine if the optional space after
the "" tokens is included, since then the identifier is lexed in the "normal"
way as its own token. But when it is lexed as part of the string token, this
is handled in lex_string() with a one-off loop that is not aware of extended
characters.
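
To make the two spellings concrete, here is a minimal sketch (not taken from
the patch or its testcases). The suffix names _twice and _len are made up for
illustration and are plain ASCII, so the example does not depend on the fix;
per the description above, it is an extended character such as π in the first
form that the PR is about.

```cpp
#include <cstddef>

// Form 1: no space after "".  The suffix is lexed as part of the string
// token, which is the code path the PR complains about.
constexpr unsigned long long operator ""_twice (unsigned long long v)
{
  return 2 * v;
}

// Form 2: optional space after "".  The suffix is lexed as an ordinary
// identifier token of its own.
constexpr std::size_t operator "" _len (const char *, std::size_t n)
{
  return n;
}

// Both forms declare usable literal operators.
static_assert (21_twice == 42, "integer literal with user-defined suffix");
static_assert ("abc"_len == 3, "string literal with user-defined suffix");
```

With ASCII suffixes both declarations are expected to behave the same; only
the way the lexer reaches the suffix differs.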

This patch fixes it by adding a new function scan_cur_identifier() that can be
used to lex an identifier while in the middle of lexing another token. It is
somewhat duplicative of the code in lex_identifier(), which handles the normal
case, but I think there's no good way to avoid that without pessimizing the
usual case, since lex_identifier() takes advantage of the fact that the first
character of the identifier has already been analyzed. The code duplication is
somewhat offset by factoring out the identifier lexing diagnostics (e.g. for
poisoned identifiers), which were formerly duplicated in two places, and have
been factored into their own function that's used in (now) 3 places.

BTW, the other place that was lexing identifiers is lex_identifier_intern(),
which is used to implement #pragma push_macro and #pragma pop_macro. This does
not support extended characters either. I will add that in a subsequent patch,
because it can't directly reuse the new function, but rather needs to le

[committed] libstdc++: Update std::pointer_traits to match new LWG 3545 wording

2022-09-26 Thread Jonathan Wakely via Gcc-patches
Tested powerpc64le-linux. Pushed to trunk.

-- >8 --

It was pointed out in recent LWG 3545 discussion that having a
constrained partial specialization of std::pointer_traits can cause
ambiguities with program-defined specializations. For example, the
addition to the testcase has:

template<typename P> requires std::derived_from<P, base_type>
struct std::pointer_traits<P>;

This would be ambiguous with the library's own constrained partial
specialization:

template<typename Ptr> requires requires { typename Ptr::element_type; }
struct std::pointer_traits<Ptr>;

Neither specialization is more specialized than the other for a type
that is derived from base_type and also has an element_type member.

The solution is to remove the library's partial specialization, and do
the check for Ptr::element_type in the __ptr_traits_elem helper (which
is what we already do for !__cpp_concepts anyway).
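
As a rough illustration of doing the element_type check in a helper rather
than in a constrained specialization of the traits class itself, here is a
self-contained sketch in the void_t style. It is not the libstdc++ code: the
namespace sketch and the names first_arg, ptr_elem, WithMember and Fancy are
all hypothetical.

```cpp
#include <type_traits>

namespace sketch
{
  // Portable void_t (written out so the example also builds as C++11).
  template<typename...> struct make_void { using type = void; };
  template<typename... Ts> using void_t = typename make_void<Ts...>::type;

  // Fallback: the first template argument of Tmpl<T, Args...>.
  template<typename T> struct first_arg { };
  template<template<typename, typename...> class Tmpl,
	   typename T, typename... Args>
    struct first_arg<Tmpl<T, Args...>> { using type = T; };

  // Primary helper: fall back to the first template argument.
  template<typename Ptr, typename = void>
    struct ptr_elem : first_arg<Ptr> { };

  // Preferred whenever Ptr::element_type is a valid type.
  template<typename Ptr>
    struct ptr_elem<Ptr, void_t<typename Ptr::element_type>>
    { using type = typename Ptr::element_type; };

  template<typename Ptr>
    using ptr_elem_t = typename ptr_elem<Ptr>::type;
}

struct WithMember { using element_type = long; };
template<typename T> struct Fancy { };

static_assert (std::is_same<sketch::ptr_elem_t<WithMember>, long>::value,
	       "element_type member is used when present");
static_assert (std::is_same<sketch::ptr_elem_t<Fancy<char>>, char>::value,
	       "otherwise the first template argument is used");
```

Because the detection lives in the helper, the traits class needs no
constrained partial specialization of its own, which is the property the
message above relies on.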

libstdc++-v3/ChangeLog:

* include/bits/ptr_traits.h (__ptr_traits_elem) [__cpp_concepts]:
Also define the __ptr_traits_elem class template for the
concepts case.
(pointer_traits): Remove constrained partial
specialization.
* testsuite/20_util/pointer_traits/lwg3545.cc: Check for
ambiguity with program-defined partial specialization.
---
 libstdc++-v3/include/bits/ptr_traits.h| 20 ++-
 .../20_util/pointer_traits/lwg3545.cc | 17 
 2 files changed, 23 insertions(+), 14 deletions(-)

diff --git a/libstdc++-v3/include/bits/ptr_traits.h 
b/libstdc++-v3/include/bits/ptr_traits.h
index ae8810706ab..71370ff4fc9 100644
--- a/libstdc++-v3/include/bits/ptr_traits.h
+++ b/libstdc++-v3/include/bits/ptr_traits.h
@@ -73,25 +73,24 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 struct __replace_first_arg<_SomeTemplate<_Tp, _Types...>, _Up>
 { using type = _SomeTemplate<_Up, _Types...>; };
 
-#if __cpp_concepts
-  // When concepts are supported detection of _Ptr::element_type is done
-  // by a requires-clause, so __ptr_traits_elem_t only needs to do this:
-  template
-using __ptr_traits_elem_t = typename __get_first_arg<_Ptr>::type;
-#else
   // Detect the element type of a pointer-like type.
   template
 struct __ptr_traits_elem : __get_first_arg<_Ptr>
 { };
 
   // Use _Ptr::element_type if is a valid type.
+#if __cpp_concepts
+  template requires requires { typename _Ptr::element_type; }
+struct __ptr_traits_elem<_Ptr, void>
+{ using type = typename _Ptr::element_type; };
+#else
   template
 struct __ptr_traits_elem<_Ptr, __void_t>
 { using type = typename _Ptr::element_type; };
+#endif
 
   template
 using __ptr_traits_elem_t = typename __ptr_traits_elem<_Ptr>::type;
-#endif
 
   /// @endcond
 
@@ -182,13 +181,6 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 struct pointer_traits : __ptr_traits_impl<_Ptr, __ptr_traits_elem_t<_Ptr>>
 { };
 
-#if __cpp_concepts
-  template requires requires { typename _Ptr::element_type; }
-struct pointer_traits<_Ptr>
-: __ptr_traits_impl<_Ptr, typename _Ptr::element_type>
-{ };
-#endif
-
   /**
* @brief  Partial specialization for built-in pointers.
* @headerfile memory
diff --git a/libstdc++-v3/testsuite/20_util/pointer_traits/lwg3545.cc 
b/libstdc++-v3/testsuite/20_util/pointer_traits/lwg3545.cc
index 08c3ed01b75..93c64a353bd 100644
--- a/libstdc++-v3/testsuite/20_util/pointer_traits/lwg3545.cc
+++ b/libstdc++-v3/testsuite/20_util/pointer_traits/lwg3545.cc
@@ -99,3 +99,20 @@ static_assert( is_same, 
clever_ptr>::value, "" );
 static_assert( is_same, std::ptrdiff_t>::value, "" );
 static_assert( is_same, clever_ptr>::value, "" );
 static_assert( is_same, clever_ptr>::value, "" );
+
+#ifdef __cpp_concepts
+struct ptr_base { };
+
+// Program-defined specialization must not be ambiguous with primary template.
+template requires std::derived_from
+struct std::pointer_traits
+{
+  using element_type = int;
+  using difference_type = long;
+  using pointer = P;
+};
+
+struct Ptr : ptr_base { using element_type = int; };
+
+using E = std::pointer_traits::element_type;
+#endif
-- 
2.37.3



[committed] libstdc++: Use new built-ins for std::is_convertible traits

2022-09-26 Thread Jonathan Wakely via Gcc-patches
Tested powerpc64le-linux. Pushed to trunk.

-- >8 --

libstdc++-v3/ChangeLog:

* include/std/type_traits (is_convertible, is_convertible_v):
Define using new built-in.
(is_nothrow_convertible, is_nothrow_convertible_v): Likewise.
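
For readers outside the library, the general shape of such a change can be
sketched as below. This is only an illustration under assumptions: the trait
name sketch_is_convertible is hypothetical, and the fallback simply defers to
std::is_convertible instead of reproducing the library's helper classes.

```cpp
#include <type_traits>

// Older compilers may lack __has_builtin; treat every query as "no" there.
#ifndef __has_builtin
# define __has_builtin(x) 0
#endif

template<typename From, typename To>
  struct sketch_is_convertible
#if __has_builtin(__is_convertible)
  // Ask the compiler directly when the built-in is available.
  : std::integral_constant<bool, __is_convertible(From, To)>
#else
  // Otherwise defer to the existing library implementation.
  : std::is_convertible<From, To>
#endif
  { };

static_assert (sketch_is_convertible<int, long>::value,
	       "int converts implicitly to long");
static_assert (!sketch_is_convertible<void *, int>::value,
	       "void* does not convert implicitly to int");
```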
---
 libstdc++-v3/include/std/type_traits | 27 ---
 1 file changed, 24 insertions(+), 3 deletions(-)

diff --git a/libstdc++-v3/include/std/type_traits 
b/libstdc++-v3/include/std/type_traits
index c5853fcad90..1ac805152d4 100644
--- a/libstdc++-v3/include/std/type_traits
+++ b/libstdc++-v3/include/std/type_traits
@@ -1381,6 +1381,12 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 : public integral_constant
 { };
 
+#if __has_builtin(__is_convertible)
+  template
+struct is_convertible
+: public __bool_constant<__is_convertible(_From, _To)>
+{ };
+#else
   template, is_function<_To>,
 is_array<_To>>::value>
@@ -1416,12 +1422,28 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 struct is_convertible
 : public __is_convertible_helper<_From, _To>::type
 { };
+#endif
 
   // helper trait for unique_ptr, shared_ptr, and span
   template
 using __is_array_convertible
   = is_convertible<_FromElementType(*)[], _ToElementType(*)[]>;
 
+#if __cplusplus >= 202002L
+#define __cpp_lib_is_nothrow_convertible 201806L
+
+#if __has_builtin(__is_nothrow_convertible)
+  /// is_nothrow_convertible_v
+  template
+inline constexpr bool is_nothrow_convertible_v
+  = __is_nothrow_convertible(_From, _To);
+
+  /// is_nothrow_convertible
+  template
+struct is_nothrow_convertible
+: public bool_constant>
+{ };
+#else
   template, is_function<_To>,
 is_array<_To>>::value>
@@ -1451,8 +1473,6 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 };
 #pragma GCC diagnostic pop
 
-#if __cplusplus > 201703L
-#define __cpp_lib_is_nothrow_convertible 201806L
   /// is_nothrow_convertible
   template
 struct is_nothrow_convertible
@@ -1463,6 +1483,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   template
 inline constexpr bool is_nothrow_convertible_v
   = is_nothrow_convertible<_From, _To>::value;
+#endif
 #endif // C++2a
 
   // Const-volatile modifications.
@@ -3265,7 +3286,7 @@ template 
 template 
   inline constexpr bool is_base_of_v = __is_base_of(_Base, _Derived);
 template 
-  inline constexpr bool is_convertible_v = is_convertible<_From, _To>::value;
+  inline constexpr bool is_convertible_v = __is_convertible(_From, _To);
 template
   inline constexpr bool is_invocable_v = is_invocable<_Fn, _Args...>::value;
 template
-- 
2.37.3



Re: [PATCH] c++: Implement P1467R9 - Extended floating-point types and standard names compiler part except for bfloat16 [PR106652]

2022-09-26 Thread Jakub Jelinek via Gcc-patches
On Mon, Sep 26, 2022 at 05:15:12PM -0400, Jason Merrill wrote:
> > Anyway, here is an updated patch that adds also _Float{32,64,128}x support
> > with DF{32,64,128}x mangling and demangling and the conv->bad_p + pedwarn
> > change.  __float128 is still distinct from _Float128.
> 
> Looks good with the minor adjustments below.

Thanks for the review.  Below is the updated patch that I'll now retest
on x86_64-linux, i686-linux and powerpc64{,le}-linux before committing.

> Will you also add the  header and the changes to ?

I have a partial patch that I'll perhaps make some progress on and will
incrementally regtest it, but there are parts which I'd prefer
to defer to Jonathan.

> Please add a comment to explain what you're doing with signed/unsigned
> arithmetic here.
> 
> > +   if (cp_compare_floating_point_conversion_ranks (fp1, fp2) + 1U
> > +   <= 2U)

While I've added a comment, it occurred to me that these would be more
readable using the IN_RANGE macro (above is IN_RANGE (cp_compare..., -1, 1))
which under the hood does the same thing.

> > + {
> > +   /* Conversion ranks of FP1 and FP2 are equal.  */
> > +   if (TREE_CODE (t3) != REAL_TYPE
> > +   || (cp_compare_floating_point_conversion_ranks (fp1, t3)
> > +   + 1U > 2U))

and this is !IN_RANGE (cp_compare..., -1, 1),
so I've changed that in the patch too.
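
The equivalence between the unsigned arithmetic and a plain range check can
be spelled out in a small standalone sketch. The function names here are
hypothetical and this is not the definition of GCC's IN_RANGE macro, just the
same idea for the fixed range [-1, 1].

```cpp
// The unsigned trick from the quoted hunk: adding 1U converts x to
// unsigned, so -1, 0 and 1 map to 0, 1 and 2, and every other value
// ends up above 2.
constexpr bool within_one_unsigned (int x)
{
  return x + 1U <= 2U;
}

// The same test written as an explicit range check.
constexpr bool within_one_plain (int x)
{
  return x >= -1 && x <= 1;
}

static_assert (within_one_unsigned (-1) && within_one_unsigned (0)
	       && within_one_unsigned (1), "values inside the range");
static_assert (!within_one_unsigned (-2) && !within_one_unsigned (2)
	       && !within_one_unsigned (3), "values outside the range");
static_assert (within_one_unsigned (-2) == within_one_plain (-2)
	       && within_one_unsigned (1) == within_one_plain (1),
	       "both spellings agree");
```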

2022-09-27  Jakub Jelinek  

PR c++/106652
PR c++/85518
gcc/
* tree-core.h (enum tree_index): Add TI_FLOAT128T_TYPE
enumerator.
* tree.h (float128t_type_node): Define.
* tree.cc (build_common_tree_nodes): Initialize float128t_type_node.
* builtins.def (DEF_FLOATN_BUILTIN): Adjust comment now that
_Float<N> is supported in C++ too.
* config/i386/i386.cc (ix86_mangle_type): Only mangle as "g"
float128t_type_node.
* config/i386/i386-builtins.cc (ix86_init_builtin_types): Use
float128t_type_node for __float128 instead of float128_type_node
and create it if NULL.
* config/i386/avx512fp16intrin.h (_mm_setzero_ph, _mm256_setzero_ph,
_mm512_setzero_ph, _mm_set_sh, _mm_load_sh): Use 0.0f16 instead of
0.0f.
* config/ia64/ia64.cc (ia64_init_builtins): Use
float128t_type_node for __float128 instead of float128_type_node
and create it if NULL.
* config/rs6000/rs6000-c.cc (is_float128_p): Also return true
for float128t_type_node if non-NULL.
* config/rs6000/rs6000.cc (rs6000_mangle_type): Don't mangle
float128_type_node as "u9__ieee128".
* config/rs6000/rs6000-builtin.cc (rs6000_init_builtins): Use
float128t_type_node for __float128 instead of float128_type_node
and create it if NULL.
gcc/c-family/
* c-common.cc (c_common_reswords): Change _Float{16,32,64,128} and
_Float{32,64,128}x flags from D_CONLY to 0.
(shorten_binary_op): Punt if common_type returns error_mark_node.
(shorten_compare): Likewise.
(c_common_nodes_and_builtins): For C++ record _Float{16,32,64,128}
and _Float{32,64,128}x builtin types if available.  For C++
clear float128t_type_node.
* c-cppbuiltin.cc (c_cpp_builtins): Predefine
__STDCPP_FLOAT{16,32,64,128}_T__ for C++23 if supported.
* c-lex.cc (interpret_float): For q/Q suffixes prefer
float128t_type_node over float128_type_node.  Allow
{f,F}{16,32,64,128} suffixes for C++ if supported with pedwarn
for C++20 and older.  Allow {f,F}{32,64,128}x suffixes for C++
with pedwarn.  Don't call excess_precision_type for C++.
gcc/cp/
* cp-tree.h (cp_compare_floating_point_conversion_ranks): Implement
P1467R9 - Extended floating-point types and standard names except
for std::bfloat16_t for now.  Declare.
(extended_float_type_p): New inline function.
* mangle.cc (write_builtin_type): Mangle float{16,32,64,128}_type_node
as DF{16,32,64,128}_.  Mangle float{32,64,128}x_type_node as
DF{32,64,128}x.  Remove FIXED_POINT_TYPE mangling that conflicts
with that.
* typeck2.cc (check_narrowing): If one of ftype or type is extended
floating-point type, compare floating-point conversion ranks.
* parser.cc (cp_keyword_starts_decl_specifier_p): Handle
CASE_RID_FLOATN_NX.
(cp_parser_simple_type_specifier): Likewise and diagnose missing
_Float<N> or _Float<N>x support if not supported by target.
* typeck.cc (cp_compare_floating_point_conversion_ranks): New function.
(cp_common_type): If both types are REAL_TYPE and one or both are
extended floating-point types, select common type based on comparison
of floating-point conversion ranks and subranks.
(cp_build_binary_op): Diagnose operation with floating point arguments
with unordered conversion ranks.
* call.cc (standard_conversion): For fl

Re: [PATCH] Ignore debug insns with CONCAT and CONCATN for insn scheduling

2022-09-26 Thread Jeff Law via Gcc-patches



On 9/26/22 13:52, H.J. Lu wrote:
> On Sat, Sep 24, 2022 at 1:37 PM Jeff Law  wrote:
> >
> > On 9/21/22 16:11, H.J. Lu wrote:
> > > On Wed, Sep 7, 2022 at 10:03 AM Jeff Law via Gcc-patches
> > >  wrote:
> > > >
> > > > On 9/2/2022 8:36 AM, H.J. Lu via Gcc-patches wrote:
> > > > > CONCAT and CONCATN never appear in the insn chain.  They are only used
> > > > > in debug insn.  Ignore debug insns with CONCAT and CONCATN for insn
> > > > > scheduling to avoid different insn orders with and without debug insn.
> > > > >
> > > > > gcc/
> > > > >
> > > > > PR rtl-optimization/106746
> > > > > * sched-deps.cc (sched_analyze_2): Ignore debug insns with CONCAT
> > > > > and CONCATN.
> > > > Shouldn't we be ignoring everything in a debug insn?   I don't see why
> > > > CONCAT/CONCATN are special here.
> > > Debug insns are processed by insn scheduling.   I think it is to improve debug
> > > experiences.  It is just that there are no matching usages of CONCAT/CONCATN
> > > in non-debug insns.
> > But from a dependency standpoint ISTM all debug insn can be ignored.  I
> > still don't see why concat/concatn should be special here.
> >
> I tried to ignore everything in a debug insn.  It caused many regressions in
> the GCC testsuite.

Not terribly useful -- what failed and why?

jeff


Re: [COMMITTED] Optimize [0 = x & MASK] in range-ops.

2022-09-26 Thread Jeff Law via Gcc-patches



On 9/26/22 11:24, Aldy Hernandez via Gcc-patches wrote:
> For [0 = x & MASK], we can determine that x is ~MASK.  This is
> something we're picking up in DOM thanks to maybe_set_nonzero_bits,
> but is something we should handle natively.
>
> This is a good example of how much easier to maintain the range-ops
> entries are versus the ad-hoc pattern matching stuff we had to do
> before.  For the curious, compare the changes to range-op here,
> versus maybe_set_nonzero_bits.
>
> I'm leaving the call to maybe_set_nonzero_bits until I can properly
> audit it to make sure we're catching it all in range-ops.  It won't
> hurt, since both set_range_info() and set_nonzero_bits() are
> intersect operations, so we'll never lose information if we do both.
>
> Tested on x86-64 Linux.
>
> PR tree-optimization/107009
>
> gcc/ChangeLog:
>
> * range-op.cc (operator_bitwise_and::op1_range): Optimize 0 = x & MASK.
> (range_op_bitwise_and_tests): New test.

Umm,

0 = x & MASK;

Just means that X has no bits set in MASK.   So you can use it to set
nonzero-bits to ~MASK like your patch does and you can use that to
refine a result.  So it's really the comment that is misleading/wrong.
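
A tiny standalone sketch of that distinction, using plain 32-bit integers
rather than GCC's range classes (the function names are hypothetical):

```cpp
#include <cstdint>

// If (x & mask) == 0, the bits of x that may be nonzero are exactly the
// bits outside mask.
constexpr std::uint32_t nonzero_bits_for_zero_and (std::uint32_t mask)
{
  return ~mask;
}

// True if x has no bits set outside the given nonzero-bits mask.
constexpr bool fits_nonzero_bits (std::uint32_t x, std::uint32_t nz)
{
  return (x & ~nz) == 0;
}

// x = 0x01 satisfies 0 == x & 0xF0 ...
static_assert ((0x01u & 0xF0u) == 0, "premise holds");
// ... yet x is not equal to ~MASK ...
static_assert (0x01u != ~0xF0u, "x need not be ~MASK");
// ... it merely has all of its set bits inside ~MASK.
static_assert (fits_nonzero_bits (0x01u, nonzero_bits_for_zero_and (0xF0u)),
	       "x is bounded by the nonzero-bits mask");
```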



jeff



Re: [PATCH] Ignore debug insns with CONCAT and CONCATN for insn scheduling

2022-09-26 Thread Jakub Jelinek via Gcc-patches
On Mon, Sep 26, 2022 at 05:23:45PM -0600, Jeff Law via Gcc-patches wrote:
> 
> On 9/26/22 13:52, H.J. Lu wrote:
> > On Sat, Sep 24, 2022 at 1:37 PM Jeff Law  wrote:
> > > 
> > > On 9/21/22 16:11, H.J. Lu wrote:
> > > > On Wed, Sep 7, 2022 at 10:03 AM Jeff Law via Gcc-patches
> > > >  wrote:
> > > > > 
> > > > > On 9/2/2022 8:36 AM, H.J. Lu via Gcc-patches wrote:
> > > > > > CONCAT and CONCATN never appear in the insn chain.  They are only 
> > > > > > used
> > > > > > in debug insn.  Ignore debug insns with CONCAT and CONCATN for insn
> > > > > > scheduling to avoid different insn orders with and without debug 
> > > > > > insn.
> > > > > > 
> > > > > > gcc/
> > > > > > 
> > > > > > PR rtl-optimization/106746
> > > > > > * sched-deps.cc (sched_analyze_2): Ignore debug insns with 
> > > > > > CONCAT
> > > > > > and CONCATN.
> > > > > Shouldn't we be ignoring everything in a debug insn?   I don't see why
> > > > > CONCAT/CONCATN are special here.
> > > > Debug insns are processed by insn scheduling.   I think it is to 
> > > > improve debug
> > > > experiences.  It is just that there are no matching usages of 
> > > > CONCAT/CONCATN
> > > > in non-debug insns.
> > > But from a dependency standpoint ISTM all debug insn can be ignored.  I
> > > still don't see why concat/concatn should be special here.
> > > 
> > I tried to ignore everything in a debug insn.  It caused many regressions in
> > the GCC testsuite.
> Not terribly useful -- what failed and why?

I think the design for debug insns in the scheduler is that they do affect
scheduling decisions, but what is in debug insns should only affect actual
scheduling of the debug insns and not the rest.
So it wouldn't surprise me if ignoring everything in a debug insn broke a
lot.  But I admit I never fully understood how it works, hopefully Alex or
Vlad do.

Jakub



Re: [PATCH] Avoid depending on destructor order

2022-09-26 Thread Jason Merrill via Gcc-patches

On 9/23/22 10:12, Thomas Neumann wrote:


    +static const bool in_shutdown = false;

I'll let Jason or others decide if this is the right solution.  It 
seems that in_shutdown also could be declared outside the #ifdef and 
initialized as "false".


sure, either is fine. Moving it outside the #ifdef wastes one byte in 
the executable (while the compiler can eliminate the const), but it does 
not really matter.


Might as well go with your patch, then, adding a comment to explain why 
the variable is defined in two places.


I have verified that the patch below fixes builds for both fast-path and 
non-fast-path builds. But if you prefer I will move the in_shutdown 
definition instead.


Best

Thomas

PS: in_shutdown is an int here instead of a bool because non-fast-path 
builds do not include stdbool. Not a good reason, of course, but I 
wanted to keep the patch minimal and it makes no difference in practice.
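
For illustration only, the pattern being discussed can be sketched outside
libgcc as follows. The macro SKETCH_ATOMIC_FAST_PATH and both function names
are hypothetical; the point is merely that shared code can mention
in_shutdown in either configuration.

```cpp
#include <cassert>

#ifdef SKETCH_ATOMIC_FAST_PATH
// Fast path: a real flag, assumed to be set by shutdown code elsewhere.
static bool in_shutdown = false;
#else
// Non-fast-path: a constant, so shared code that mentions in_shutdown
// still compiles and the compiler can fold the test away.
static const int in_shutdown = 0;
#endif

// Shared check: a failed lookup (null ob) is tolerated only during shutdown.
inline bool deregistration_ok (const void *ob)
{
  return in_shutdown || ob != nullptr;
}

// Shared assertion built on the check above.
inline void check_deregistration (const void *ob)
{
  assert (deregistration_ok (ob));
}
```

Without the macro defined, in_shutdown is the constant 0, so a failed lookup
is never accepted, which matches the "must always succeed" comment in the
patch.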



     When using the atomic fast path deregistering can fail during
     program shutdown if the lookup structures are already destroyed.
     The assert in __deregister_frame_info_bases takes that into
     account. The non-fast-path case however is not aware of
     program shutdown, which caused a compiler error on such platforms.
     We fix that by introducing a constant for in_shutdown in
     non-fast-path builds.

     libgcc/ChangeLog:
     * unwind-dw2-fde.c: Introduce a constant for in_shutdown
     for the non-fast-path case.

diff --git a/libgcc/unwind-dw2-fde.c b/libgcc/unwind-dw2-fde.c
index d237179f4ea..0bcd5061d76 100644
--- a/libgcc/unwind-dw2-fde.c
+++ b/libgcc/unwind-dw2-fde.c
@@ -67,6 +67,8 @@ static void
  init_object (struct object *ob);

  #else
+/* Without fast path frame deregistration must always succeed.  */
+static const int in_shutdown = 0;

  /* The unseen_objects list contains objects that have been registered
     but not yet categorized in any way.  The seen_objects list has had





[PATCH v5 1/2] asan: specify alignment for LASANPC labels

2022-09-26 Thread Ilya Leoshkevich via Gcc-patches
gcc/ChangeLog:

2020-06-30  Ilya Leoshkevich  

* asan.cc (asan_emit_stack_protection): Use CODE_LABEL_BOUNDARY.
* defaults.h (CODE_LABEL_BOUNDARY): New macro.
* doc/tm.texi: Document CODE_LABEL_BOUNDARY.
* doc/tm.texi.in: Likewise.
---
 gcc/asan.cc| 1 +
 gcc/defaults.h | 5 +
 gcc/doc/tm.texi| 4 
 gcc/doc/tm.texi.in | 4 
 4 files changed, 14 insertions(+)

diff --git a/gcc/asan.cc b/gcc/asan.cc
index 8276f12cc69..62f50ee769b 100644
--- a/gcc/asan.cc
+++ b/gcc/asan.cc
@@ -1960,6 +1960,7 @@ asan_emit_stack_protection (rtx base, rtx pbase, unsigned 
int alignb,
   DECL_INITIAL (decl) = decl;
   TREE_ASM_WRITTEN (decl) = 1;
   TREE_ASM_WRITTEN (id) = 1;
+  SET_DECL_ALIGN (decl, CODE_LABEL_BOUNDARY);
   emit_move_insn (mem, expand_normal (build_fold_addr_expr (decl)));
   shadow_base = expand_binop (Pmode, lshr_optab, base,
  gen_int_shift_amount (Pmode, ASAN_SHADOW_SHIFT),
diff --git a/gcc/defaults.h b/gcc/defaults.h
index 953605c1627..52a471cf08e 100644
--- a/gcc/defaults.h
+++ b/gcc/defaults.h
@@ -1455,4 +1455,9 @@ see the files COPYING3 and COPYING.RUNTIME respectively.  
If not, see
 typedef TARGET_UNIT target_unit;
 #endif
 
+/* Alignment required for a code label, in bits.  */
+#ifndef CODE_LABEL_BOUNDARY
+#define CODE_LABEL_BOUNDARY BITS_PER_UNIT
+#endif
+
 #endif  /* ! GCC_DEFAULTS_H */
diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi
index 858bfb80cec..cc588ee23b5 100644
--- a/gcc/doc/tm.texi
+++ b/gcc/doc/tm.texi
@@ -1075,6 +1075,10 @@ to a value equal to or larger than @code{STACK_BOUNDARY}.
 Alignment required for a function entry point, in bits.
 @end defmac
 
+@defmac CODE_LABEL_BOUNDARY
+Alignment required for a code label, in bits.
+@end defmac
+
 @defmac BIGGEST_ALIGNMENT
 Biggest alignment that any data type can require on this machine, in
 bits.  Note that this is not the biggest alignment that is supported,
diff --git a/gcc/doc/tm.texi.in b/gcc/doc/tm.texi.in
index 21b849ea32a..a0b725b0685 100644
--- a/gcc/doc/tm.texi.in
+++ b/gcc/doc/tm.texi.in
@@ -971,6 +971,10 @@ to a value equal to or larger than @code{STACK_BOUNDARY}.
 Alignment required for a function entry point, in bits.
 @end defmac
 
+@defmac CODE_LABEL_BOUNDARY
+Alignment required for a code label, in bits.
+@end defmac
+
 @defmac BIGGEST_ALIGNMENT
 Biggest alignment that any data type can require on this machine, in
 bits.  Note that this is not the biggest alignment that is supported,
-- 
2.37.2



[PATCH v5 0/2] IBM zSystems: Improve storing asan frame_pc

2022-09-26 Thread Ilya Leoshkevich via Gcc-patches
Hi,

This is a resend of v4 with slightly adjusted commit messages:

v1: https://gcc.gnu.org/pipermail/gcc-patches/2019-July/525016.html
v2: https://gcc.gnu.org/pipermail/gcc-patches/2019-July/525069.html
v3: https://gcc.gnu.org/pipermail/gcc-patches/2020-June/548338.html
v4: https://gcc.gnu.org/pipermail/gcc-patches/2020-July/549252.html

It still survives the bootstrap and the regtest on x86_64-redhat-linux,
s390x-redhat-linux and ppc64le-redhat-linux.  It also fixes [1].

I also tried the approach with moving .LASANPC closer to the function
label and using FUNCTION_BOUNDARY instead of introducing
CODE_LABEL_BOUNDARY, but the problem there is that it's hard to catch
the moment where the function label is written.  Architectures can do
it by calling ASM_OUTPUT_LABEL() or assemble_name() in
ASM_DECLARE_FUNCTION_NAME(), ASM_OUTPUT_FUNCTION_LABEL() or
TARGET_ASM_FUNCTION_PROLOGUE().  epiphany_start_function() does that
twice, but passes the same decl to both calls.  Note that simply
moving asan_function_start() to final_start_function_1() is not enough,
since an architecture can write something after the function label.
This all means that for this approach to work, all the architectures
need to be adjusted, which looks like overkill to me.

Best regards,
Ilya

[1] https://gcc.gnu.org/pipermail/gcc-patches/2022-April/593666.html


Ilya Leoshkevich (2):
  asan: specify alignment for LASANPC labels
  IBM zSystems: Define CODE_LABEL_BOUNDARY

 gcc/asan.cc|  1 +
 gcc/config/s390/s390.h |  3 +++
 gcc/defaults.h |  5 +
 gcc/doc/tm.texi|  4 
 gcc/doc/tm.texi.in |  4 
 gcc/testsuite/gcc.target/s390/asan-no-gotoff.c | 15 +++
 6 files changed, 32 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/s390/asan-no-gotoff.c

-- 
2.37.2



[PATCH v5 2/2] IBM zSystems: Define CODE_LABEL_BOUNDARY

2022-09-26 Thread Ilya Leoshkevich via Gcc-patches
Currently s390 emits the following sequence to store a frame_pc:

a:
.LASANPC0:

lg  %r1,.L5-.L4(%r13)
la  %r1,0(%r1,%r12)
stg %r1,176(%r11)

.L5:
.quad   .LASANPC0@GOTOFF

The reason GOT indirection is used instead of larl is that gcc does not
know that .LASANPC0, being a code label, is aligned on a 2-byte
boundary, and larl can load only even addresses.

Define CODE_LABEL_BOUNDARY in order to get rid of GOT indirection:

larl%r1,.LASANPC0
stg %r1,176(%r11)
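
The alignment argument can be checked with a little arithmetic. The sketch
below is not GCC code: the constant and function names are hypothetical, and
it assumes 8-bit units, matching the default CODE_LABEL_BOUNDARY of
BITS_PER_UNIT from patch 1/2 and the value 16 defined for s390 here.

```cpp
#include <cstdint>

constexpr unsigned bits_per_unit = 8;  // assumed size of a byte in bits

// True if every address aligned to align_bits is even, i.e. usable as a
// larl operand per the description above.
constexpr bool guarantees_even_address (unsigned align_bits)
{
  return (align_bits / bits_per_unit) % 2 == 0;
}

// True if addr satisfies an alignment given in bits.
constexpr bool is_aligned (std::uintptr_t addr, unsigned align_bits)
{
  return addr % (align_bits / bits_per_unit) == 0;
}

static_assert (!guarantees_even_address (8),
	       "byte alignment still allows odd addresses");
static_assert (guarantees_even_address (16),
	       "2-byte alignment implies even addresses");
static_assert (is_aligned (0x1001, 8) && !is_aligned (0x1001, 16),
	       "an odd address is byte-aligned but not 2-byte-aligned");
```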

gcc/ChangeLog:

2020-06-30  Ilya Leoshkevich  

* config/s390/s390.h (CODE_LABEL_BOUNDARY): Specify that s390
requires code labels to be aligned on a 2-byte boundary.

gcc/testsuite/ChangeLog:

2019-06-30  Ilya Leoshkevich  

* gcc.target/s390/asan-no-gotoff.c: New test.
---
 gcc/config/s390/s390.h |  3 +++
 gcc/testsuite/gcc.target/s390/asan-no-gotoff.c | 15 +++
 2 files changed, 18 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/s390/asan-no-gotoff.c

diff --git a/gcc/config/s390/s390.h b/gcc/config/s390/s390.h
index be566215df2..7d078ce6868 100644
--- a/gcc/config/s390/s390.h
+++ b/gcc/config/s390/s390.h
@@ -368,6 +368,9 @@ extern const char *s390_host_detect_local_cpu (int argc, 
const char **argv);
 /* Allocation boundary (in *bits*) for the code of a function.  */
 #define FUNCTION_BOUNDARY 64
 
+/* Alignment required for a code label, in bits.  */
+#define CODE_LABEL_BOUNDARY 16
+
 /* There is no point aligning anything to a rounder boundary than this.  */
 #define BIGGEST_ALIGNMENT 64
 
diff --git a/gcc/testsuite/gcc.target/s390/asan-no-gotoff.c 
b/gcc/testsuite/gcc.target/s390/asan-no-gotoff.c
new file mode 100644
index 000..f555e4e96f8
--- /dev/null
+++ b/gcc/testsuite/gcc.target/s390/asan-no-gotoff.c
@@ -0,0 +1,15 @@
+/* Test that ASAN labels are referenced without unnecessary indirections.  */
+
+/* { dg-do compile } */
+/* { dg-options "-fPIE -O2 -fsanitize=kernel-address --param asan-stack=1" } */
+
+extern void c (int *);
+
+void a ()
+{
+  int b;
+  c (&b);
+}
+
+/* { dg-final { scan-assembler {\tlarl\t%r\d+,\.LASANPC\d+} } } */
+/* { dg-final { scan-assembler-not {\.LASANPC\d+@GOTOFF} } } */
-- 
2.37.2


