date:20211110

Re: [PATCH] pch: Add support for PCH for relocatable executables

2021-11-10 Thread Iain Sandoe




> On 9 Nov 2021, at 12:18, Jakub Jelinek via Gcc-patches wrote:
> 
> On Tue, Nov 09, 2021 at 11:40:08AM +, Iain Sandoe wrote:
>> There were two issues, of which one remains and probably affects all targets.
>> 
>> 1.  The Darwin PCH memory allocation scheme used a system that works reliably
>>     for no-PIE but not for PIE
>>
>> .. I hacked in a similar scheme to the mmap one used on Linux .. the suspect
>>    stuff there is in choosing some place in the map that is likely to succeed…
>>
>>    With that I get passes on all c-family pch.exp (I didn’t try to bootstrap).
> 
> Yeah, certainly.

Overnight testing for i686, powerpc and x86_64 darwin suggests I’ve found some
suitable compromise map addresses (but that scheme has always seemed a bit
fragile if the ASLR parameters get updated for a new OS edition).

>> 2. This problem remains.
>>
>>  - if we try to emit a diagnostic when the PCH read-in has failed, it seems
>>    that cc1 hangs somewhere in trying to lookup line table info.
>>
>>  - this was happening with the Darwin fixed PCH memory address because it
>>    was trying to report a fatal error in being unable to read the file (or
>>    trying to execute fancy_abort, in response to a segv).
> 
> I guess once we:
>   /* Read in all the scalar variables.  */
>   for (rt = gt_pch_scalar_rtab; *rt; rt++)
>     for (rti = *rt; rti->base != NULL; rti++)
>       if (fread (rti->base, rti->stride, 1, f) != 1)
>         fatal_error (input_location, "cannot read PCH file: %m");
>
>   /* Read in all the global pointers, in 6 easy loops.  */
>   for (rt = gt_ggc_rtab; *rt; rt++)
>     for (rti = *rt; rti->base != NULL; rti++)
>       for (i = 0; i < rti->nelt; i++)
>         if (fread ((char *)rti->base + rti->stride * i,
>                    sizeof (void *), 1, f) != 1)
>           fatal_error (input_location, "cannot read PCH file: %m");
> we overwrite the GTY(()) marked global vars including
> extern GTY(()) class line_maps *line_table;
> with pointers into the area we haven't mapped yet (or, if the error happens
> after that mmap, before everything is fixed up, e.g. by the new relocation
> processing), it is no wonder it doesn't work well.
> 
> Could we save line_table (and perhaps a few other vars) into non-GTY! copies
> of them in ggc-common.c and instead of those fatal_error (input_location, ...)
> calls in gt_pch_restore and ggc_pch_read call fatal_pch_error (...) where
> void
> fatal_pch_error (const char *gmsg)
> {
>  line_table = saved_line_table;
>  // Restore anything else that is needed for fatal_error
>  fatal_error (input_location, gmsg);
> }

That seems reasonable for the case that we call fatal_error from ggc-common, but
I don’t think it will work if fancy_abort is called (for e.g. a segv) - we might
need to make a local fancy_abort() as well for that specific file, perhaps.

Or in some way defer overwriting the data until we’ve succeeded in
reading/relocating the whole file (not sure what the largest PCH is we might
encounter).

ISTR that we force clear everything before starting the read, since I had
problems with phasing diagnostic output when making a previous change to this
area, so the snapshot might be needed quite early.
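
The snapshot-and-restore idea discussed above can be modelled in a few lines of plain C. This is only a toy sketch: `struct line_maps` is reduced to a single flag, and `pch_snapshot_globals`, `fatal_pch_error_model` and `simulate_failed_restore` are hypothetical names that do not exist in GCC. It shows why the snapshot has to be taken before the restore starts clobbering the global.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for GCC's line_maps; the real type is far richer.  */
struct line_maps { int valid; };

/* Stand-in for the GTY(()) global that the PCH restore overwrites.  */
static struct line_maps *line_table;

/* Non-GTY snapshot taken before the restore starts (hypothetical name).  */
static struct line_maps *saved_line_table;

/* Take the snapshot before anything clears or overwrites line_table.  */
static void
pch_snapshot_globals (void)
{
  saved_line_table = line_table;
}

/* Model of the proposed fatal_pch_error: put the known-good pointer back
   and only then hand over to the diagnostic machinery.  Here the
   "diagnostic" merely reports whether the table it sees is usable.  */
static int
fatal_pch_error_model (void)
{
  line_table = saved_line_table;
  return line_table != NULL && line_table->valid;
}

/* Simulate a restore that clobbers the global and then fails.  */
static int
simulate_failed_restore (void)
{
  static struct line_maps good = { 1 };
  static struct line_maps garbage = { 0 };

  line_table = &good;
  pch_snapshot_globals ();
  line_table = &garbage;  /* Models a pointer into a not-yet-mapped area.  */
  return fatal_pch_error_model ();
}
```

If the snapshot were taken after the clobbering assignment, the model would report an unusable table, which is the ordering concern raised in the last paragraph.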

Iain


> 
>   Jakub
>

Re: [PATCH] Improve integer bit test on __atomic_fetch_[or|and]_* returns

2021-11-10 Thread Richard Biener via Gcc-patches

On Wed, Nov 10, 2021 at 6:21 AM liuhongt via Gcc-patches wrote:
>
> > >
> > > +#if GIMPLE
> > > +(match (nop_atomic_bit_test_and_p @0 @1)
> > > + (bit_and:c (nop_convert?@4 (ATOMIC_FETCH_OR_XOR_N @2 INTEGER_CST@0 @3))
> > > +   INTEGER_CST@1)
> >
> > no need for the :c on the bit_and when the 2nd operand is an
>
> Changed.
>
> > INTEGER_CST (likewise below)
> >
> > > + (with {
> > > +int ibit = tree_log2 (@0);
> > > +int ibit2 = tree_log2 (@1);
> > > +   }
> > > +  (if (single_use (@4)
> > > +  && !SSA_NAME_OCCURS_IN_ABNORMAL_PHI (@4)
> >
> > I wonder whether we should handle both of these in the caller to make
> > this a pure IL structure
> > match?  At your preference.
> >
>
> Changed.
> Add a new parameter to nop_atomic_bit_test_and_p for @4 and test @4 in the
> caller.
>
> > > +  && ibit == ibit2
> > > +  && ibit >= 0
> > > +
> > > +(match (nop_atomic_bit_test_and_p @0 @1)
> > > + (bit_and:c (nop_convert?@3 (SYNC_FETCH_OR_XOR_N @2 INTEGER_CST@0))
> > > +   INTEGER_CST@1)
> > > + (with {
> > > +int ibit = tree_log2 (@0);
> > > +int ibit2 = tree_log2 (@1);
> > > +   }
> > > +  (if (single_use (@3)
> > > +  && !SSA_NAME_OCCURS_IN_ABNORMAL_PHI (@3)
> > > +  && ibit == ibit2
> > > +  && ibit >= 0
> > > +
> > > +(match (nop_atomic_bit_test_and_p @0 @1)
> > > + (bit_and:c
> > > +  (nop_convert?@4
> > > +   (ATOMIC_FETCH_OR_XOR_N @2 (nop_convert? (lshift@0 integer_onep@5 @6)) 
> > > @3))
> > > +  @1)
> > > + (if (single_use (@4)
> > > + && !SSA_NAME_OCCURS_IN_ABNORMAL_PHI (@4)
> > > + && operand_equal_p (@0, @1
> >
> > usually for the equality you'd write
> >
> > (ATOMIC_FETCH_OR_XOR_N @2 (nop_convert? (lshift@0 integer_onep@5 @6)) 
> > @3))
> >  @0)
> >
> > thus use @0 in both @0 and @1 places.  Does that not work here?  (the
> > nop_atomic_bit_test_and_p
> > arguments then would be @0 @0).  Likewise below.
> >
>
> It works, changed.
>
> > > +
> > > +(match (nop_atomic_bit_test_and_p @0 @1)
> > > + (bit_and:c
> > > +  (nop_convert?@4
> > > +   (SYNC_FETCH_OR_XOR_N @2 (nop_convert? (lshift@0 integer_onep@3 @5
> > > +  @1)
> > > + (if (single_use (@4)
> > > + && !SSA_NAME_OCCURS_IN_ABNORMAL_PHI (@4)
> > > + && operand_equal_p (@0, @1
> > > +
> > > +(match (nop_atomic_bit_test_and_p @0 @1)
> > > + (bit_and:c@4 (nop_convert?@3 (ATOMIC_FETCH_AND_N @2 INTEGER_CST@0 @5))
> > > + INTEGER_CST@1)
> > > + (with {
> > > +tree mask = const_unop (BIT_NOT_EXPR, TREE_TYPE (@0), @0);
> > > +mask = fold_convert (TREE_TYPE (@4), mask);
> >
> > it's preferred to use wide_int for this, so
> >
> >  int ibit = wi::exact_log2 (wi::bit_not (wi::to_wide (@0)));
> >
> > likewise below.
>
> Changed, with a small adjustment
> int ibit = wi::exact_log2 (wi::zext (wi::bit_not (wi::to_wide (@0)),
> TYPE_PRECISION(type)));
>
> wi::zext is needed when upper bits are all ones after bit_not operation.
> > > +  if (!single_imm_use (use_lhs, &use_p, &use_not_stmt)
> > > +  || !is_gimple_assign (use_not_stmt))
> > > +return nullptr;
> > > +
> > > +  if (gimple_assign_rhs_code (use_not_stmt) != NOP_EXPR)
> >
> >   CONVERT_EXPR_CODE_P (gimple_assign_rhs_code (use_not_stmt))
>
> Changed.
>
> Update patch:

OK.

Thanks,
Richard.
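
The remark in the quoted text about needing wi::zext after the bit_not can be illustrated with ordinary fixed-width integers. The sketch below is a simplified model, not wide_int: it uses a 64-bit host integer to stand for the wider representation, and the helper names (`exact_log2_u64`, `zext_to_precision`, `cleared_bit`) are made up for the example. Inverting an 8-bit AND mask such as 0xEF in a wider container leaves all upper bits set, so the result is no longer a power of two until it is truncated back to the mask's precision.

```c
#include <assert.h>
#include <stdint.h>

/* Return log2 of X if X is an exact power of two, else -1.  */
static int
exact_log2_u64 (uint64_t x)
{
  if (x == 0 || (x & (x - 1)) != 0)
    return -1;
  int n = 0;
  while ((x >>= 1) != 0)
    n++;
  return n;
}

/* Keep only the low PREC bits of X (1 <= PREC <= 64).  */
static uint64_t
zext_to_precision (uint64_t x, unsigned prec)
{
  return prec >= 64 ? x : x & ((UINT64_C (1) << prec) - 1);
}

/* Index of the single bit cleared by an AND mask of width PREC,
   or -1 if the mask does not clear exactly one bit.  */
static int
cleared_bit (uint64_t mask, unsigned prec)
{
  return exact_log2_u64 (zext_to_precision (~mask, prec));
}
```

Without the truncation step, `exact_log2_u64 (~(uint64_t) 0xEF)` fails, while `cleared_bit (0xEF, 8)` finds bit 4.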

>
> 2021-11-04  H.J. Lu  
> Hongtao Liu  
> gcc/
>
> PR middle-end/102566
> * match.pd (nop_atomic_bit_test_and_p): New match.
> * tree-ssa-ccp.c (convert_atomic_bit_not): New function.
> (gimple_nop_atomic_bit_test_and_p): New prototype.
> (optimize_atomic_bit_test_and): Transform equivalent, but slightly
> different cases to their canonical forms.
>
> gcc/testsuite/
>
> PR middle-end/102566
> * g++.target/i386/pr102566-1.C: New test.
> * g++.target/i386/pr102566-2.C: Likewise.
> * g++.target/i386/pr102566-3.C: Likewise.
> * g++.target/i386/pr102566-4.C: Likewise.
> * g++.target/i386/pr102566-5a.C: Likewise.
> * g++.target/i386/pr102566-5b.C: Likewise.
> * g++.target/i386/pr102566-6a.C: Likewise.
> * g++.target/i386/pr102566-6b.C: Likewise.
> * gcc.target/i386/pr102566-1a.c: Likewise.
> * gcc.target/i386/pr102566-1b.c: Likewise.
> * gcc.target/i386/pr102566-2.c: Likewise.
> * gcc.target/i386/pr102566-3a.c: Likewise.
> * gcc.target/i386/pr102566-3b.c: Likewise.
> * gcc.target/i386/pr102566-4.c: Likewise.
> * gcc.target/i386/pr102566-5.c: Likewise.
> * gcc.target/i386/pr102566-6.c: Likewise.
> * gcc.target/i386/pr102566-7.c: Likewise.
> * gcc.target/i386/pr102566-8a.c: Likewise.
> * gcc.target/i386/pr102566-8b.c: Likewise.
> * gcc.target/i386/pr102566-9a.c: Likewise.
> * gcc.target/i386/pr102566-9b.c: Likewise.
> * gcc.target/i386/pr102566-10a.c: Likewise.
> * gcc.target/i386/p

Re: [PATCH] rs6000: Fix a handful of 32-bit built-in function problems in the new support

2021-11-10 Thread Segher Boessenkool

On Tue, Nov 09, 2021 at 03:46:54PM -0600, Bill Schmidt wrote:
> Hi!  Some time ago I realized I hadn't tested the new builtin support against
> 32-bit big-endian in quite a while.  When I did, I found a handful of errors
> that needed correcting.
>  - One builtin needs to be disabled for 32-bit.
>  - One builtin needs to be restricted to 32-bit only.
>  - One builtin used unsigned long when it needed unsigned long long.
>  - Six builtins used unsigned long long when they needed unsigned long.
>  - One test case needed its expected error message adjusted.
> Otherwise things were fine.

> Bootstrapped and tested on powerpc64le-linux-gnu and powerpc64-linux-gnu with
> no regressions.

{-m32,-m64} for the latter, right?

>   * config/rs6000/rs6000-builtin-new.def (CMPB): Flag as no32bit.
>   (BPERMD): Flag as 32bit.
>   (UNPACK_TD): Return unsigned long long instead of unsigned long.
>   (SET_TEXASR): Pass unsigned long instead of unsigned long long.
>   (SET_TEXASRU): Likewise.
>   (SET_TFHAR): Likewise.
>   (SET_TFIAR): Likewise.
>   (TABORTDC): Likewise.
>   (TABORTDCI): Likewise.
>   * config/rs6000/rs6000-call.c (rs6000_expand_new_builtin): Fix error
>   handling for no32bit.  Add 32bit handling for RS6000_BIF_BPERMD.
> 
> gcc/testsuite/
>   * gcc.target/powerpc/cmpb-3.c: Adjust error message.


>const signed long __builtin_bpermd (signed long, signed long);
> -BPERMD bpermd_di {}
> +BPERMD bpermd_di {32bit}

That is not what the old code does?

case POWER7_BUILTIN_BPERMD:
  return rs6000_expand_binop_builtin (((TARGET_64BIT)
   ? CODE_FOR_bpermd_di
   : CODE_FOR_bpermd_si), exp, target);

> -  void __builtin_set_texasr (unsigned long long);
> +  void __builtin_set_texasr (unsigned long);
>  SET_TEXASR nothing {htm,htmspr}
>  
> -  void __builtin_set_texasru (unsigned long long);
> +  void __builtin_set_texasru (unsigned long);
>  SET_TEXASRU nothing {htm,htmspr}
>  
> -  void __builtin_set_tfhar (unsigned long long);
> +  void __builtin_set_tfhar (unsigned long);
>  SET_TFHAR nothing {htm,htmspr}
>  
> -  void __builtin_set_tfiar (unsigned long long);
> +  void __builtin_set_tfiar (unsigned long);
>  SET_TFIAR nothing {htm,htmspr}

This does not seem to be what the existing code does, either?  Try with
-m32 -mpowerpc64 (it extends to 64 bit there, so the builtin does not
have long int as parameter, it has long long int).

> @@ -15758,6 +15759,8 @@ rs6000_expand_new_builtin (tree exp, rtx target,
>  {
>if (fcode == RS6000_BIF_MFTB)
>   icode = CODE_FOR_rs6000_mftb_si;
> +  else if (fcode == RS6000_BIF_BPERMD)
> + icode = CODE_FOR_bpermd_si;
>else
>   gcc_unreachable ();
>  }

But you disabled it for 32 bit now, so huh.

> --- a/gcc/testsuite/gcc.target/powerpc/cmpb-3.c
> +++ b/gcc/testsuite/gcc.target/powerpc/cmpb-3.c
> @@ -8,7 +8,7 @@ void abort ();
>  long long int
>  do_compare (long long int a, long long int b)
>  {
> -  return __builtin_cmpb (a, b);  /* { dg-error "'__builtin_cmpb' is not supported in this compiler configuration" } */
> +  return __builtin_cmpb (a, b);  /* { dg-error "'__builtin_p6_cmpb' is not supported in 32-bit mode" } */
>  }

The original spelling is the correct one?


Segher

Re: [EXTERNAL] Re: [PATCH] PR tree-optimization/102232 Adding a missing pattern to match.pd

2021-11-10 Thread Richard Biener via Gcc-patches

On Tue, Nov 9, 2021 at 5:25 PM Navid Rahimi  wrote:
>
> Hi Richard,
>
> Thank you so much for your detailed feedback. I am attaching another version 
> of the patch which does include all the changes you mentioned.
>
> Below you can see my responses to your feedback:
>
> > the canonical order of the plus is (plus:s (trunc_div ...) integer_onep) as
> > constants come last - you can then remove the 'c'
> Fixed. I was not aware of the canonical order.
>
> > you can use INTEGRAL_TYPE_P (type).
> Fixed. Didn't know about "type" either.
>
> > this test is not necessary
> Fixed.
>
> > But should we also optimize x * (2 + y/x) - y -> 2*x - y % x?  So
> > it looks like a conflict with the x * (1 + b) <-> x + x * b transform
> > (fold_plusminus_mult)?  That said, the special case of one
> > definitely makes the result cheaper (one less operation).
> For this special case, it does indeed remove an operation. But I was not able
> to reproduce it for any other constant [1]. If it were possible with other
> constants, I would have made the pattern more general, like
> "x * (C + y/x) - y -> C*x - y % x". Basically, anything other than 1 wasn't a
> win. Regarding the "x * (1 + b) <-> x + x * b" transformation, it appears to
> me that when there is a "- y" at the end of "x * (1 + b)", there is an
> opportunity to optimize. Without that "- y" I was not able to make any
> operation more performant. In either direction, it looks like the same amount
> of computation.
>
> 1) https://compiler-explorer.com/z/dWsq7Tzf4
>
> > Please move the pattern next to the most closely related one, which I think is
> Fixed.
>
> > the return value of 1 is an unreliable way to fail, please instead call
> > __builtin_abort ();
> Fixed.
>
> > do we really need -O3 for this?  I think that -O should be enough here?
> We don't specifically need that. But I realized that the optimization can
> happen at two different levels in the compiler. It seems that if you spread
> the computation over multiple statements like:
>   int c = a/b;
>   int y = b * (1 + c);
>   return y - a;
>
> instead of:
>   return b * (1 + a / b) - a;
>
> then you need at least -O1 to have it optimized. Granted, I am not
> doing that in the testcase. In the new patch I am changing it to -O. Let me
> know if you have any suggestions.

-O is fine, generally at -O0 we shouldn't expect such transforms to
happen (but they still do, of course).
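
For readers who want to convince themselves of the identity behind the pattern, here is a small C sketch comparing the source form with the simplified form. The function name `foo` and the sample inputs come from the quoted testcase fragment; `foo_simplified` is a name introduced here for illustration. The identity follows from C's truncating division, where a == (a / b) * b + a % b, so b * (1 + a / b) - a == b - a % b whenever nothing overflows.

```c
#include <assert.h>

/* Source form: a division, an addition, a multiplication and a
   subtraction.  */
static int
foo (int a, int b)
{
  return b * (1 + a / b) - a;
}

/* Form the new match.pd pattern is meant to produce: a modulo and a
   subtraction.  */
static int
foo_simplified (int a, int b)
{
  return b - a % b;
}
```

With the inputs from the testcase, 71856034 % 238 is 26, so both forms give 238 - 26 = 212.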

The patch looks OK now.

Thanks,
Richard.

>
> Best wishes,
> Navid.
>
> 
> From: Richard Biener 
> Sent: Tuesday, November 9, 2021 02:36
> To: Navid Rahimi
> Cc: gcc-patches@gcc.gnu.org
> Subject: [EXTERNAL] Re: [PATCH] PR tree-optimization/102232 Adding a missing 
> pattern to match.pd
>
> On Tue, Nov 9, 2021 at 5:12 AM Navid Rahimi via Gcc-patches
>  wrote:
> >
> > Hi GCC community,
> >
> > This patch will add the missed pattern described in bug 102232 [1] to the 
> > match.pd. The testcase will test whether the multiplication and division 
> > has been removed from the code or not. The correctness proof for this 
> > pattern is here [2] in case anyone is curious.
> >
> > PR tree-optimization/102232
> >   * match.pd (x * (1 + y / x) - y) -> (x - y % x): New optimization.
>
> +/* x * (1 + y / x) - y -> x - y % x */
> +(simplify
> + (minus (mult:cs @0 (plus:cs integer_onep (trunc_div:s @1 @0))) @1)
>
> the canonical order of the plus is (plus:s (trunc_div ...) integer_onep) as
> constants come last - you can then remove the 'c'
>
> + (if (INTEGRAL_TYPE_P (TREE_TYPE (@0))
>
> you can use INTEGRAL_TYPE_P (type).
>
> +  && types_match (@0, @1))
>
> this test is not necessary
>
> +  (minus @0 (trunc_mod @1 @0
>
> But should we also optimize x * (2 + y/x) - y -> 2*x - y % x?  So
> it looks like a conflict with the x * (1 + b) <-> x + x * b transform
> (fold_plusminus_mult)?  That said, the special case of one
> definitely makes the result cheaper (one less operation).
>
> Please move the pattern next to the most closely related one, which I think is
>
> /* X - (X / Y) * Y is the same as X % Y.  */
> (simplify
>  (minus (convert1? @0) (convert2? (mult:c (trunc_div @@0 @@1) @1)))
>  (if (INTEGRAL_TYPE_P (type) || VECTOR_INTEGER_TYPE_P (type))
>   (convert (trunc_mod @0 @1
>
> +int
> +main (void)
> +{
> +  // few randomly generated test cases
> +  if (foo (71856034, 238) != 212)
> +{
> +  return 1;
>
> the return value of 1 is an unreliable way to fail, please instead call
> __builtin_abort ();
>
> +/* { dg-options "-O3 -fdump-tree-optimized" } */
>
> do we really need -O3 for this?  I think that -O should be enough here?
>
> Thanks,
> Richard.
>
> >   * gcc.dg/tree-ssa/pr102232.c: testcase for this optimization.
> >
> >
> > 1) https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102232

Re: Values of WIDE_INT_MAX_ELTS in gcc11 and gcc12 are different

2021-11-10 Thread Richard Biener via Gcc-patches

On Tue, Nov 9, 2021 at 6:48 PM Qing Zhao  wrote:
>
> So, based on the discussion so far,  is the following patch good to go?

OK.

Thanks,
Richard.
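
As a side note on the quoted test below, the value -16843010 in the scan-assembler directives is what a 32-bit word looks like when every byte holds the fill pattern. The sketch assumes the pattern byte is 0xFE, which is consistent with that number but is stated here as an assumption; `PATTERN_BYTE` and `pattern_as_int32` are names made up for the example.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Byte used for pattern initialization; 0xFE is assumed here, chosen to
   be consistent with the -16843010 value the quoted test scans for.  */
#define PATTERN_BYTE 0xFE

/* Fill a 32-bit object byte-wise, as a memset-based initializer would,
   and read it back as a signed integer.  */
static int32_t
pattern_as_int32 (void)
{
  unsigned char buf[sizeof (int32_t)];
  int32_t v;

  memset (buf, PATTERN_BYTE, sizeof buf);
  memcpy (&v, buf, sizeof v);
  return v;
}
```

Since all four bytes are equal, the result does not depend on endianness: 0xFEFEFEFE read as a signed 32-bit value is -16843010.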

> Let me know if you have more comments on the following patch:
>
> (At the same time, I am testing this patch on both x86 and aarch64)
>
> thanks.
>
> Qing
>
> diff --git a/gcc/internal-fn.c b/gcc/internal-fn.c
> index 0cba95411a6..e8fd16b9c21 100644
> --- a/gcc/internal-fn.c
> +++ b/gcc/internal-fn.c
> @@ -3059,10 +3059,10 @@ expand_DEFERRED_INIT (internal_fn, gcall *stmt)
>mark_addressable (lhs);
>tree var_addr = build_fold_addr_expr (lhs);
>
> -  tree value = (init_type == AUTO_INIT_PATTERN) ?
> -   build_int_cst (integer_type_node,
> -  INIT_PATTERN_VALUE) :
> -   integer_zero_node;
> +  tree value = (init_type == AUTO_INIT_PATTERN)
> +   ? build_int_cst (integer_type_node,
> +INIT_PATTERN_VALUE)
> +   : integer_zero_node;
>tree m_call = build_call_expr (builtin_decl_implicit (BUILT_IN_MEMSET),
>  3, var_addr, value, var_size);
>/* Expand this memset call.  */
> @@ -3073,15 +3073,17 @@ expand_DEFERRED_INIT (internal_fn, gcall *stmt)
>/* If this variable is in a register use expand_assignment.
>  For boolean scalars force zero-init.  */
>tree init;
> +  scalar_int_mode var_mode;
>if (TREE_CODE (TREE_TYPE (lhs)) != BOOLEAN_TYPE
>   && tree_fits_uhwi_p (var_size)
>   && (init_type == AUTO_INIT_PATTERN
>   || !is_gimple_reg_type (var_type))
>   && int_mode_for_size (tree_to_uhwi (var_size) * BITS_PER_UNIT,
> -   0).exists ())
> +   0).exists (&var_mode)
> + && have_insn_for (SET, var_mode))
> {
>   unsigned HOST_WIDE_INT total_bytes = tree_to_uhwi (var_size);
> - unsigned char *buf = (unsigned char *) xmalloc (total_bytes);
> + unsigned char *buf = XALLOCAVEC (unsigned char, total_bytes);
>   memset (buf, (init_type == AUTO_INIT_PATTERN
> ? INIT_PATTERN_VALUE : 0), total_bytes);
>   tree itype = build_nonstandard_integer_type
> diff --git a/gcc/testsuite/gcc.target/i386/auto-init-6.c b/gcc/testsuite/gcc.target/i386/auto-init-6.c
> index 339f8bc2966..e53385f0eb7 100644
> --- a/gcc/testsuite/gcc.target/i386/auto-init-6.c
> +++ b/gcc/testsuite/gcc.target/i386/auto-init-6.c
> @@ -1,4 +1,6 @@
>  /* Verify pattern initialization for complex type automatic variables.  */
> +/* Note, _Complex long double is initialized to zeroes due to the current
> +   implementation limitation.  */
>  /* { dg-do compile } */
>  /* { dg-options "-ftrivial-auto-var-init=pattern -march=x86-64 -mtune=generic -msse" } */
>
> @@ -15,6 +17,6 @@ _Complex long double foo()
>return result;
>  }
>
> -/* { dg-final { scan-assembler-times "long\t-16843010" 10  { target { ! ia32 } } } } */
> -/* { dg-final { scan-assembler-times "long\t-16843010" 6  { target { ia32 } } } } */
> +/* { dg-final { scan-assembler-times "long\t0" 8  { target { ! ia32 } } } } */
> +/* { dg-final { scan-assembler-times "long\t-16843010" 6  } } */
>
>
>
>
> > On Nov 9, 2021, at 4:44 AM, Richard Biener wrote:
> >
> > On Tue, Nov 9, 2021 at 10:10 AM Jakub Jelinek  wrote:
> >>
> >> On Tue, Nov 09, 2021 at 08:13:57AM +0100, Richard Biener wrote:
>  Hi, I tried both the following patches:
> 
>  Patch1:
> 
>  [opc@qinzhao-ol8u3-x86 gcc]$ git diff
>  diff --git a/gcc/internal-fn.c b/gcc/internal-fn.c
>  index 0cba95411a6..ca49d2b4514 100644
>  --- a/gcc/internal-fn.c
>  +++ b/gcc/internal-fn.c
>  @@ -3073,12 +3073,14 @@ expand_DEFERRED_INIT (internal_fn, gcall *stmt)
>    /* If this variable is in a register use expand_assignment.
>  For boolean scalars force zero-init.  */
>    tree init;
>  +  scalar_int_mode var_mode;
>    if (TREE_CODE (TREE_TYPE (lhs)) != BOOLEAN_TYPE
>   && tree_fits_uhwi_p (var_size)
>   && (init_type == AUTO_INIT_PATTERN
>   || !is_gimple_reg_type (var_type))
>   && int_mode_for_size (tree_to_uhwi (var_size) * BITS_PER_UNIT,
>  -   0).exists ())
>  +   0).exists (&var_mode)
>  + && targetm.scalar_mode_supported_p (var_mode))
> {
>   unsigned HOST_WIDE_INT total_bytes = tree_to_uhwi (var_size);
>   unsigned char *buf = (unsigned char *) xmalloc (total_bytes);
> 
>  AND
> 
>  Patch2:
>  diff --git a/gcc/internal-fn.c b/gcc/internal-fn.c
>  index 0cba95411a6..7f129655926 100644
>  --- a/gcc/internal-fn.c
>  +++ b/gcc/internal-fn.c
>  @@ -3073,12 +3073,14 @@ expand_DEFERRED_INIT (internal_fn, gca

[PATCH] rs6000, libgcc: Fix up -Wmissing-prototypes warning on rs6000/linux-unwind.h

2021-11-10 Thread Jakub Jelinek via Gcc-patches

Hi!

Jonathan reported and I've verified a
In file included from ../../../libgcc/unwind-dw2.c:412:
./md-unwind-support.h:398:6: warning: no previous prototype for ‘ppc_backchain_fallback’ [-Wmissing-prototypes]
  398 | void ppc_backchain_fallback (struct _Unwind_Context *context, void *a)
  |  ^~
warning on powerpc*-linux* libgcc build.

All the other MD_* macro functions are static, so I think the following
is the right thing rather than adding a previous prototype for
ppc_backchain_fallback.
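
As a generic illustration of the reasoning above (not the libgcc code itself), the following C sketch shows the shape of the fix: a helper that is only used inside one translation unit gets internal linkage, so no separate prototype is needed. `record_frame` and `count_frames` are hypothetical names.

```c
#include <assert.h>

/* Hypothetical helper used only in this translation unit.  Defined as

     void record_frame (int *count) { ... }

   with no earlier declaration, it would draw -Wmissing-prototypes.
   Giving it internal linkage avoids the warning without adding a
   prototype elsewhere.  */
static void
record_frame (int *count)
{
  ++*count;
}

/* Caller in the same file, which is the only kind of caller a static
   function can have.  */
static int
count_frames (int n)
{
  int count = 0;

  for (int i = 0; i < n; i++)
    record_frame (&count);
  return count;
}
```

The trade-off is that a static function cannot be called from another file, which is fine for a helper that only a macro in the same header refers to.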

Bootstrapped/regtested on powerpc64le-linux and powerpc64-linux (the latter
with -m32/-m64 testing), ok for trunk?

2021-11-09  Jakub Jelinek  

* config/rs6000/linux-unwind.h (ppc_backchain_fallback): Make it static,
formatting fix.

--- libgcc/config/rs6000/linux-unwind.h.jj  2021-10-15 11:59:16.227682621 +0200
+++ libgcc/config/rs6000/linux-unwind.h 2021-11-09 11:42:06.840353422 +0100
@@ -395,7 +395,8 @@ struct frame_layout
 };
 
 
-void ppc_backchain_fallback (struct _Unwind_Context *context, void *a)
+static void
+ppc_backchain_fallback (struct _Unwind_Context *context, void *a)
 {
   struct frame_layout *current;
   struct trace_arg *arg = a;

Jakub

Re: [PATCH] Loop unswitching: support gswitch statements.

2021-11-10 Thread Richard Biener via Gcc-patches

On Tue, Nov 9, 2021 at 5:41 PM Andrew MacLeod  wrote:
>
> On 11/9/21 8:37 AM, Richard Biener wrote:
> > On Mon, Nov 8, 2021 at 8:45 PM Andrew MacLeod  wrote:
> >> On 11/8/21 10:05 AM, Martin Liška wrote:
> >>> On 9/28/21 22:39, Andrew MacLeod wrote:
>  In Theory, modifying the IL should be fine, it happens already in
>  places, but its not extensively tested under those conditions yet.
> >>> Hello Andrew.
> >>>
> >>> I've just tried using a global gimple_ranger and it crashes when loop
> >>> unswitching duplicates
> >>> some BBs.
> >>>
> >>> Please try the attached patch for:
> >> hey Martin,
> >>
> >> try using this in your tree.  Since nothing else is using a growing BB
> >> right now, I'll let you work with it and see if everything works as
> >> expected before checking it in, just in case we need more tweaking.
> >> With this,
> >>
> >> make RUNTESTFLAGS=dg.exp=loop-unswitch*.c check-gcc
> >>
> >> runs clean.
> >>
> >>
> >> basically, I tried to grow it by either a factor of 10% for the current
> >> BB size when the grow is requested, or some double the needed extra
> >> size, or 128... whichever value is "maximum".  That means it shouldn't be
> >> asking for too much each time, but also not a minimum amount.
> >>
> >> I'm certainly open to suggestion on how much to grow it each time.
> >> Note the vector being grown is ONLY for the SSA_NAME being asked for.. so
> >> it's really an on-demand thing just for specific names, in your case,
> >> mostly just the switch index.
> >>
> >> Let me know how this works for you, and if you have any other issues.
> > So I think in the end we shouldn't need the growing.  Ideally we'd do all
> > the analysis before the first transform, but for that we'd need ranger to
> > be able to "simplify" conditions based on a known true/false predicate
> > that's not yet in the IL.  Consider
> >
> >   for (;;)
> >     {
> >       if (invariant < 3) // A
> >         {
> >           ...
> >         }
> >       if (invariant < 5) // B
> >         {
> >           ...
> >         }
> >     }
> >
> > unswitch analysis will run into the condition 'A' and determine the loop
> > can be unswitched with the condition 'invariant < 3'.  To be able to
> > perform cost assessment and to avoid redundant unswitching we
> > want to determine that if we unswitch with 'invariant < 3' being
> > true then the condition at 'B' is true as well before actually inserting
> > the if (invariant < 3) outside of the loop.
> >
> > So I'm thinking of assigning a gimple_uid to each condition we want to
> > unswitch on and have an array indexed by the uid with meta-data on
> > the unswitch opportunity, the "related" conditions could be marked with
> > the same uid (or some other), and the folding result recorded so that
> > at transform time we can just do the appropriate replacement without
> > invoking ranger again.
> >
> > Now, but how do we arrange for the ranger analysis here?
>
> well, I think there are multiple ways we could do this.  Are you
> always doing this on
>
>    if (invariant < constant) or might it be another ssa-name?

It can be any other SSA name, including local (but also invariant)
defs that need to be moved.

> because if it's always a constant, you could ask for the range of
> invariant at  if (invariant < 3) when you unswitch it and it will
> provide you with [MIN, 2].
>
> when you look at  if (invariant < 5), you can try folding that stmt
> using the range you know already from above..   there's an API to
> fold_stmt() independent from ranger (in gimple-range-fold.h) which lets
> you supply ranges for ssa_names in the order they are found in the stmt,
>
>   bool fold_range (irange &r, gimple *s, irange &r1);
>
> so putting it together, you can do something like:
>
> // decide to unswitch this, ask for the range of invariant on the TRUE edge:
> s1 = first_stmt   :  if (invariant < 3)
> range_of_expr (&ivrange, invariant, TRUE_EDGE)  //  This will set
> ivrange to [MIN, 2].. its value on the TRUE edge
>
> // Now we come to the second if, we try to fold it using the range from
> the first stmt.   if fold_range returns true, it means stmt2_range will
> have the result of folding the stmt. only one range is supplied, so it
> will apply ivrange [MIN, 2] to the first ssa-name it encounters in the
> stmt, and [MIN, 2] < 5  so it will return bool [1,1] for the range of
> the stmt.
>
> s2 = second_stmt  : if (invariant < 5)
> if (fold_range (&stmt2_range, second_stmt, ivrange) &&
> stmt2_range.singleton_p ())
>    {
>    if (!stmt2_range.zero_p ())
>       // result is not zero, so that means this stmt will always
> be true given the range in ivrange substituted for "invariant",
>
> There is a fair amount of flexibility on exactly what you can do,
> depending on how complex you want to get.
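
The folding step described in the quoted text can be mimicked with a toy interval type. This is not the ranger API: `struct int_range`, `range_on_true_edge_lt` and `fold_lt` are names invented for the sketch, and only a single closed interval and the `<` comparison against a constant are modelled. It shows how the range learned from `invariant < 3` decides `invariant < 5` without touching the IL.

```c
#include <assert.h>
#include <limits.h>

/* Closed integer interval [lo, hi]; a toy stand-in for an irange.  */
struct int_range { long lo, hi; };

/* Range of X on the true edge of "if (x < c)": [LONG_MIN, c - 1].
   Assumes c > LONG_MIN.  */
static struct int_range
range_on_true_edge_lt (long c)
{
  struct int_range r = { LONG_MIN, c - 1 };
  return r;
}

/* Fold "x < c" given x in R: 1 = always true, 0 = always false,
   -1 = not decidable from the range alone.  */
static int
fold_lt (struct int_range r, long c)
{
  if (r.hi < c)
    return 1;
  if (r.lo >= c)
    return 0;
  return -1;
}
```

Here `fold_lt (range_on_true_edge_lt (3), 5)` yields 1, matching the "[MIN, 2] < 5" case in the text; swapping the constants gives -1, meaning the second condition still has to be evaluated.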

So in some sense I want to evaluate general predicates, a more
complex example might be that we unswitch on

   if (pred1)

and later we see

   tem = pred1 & pred2;
   if (tem)

where for e

Re: [PATCH] rs6000/doc: Rename future cpu with power10

2021-11-10 Thread Segher Boessenkool

Hi!

On Wed, Nov 10, 2021 at 01:41:25PM +0800, Kewen.Lin wrote:
> Commit 5d9d0c94588 renamed future to power10 and ace60939fd2
> updated the documentation for "future" renaming.  This patch
> is to rename the remaining "future architecture" references in
> documentation.

Good find :-)

> @@ -28613,7 +28613,7 @@ the offset with a symbol reference to a canary in the TLS block.
>  @opindex mpcrel
>  @opindex mno-pcrel
>  Generate (do not generate) pc-relative addressing when the option
> -@option{-mcpu=future} is used.  The @option{-mpcrel} option requires
> +@option{-mcpu=power10} is used.  The @option{-mpcrel} option requires
>  that the medium code model (@option{-mcmodel=medium}) and prefixed
>  addressing (@option{-mprefixed}) options are enabled.

It still sounds strange, and factually incorrect really: the -mpcrel
option says to use pc-relative processing, no matter if -mcpu=power10 is
used or not.  For example, it will work fine with later CPUs as well.

So maybe this should just delete from after "addressing" to the end of
that line?  It already says what the prerequisites are, on the very next
line :-)

Segher

Re: [PATCH] rs6000, libgcc: Fix up -Wmissing-prototypes warning on rs6000/linux-unwind.h

2021-11-10 Thread Segher Boessenkool

On Wed, Nov 10, 2021 at 09:39:12AM +0100, Jakub Jelinek wrote:
> Hi!
> 
> Jonathan reported and I've verified a
> In file included from ../../../libgcc/unwind-dw2.c:412:
> ./md-unwind-support.h:398:6: warning: no previous prototype for ‘ppc_backchain_fallback’ [-Wmissing-prototypes]
>   398 | void ppc_backchain_fallback (struct _Unwind_Context *context, void *a)
>   |  ^~
> warning on powerpc*-linux* libgcc build.
> 
> All the other MD_* macro functions are static, so I think the following
> is the right thing rather than adding a previous prototype for
> ppc_backchain_fallback.
> 
> Bootstrapped/regtested on powerpc64le-linux and powerpc64-linux (the latter
> with -m32/-m64 testing), ok for trunk?

Yes please.  Thanks!


Segher


> 2021-11-09  Jakub Jelinek  
> 
>   * config/rs6000/linux-unwind.h (ppc_backchain_fallback): Make it static,
>   formatting fix.

[Ada] Better error message on missing parentheses

2021-11-10 Thread Pierre-Marie de Rodat via Gcc-patches

Adapt the test to issue a different error message when it is likely that
an if-expression is suspected, but parentheses are missing. This makes
the test more in line with its comment.

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* par-ch4.adb (P_Primary): Adapt test for getting error message
on missing parentheses.

diff --git a/gcc/ada/par-ch4.adb b/gcc/ada/par-ch4.adb
--- a/gcc/ada/par-ch4.adb
+++ b/gcc/ada/par-ch4.adb
@@ -2892,8 +2892,10 @@ package body Ch4 is
if Token_Is_At_Start_Of_Line
  and then not
(Ada_Version >= Ada_2012
- and then Style_Check_Indentation /= 0
- and then Start_Column rem Style_Check_Indentation /= 0)
+  and then
+(Style_Check_Indentation = 0
+   or else
+ Start_Column rem Style_Check_Indentation /= 0))
then
   Error_Msg_AP ("missing operand");
   return Error;

[Ada] Create explicit ghost mirror unit for big integers

2021-11-10 Thread Pierre-Marie de Rodat via Gcc-patches

So far, only one light runtime unit was using the standard unit for
big integers. A special ghost mirror had been created for the reduced
runtimes. In order to use more liberally big integers for proof of the
runtime, rename this ghost mirror into Big_Integers_Ghost at the same
level in the hierarchy of units as Big_Integers.

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* Makefile.rtl: Add unit.
* libgnat/a-nbnbin__ghost.adb: Move...
* libgnat/a-nbnbig.adb: ... here. Mark ghost as ignored.
* libgnat/a-nbnbin__ghost.ads: Move...
* libgnat/a-nbnbig.ads: ... here.  Add comment for purpose of
this unit. Mark ghost as ignored.
* libgnat/s-widthu.adb: Use new unit.
* sem_aux.adb (First_Subtype): Adapt to the case of a ghost type
whose freeze node is rewritten to a null statement.

diff --git a/gcc/ada/Makefile.rtl b/gcc/ada/Makefile.rtl
--- a/gcc/ada/Makefile.rtl
+++ b/gcc/ada/Makefile.rtl
@@ -211,6 +211,7 @@ GNATRTL_NONTASKING_OBJS= \
   a-lllwti$(objext) \
   a-lllzti$(objext) \
   a-locale$(objext) \
+  a-nbnbig$(objext) \
   a-nbnbin$(objext) \
   a-nbnbre$(objext) \
   a-ncelfu$(objext) \


diff --git a/gcc/ada/libgnat/a-nbnbin__ghost.adb b/gcc/ada/libgnat/a-nbnbig.adb
--- a/gcc/ada/libgnat/a-nbnbin__ghost.adb
+++ b/gcc/ada/libgnat/a-nbnbig.adb
@@ -2,7 +2,7 @@
 --  --
 -- GNAT RUN-TIME COMPONENTS --
 --  --
---  ADA.NUMERICS.BIG_NUMBERS.BIG_INTEGERS   --
+--   ADA.NUMERICS.BIG_NUMBERS.BIG_INTEGERS_GHOST--
 --  --
 -- B o d y  --
 --  --
@@ -33,7 +33,12 @@
 --  currently does not compile instantiations of the spec with imported ghost
 --  generics for packages Signed_Conversions and Unsigned_Conversions.
 
-package body Ada.Numerics.Big_Numbers.Big_Integers with
+--  Ghost code in this unit is meant for analysis only, not for run-time
+--  checking. This is enforced by setting the assertion policy to Ignore.
+
+pragma Assertion_Policy (Ghost => Ignore);
+
+package body Ada.Numerics.Big_Numbers.Big_Integers_Ghost with
SPARK_Mode => Off
 is
 
@@ -73,4 +78,4 @@ is
 
end Unsigned_Conversions;
 
-end Ada.Numerics.Big_Numbers.Big_Integers;
+end Ada.Numerics.Big_Numbers.Big_Integers_Ghost;


diff --git a/gcc/ada/libgnat/a-nbnbin__ghost.ads b/gcc/ada/libgnat/a-nbnbig.ads
--- a/gcc/ada/libgnat/a-nbnbin__ghost.ads
+++ b/gcc/ada/libgnat/a-nbnbig.ads
@@ -2,7 +2,7 @@
 --  --
 -- GNAT RUN-TIME COMPONENTS --
 --  --
---  ADA.NUMERICS.BIG_NUMBERS.BIG_INTEGERS   --
+--   ADA.NUMERICS.BIG_NUMBERS.BIG_INTEGERS_GHOST--
 --  --
 -- S p e c  --
 --  --
@@ -13,7 +13,21 @@
 --  --
 --
 
-package Ada.Numerics.Big_Numbers.Big_Integers with
+--  This unit is provided as a replacement for the standard unit
+--  Ada.Numerics.Big_Numbers.Big_Integers when only proof with SPARK is
+--  intended. It cannot be used for execution, as all subprograms are marked
+--  imported with no definition.
+
+--  Contrary to Ada.Numerics.Big_Numbers.Big_Integers, this unit does not
+--  depend on System or Ada.Finalization, which makes it more convenient for
+--  use in run-time units.
+
+--  Ghost code in this unit is meant for analysis only, not for run-time
+--  checking. This is enforced by setting the assertion policy to Ignore.
+
+pragma Assertion_Policy (Ghost => Ignore);
+
+package Ada.Numerics.Big_Numbers.Big_Integers_Ghost with
SPARK_Mode,
Ghost,
Preelaborate
@@ -199,4 +213,4 @@ private
 
type Big_Integer is null record;
 
-end Ada.Numerics.Big_Numbers.Big_Integers;
+end Ada.Numerics.Big_Numbers.Big_Integers_Ghost;


diff --git a/gcc/ada/libgnat/s-widthu.adb b/gcc/ada/libgnat/s-widthu.adb
--- a/gcc/ada/libgnat/s-widthu.adb
+++ b/gcc/ada/libgnat/s-widthu.adb
@@ -29,8 +29,8 @@
 --  --
 --
 
-with Ada.Numerics.Big_Numbers.Big_Integers;
-use Ada.

[Ada] Warn when interfaces swapped between full and partial view

2021-11-10 Thread Pierre-Marie de Rodat via Gcc-patches

The following package declaration is legal but the declaration of D
leads to performing a tree transformation.  Defining D as `type D is new
B and A with null record` would be consistent with the partial view and
thus does not require any transformation.

Warning about the mismatch is helpful in the case of generic packages,
where we fail to correctly transform the tree.

package E is
   type A is interface;
   type B is interface and A;
   type D is new B with private;
private
   type D is new A and B with null record;
end;
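
As a minimal sketch (not part of the patch), the reordered form mentioned
above can be written as a complete unit. The wrapper procedure
Reordered_Interfaces is a hypothetical name introduced only so that the
package compiles on its own. The full view lists B first, as the partial
view does, so presumably no tree transformation and no warning would be
triggered for it.

```ada
--  Sketch only: package E from the description, with the full view of D
--  reordered so that its parent matches the partial view.
procedure Reordered_Interfaces is

   package E is
      type A is interface;
      type B is interface and A;
      type D is new B with private;
   private
      --  B comes first, as in the partial view; A follows as a progenitor
      type D is new B and A with null record;
   end E;

begin
   null;
end Reordered_Interfaces;
```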

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* sem_ch3.adb (Derived_Type_Declaration): Introduce a subprogram
for tree transformation. If a tree transformation is performed,
then warn that it would be better to reorder the interfaces.

diff --git a/gcc/ada/sem_ch3.adb b/gcc/ada/sem_ch3.adb
--- a/gcc/ada/sem_ch3.adb
+++ b/gcc/ada/sem_ch3.adb
@@ -17258,10 +17258,46 @@ package body Sem_Ch3 is
 and then Is_Interface (Parent_Type)
   then
  declare
-Iface   : Node_Id;
 Partial_View: Entity_Id;
 Partial_View_Parent : Entity_Id;
-New_Iface   : Node_Id;
+
+function Reorder_Interfaces return Boolean;
+--  Look for an interface in the full view's interface list that
+--  matches the parent type of the partial view, and when found,
+--  rewrite the full view's parent with the partial view's parent,
+--  append the full view's original parent to the interface list,
+--  recursively call Derived_Type_Definition on the full type, and
+--  return True. If a match is not found, return False.
+--  ??? This seems broken in the case of generic packages.
+
+
+-- Reorder_Interfaces --
+
+
+function Reorder_Interfaces return Boolean is
+   Iface : Node_Id;
+   New_Iface : Node_Id;
+begin
+   Iface := First (Interface_List (Def));
+   while Present (Iface) loop
+  if Etype (Iface) = Etype (Partial_View) then
+ Rewrite (Subtype_Indication (Def),
+   New_Copy (Subtype_Indication (Parent (Partial_View))));
+
+ New_Iface :=
+   Make_Identifier (Sloc (N), Chars (Parent_Type));
+ Append (New_Iface, Interface_List (Def));
+
+ --  Analyze the transformed code
+
+ Derived_Type_Declaration (T, N, Is_Completion);
+ return True;
+  end if;
+
+  Next (Iface);
+   end loop;
+   return False;
+end Reorder_Interfaces;
 
  begin
 --  Look for the associated private type declaration
@@ -17282,30 +17318,26 @@ package body Sem_Ch3 is
then
   null;
 
-   --  Traverse the list of interfaces of the full-view to look
-   --  for the parent of the partial-view and perform the tree
-   --  transformation.
+   --  Traverse the list of interfaces of the full view to look
+   --  for the parent of the partial view and reorder the
+   --  interfaces to match the order in the partial view,
+   --  if needed.
 
else
-  Iface := First (Interface_List (Def));
-  while Present (Iface) loop
- if Etype (Iface) = Etype (Partial_View) then
-Rewrite (Subtype_Indication (Def),
-  New_Copy (Subtype_Indication
- (Parent (Partial_View))));
-
-New_Iface :=
-  Make_Identifier (Sloc (N), Chars (Parent_Type));
-Append (New_Iface, Interface_List (Def));
 
---  Analyze the transformed code
+  if Reorder_Interfaces then
+ --  Having the interfaces listed in any order is legal.
+ --  However, the compiler does not properly handle
+ --  different orders between partial and full views in
+ --  generic units. We give a warning about the order
+ --  mismatch, so the user can work around this problem.
 
-Derived_Type_Declaration (T, N, Is_Completion);
-return;
- end if;
+ Error_Msg_N ("??full declaration does not respect " &
+  "partial declaration order", T);
+ Error_Msg_N ("\??consider reordering", T);
 
- Next (Iface);
-  end loop;
+ return;
+  end if;

[Ada] Extend optimized equality of 2-element arrays

2021-11-10 Thread Pierre-Marie de Rodat via Gcc-patches

Array equality is typically expanded into a loop, but for small arrays
such loops are inefficient (and the code generator might fail to turn
them into linear code, especially when the array contains records).

We already optimize equality of 2-element arrays into an AND THEN expression,
but so far only for array types whose bounds are given by a range expression.
Now we do this for all 2-element arrays with compile-time known bounds,
regardless of how their bounds are given, e.g. for array types declared
like:

   type A1 is array (Integer range 1 .. 2) of ...;
   type A2 is array (Boolean) of ...;
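
For illustration only, the following sketch compares the predefined "="
with a hand-written AND THEN form for both declarations above. The
component type Integer, the object names and the helpers Eq_A1 and Eq_A2
are assumptions made for this example; the real expansion happens inside
the compiler and is not visible at source level.

```ada
procedure Two_Element_Equality is

   type A1 is array (Integer range 1 .. 2) of Integer;
   type A2 is array (Boolean) of Integer;

   X1 : constant A1 := (1 => 10, 2 => 20);
   Y1 : constant A1 := (1 => 10, 2 => 21);
   X2 : constant A2 := (False => 3, True => 4);
   Y2 : constant A2 := (False => 3, True => 4);

   --  Hand-written equivalents of the two-test AND THEN form
   function Eq_A1 (L, R : A1) return Boolean is
     (L (1) = R (1) and then L (2) = R (2));

   function Eq_A2 (L, R : A2) return Boolean is
     (L (False) = R (False) and then L (True) = R (True));

begin
   --  Both forms must agree, whether the arrays are equal or not
   if (X1 = Y1) /= Eq_A1 (X1, Y1)
     or else (X2 = Y2) /= Eq_A2 (X2, Y2)
   then
      raise Program_Error with "AND THEN form disagrees with ""=""";
   end if;
end Two_Element_Equality;
```

The second operand of AND THEN is only evaluated when the first
components match, which is what makes the form linear code rather than
a loop.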

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* exp_ch4.adb (Expand_Array_Equality): Remove check of the array
bound being an N_Range node; use Type_High_Bound/Type_Low_Bound,
which handle all kinds of array bounds.

diff --git a/gcc/ada/exp_ch4.adb b/gcc/ada/exp_ch4.adb
--- a/gcc/ada/exp_ch4.adb
+++ b/gcc/ada/exp_ch4.adb
@@ -1988,14 +1988,16 @@ package body Exp_Ch4 is
 and then Ltyp = Rtyp
 and then Is_Constrained (Ltyp)
 and then Number_Dimensions (Ltyp) = 1
-and then Nkind (First_Idx) = N_Range
-and then Compile_Time_Known_Value (Low_Bound (First_Idx))
-and then Compile_Time_Known_Value (High_Bound (First_Idx))
-and then Expr_Value (High_Bound (First_Idx)) =
- Expr_Value (Low_Bound (First_Idx)) + 1
+and then Compile_Time_Known_Bounds (Ltyp)
+and then Expr_Value (Type_High_Bound (Etype (First_Idx))) =
+   Expr_Value (Type_Low_Bound (Etype (First_Idx))) + 1
   then
  declare
 Ctyp : constant Entity_Id := Component_Type (Ltyp);
+Low_B: constant Node_Id :=
+  Type_Low_Bound (Etype (First_Idx));
+High_B   : constant Node_Id :=
+  Type_High_Bound (Etype (First_Idx));
 L, R : Node_Id;
 TestL, TestH : Node_Id;
 
@@ -2003,28 +2005,24 @@ package body Exp_Ch4 is
 L :=
   Make_Indexed_Component (Loc,
 Prefix  => New_Copy_Tree (New_Lhs),
-Expressions =>
-  New_List (New_Copy_Tree (Low_Bound (First_Idx))));
+Expressions => New_List (New_Copy_Tree (Low_B)));
 
 R :=
   Make_Indexed_Component (Loc,
 Prefix  => New_Copy_Tree (New_Rhs),
-Expressions =>
-  New_List (New_Copy_Tree (Low_Bound (First_Idx))));
+Expressions => New_List (New_Copy_Tree (Low_B)));
 
 TestL := Expand_Composite_Equality (Nod, Ctyp, L, R, Bodies);
 
 L :=
   Make_Indexed_Component (Loc,
 Prefix  => New_Lhs,
-Expressions =>
-  New_List (New_Copy_Tree (High_Bound (First_Idx))));
+Expressions => New_List (New_Copy_Tree (High_B)));
 
 R :=
   Make_Indexed_Component (Loc,
 Prefix  => New_Rhs,
-Expressions =>
-  New_List (New_Copy_Tree (High_Bound (First_Idx))));
+Expressions => New_List (New_Copy_Tree (High_B)));
 
 TestH := Expand_Composite_Equality (Nod, Ctyp, L, R, Bodies);

[Ada] Fix Constraint error on regexp close bracket find algorithm

2021-11-10 Thread Pierre-Marie de Rodat via Gcc-patches

In pattern syntax checking, make a procedure out of the algorithm to
find the close bracket matching an open bracket. Fix cases where the
close bracket is missing in the special cases '-' and '\', e.g.:

- "[a-b"
- "[\b"
- "[\]" misses either a backslash or a close bracket

Before this change, these three cases raised Constraint_Error.
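
A small driver along the following lines could be used to observe the
behaviour on the three patterns listed above. It is a sketch, not part of
the patch: the procedure names are made up, and it only relies on
GNAT.Regexp.Compile and the Error_In_Regexp exception. It prints whichever
exception is raised rather than assuming one, since the result depends on
whether the runtime includes this fix.

```ada
with Ada.Exceptions; use Ada.Exceptions;
with Ada.Text_IO;    use Ada.Text_IO;
with GNAT.Regexp;    use GNAT.Regexp;

procedure Check_Unclosed_Brackets is

   --  Compile Pattern and report how the attempt ended
   procedure Try (Pattern : String) is
   begin
      declare
         R : constant Regexp := Compile (Pattern);
         pragma Unreferenced (R);
      begin
         Put_Line (Pattern & " : accepted");
      end;
   exception
      when Error_In_Regexp =>
         --  The regular diagnostic for an ill-formed pattern
         Put_Line (Pattern & " : Error_In_Regexp");
      when E : others =>
         --  Anything else, e.g. Constraint_Error on an unfixed runtime
         Put_Line (Pattern & " : " & Exception_Name (E));
   end Try;

begin
   Try ("[a-b");
   Try ("[\b");
   Try ("[\]");
end Check_Unclosed_Brackets;
```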

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* libgnat/s-regexp.adb (Check_Well_Formed_Pattern): Fix
Constraint_Error on missing close bracket.

diff --git a/gcc/ada/libgnat/s-regexp.adb b/gcc/ada/libgnat/s-regexp.adb
--- a/gcc/ada/libgnat/s-regexp.adb
+++ b/gcc/ada/libgnat/s-regexp.adb
@@ -122,7 +122,7 @@ package body System.Regexp is
is
   S : String := Pattern;
   --  The pattern which is really compiled (when the pattern is case
-  --  insensitive, we convert this string to lower-cases
+  --  insensitive, we convert this string to lower-cases).
 
   Map : Mapping := (others => 0);
   --  Mapping between characters and columns in the tables
@@ -209,8 +209,59 @@ package body System.Regexp is
  --  The last occurrence of an opening parenthesis, if Glob=False,
  --  or the last occurrence of an opening curly brace, if Glob=True.
 
+ procedure Find_Close_Bracket;
+ --  Go through the pattern to find a closing bracket. Raise an
+ --  exception if none is found.
+
  procedure Raise_Exception_If_No_More_Chars (K : Integer := 0);
- --  If S(J + 1 .. S'Last)'Length < K then call Raise_Exception
+ --  If J + K > S'Last then call Raise_Exception
+
+ 
+ -- Find_Close_Bracket --
+ 
+
+ procedure Find_Close_Bracket is
+Possible_Range_Start : Boolean := True;
+--  Set True everywhere a range character '-' can occur
+
+ begin
+loop
+   exit when S (J) = Close_Bracket;
+
+   Raise_Exception_If_No_More_Chars (1);
+   --  The current character is not a close_bracket, thus it should
+   --  be followed by at least one more char. If not, no close
+   --  bracket is present and the pattern is ill-formed.
+
+   if S (J) = '-' and then S (J + 1) /= Close_Bracket then
+  if not Possible_Range_Start then
+ Raise_Exception
+("No mix of ranges is allowed in "
+& "regular expression", J);
+  end if;
+
+  J := J + 1;
+  Raise_Exception_If_No_More_Chars (1);
+
+  Possible_Range_Start := False;
+  --  Range cannot be followed by '-' character,
+  --  except as last character in the set.
+
+   else
+  Possible_Range_Start := True;
+   end if;
+
+   if S (J) = '\' then
+  J := J + 1;
+  Raise_Exception_If_No_More_Chars (1);
+  --  We ignore the next character and need to check we have
+  --  one more available character. This is necessary for
+  --  the erroneous [\] pattern which stands for [\]] or [\\].
+   end if;
+
+   J := J + 1;
+end loop;
+ end Find_Close_Bracket;
 
  --
  -- Raise_Exception_If_No_More_Chars --
@@ -240,63 +291,23 @@ package body System.Regexp is
  end if;
   end if;
 
-  --  The first character never has a special meaning
-
+  --  Characters ']' and '-' are meant as literals when first
+  --  in the list.  As such, they have no special meaning and
+  --  we pass them.
   if S (J) = ']' or else S (J) = '-' then
  J := J + 1;
  Raise_Exception_If_No_More_Chars;
   end if;
 
-  --  The set of characters cannot be empty
-
   if S (J) = ']' then
+ --  ??? This message is misleading since the check forbids
+ --  the sets []] and [-] but not the empty set [].
  Raise_Exception
("Set of characters cannot be empty in regular "
   & "expression", J);
   end if;
 
-  declare
- Possible_Range_Start : Boolean := True;
- --  Set True everywhere a range character '-' can occur
-
-  begin
- loop
-exit when S (J) = Close_Bracket;
-
---  The current character should be followed by a
---  closing bracket.
-
-Raise_Exception_If_No_More_Chars (1);
-
-if S (J) = '-'
-

[Ada] Fix oversight in latest change to Has_Compatible_Type

2021-11-10 Thread Pierre-Marie de Rodat via Gcc-patches

Adding manual calls to Covers in the callers overlooks the overloaded case,
so this follow-up change puts the reversed calls to Covers back into
Has_Compatible_Type, but guards them with a Boolean flag that is set to True
for comparison operators.

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* sem_type.ads (Has_Compatible_Type): Add For_Comparison parameter.
* sem_type.adb (Has_Compatible_Type): Put back the reversed calls
to Covers guarded with For_Comparison.
* sem_ch4.adb (Analyze_Membership_Op) : Remove new
reversed call to Covers and set For_Comparison to true instead.
(Find_Comparison_Types) : Likewise
(Find_Equality_Types) : Likewise.

diff --git a/gcc/ada/sem_ch4.adb b/gcc/ada/sem_ch4.adb
--- a/gcc/ada/sem_ch4.adb
+++ b/gcc/ada/sem_ch4.adb
@@ -3113,7 +3113,7 @@ package body Sem_Ch4 is
 
   procedure Try_One_Interp (T1 : Entity_Id) is
   begin
- if Has_Compatible_Type (R, T1) or else Covers (Etype (R), T1) then
+ if Has_Compatible_Type (R, T1, For_Comparison => True) then
 if Found
   and then Base_Type (T1) /= Base_Type (T_F)
 then
@@ -6607,8 +6607,7 @@ package body Sem_Ch4 is
  end if;
 
  if Valid_Comparison_Arg (T1)
-   and then (Has_Compatible_Type (R, T1)
-  or else Covers (Etype (R), T1))
+   and then Has_Compatible_Type (R, T1, For_Comparison => True)
  then
 if Found and then Base_Type (T1) /= Base_Type (T_F) then
It := Disambiguate (L, I_F, Index, Any_Type);
@@ -7105,8 +7104,8 @@ package body Sem_Ch4 is
 
  if T1 /= Standard_Void_Type
and then (Universal_Access
-  or else Has_Compatible_Type (R, T1)
-  or else Covers (Etype (R), T1))
+  or else
+ Has_Compatible_Type (R, T1, For_Comparison => True))
 
and then
  ((not Is_Limited_Type (T1)


diff --git a/gcc/ada/sem_type.adb b/gcc/ada/sem_type.adb
--- a/gcc/ada/sem_type.adb
+++ b/gcc/ada/sem_type.adb
@@ -2438,8 +2438,9 @@ package body Sem_Type is
-
 
function Has_Compatible_Type
- (N   : Node_Id;
-  Typ : Entity_Id) return Boolean
+ (N  : Node_Id;
+  Typ: Entity_Id;
+  For_Comparison : Boolean := False) return Boolean
is
   I  : Interp_Index;
   It : Interp;
@@ -2479,6 +2480,12 @@ package body Sem_Type is
or else
  (Nkind (N) = N_String_Literal
and then Present (Find_Aspect (Typ, Aspect_String_Literal)))
+
+   or else
+ (For_Comparison
+   and then not Is_Tagged_Type (Typ)
+   and then Ekind (Typ) /= E_Anonymous_Access_Type
+   and then Covers (Etype (N), Typ))
  then
 return True;
  end if;
@@ -2503,6 +2510,11 @@ package body Sem_Type is
   and then Covers (Typ, Corresponding_Record_Type
  (Etype (It.Typ
 
+ or else
+   (For_Comparison
+ and then not Is_Tagged_Type (Typ)
+ and then Ekind (Typ) /= E_Anonymous_Access_Type
+ and then Covers (It.Typ, Typ))
 then
return True;
 end if;


diff --git a/gcc/ada/sem_type.ads b/gcc/ada/sem_type.ads
--- a/gcc/ada/sem_type.ads
+++ b/gcc/ada/sem_type.ads
@@ -186,11 +186,17 @@ package Sem_Type is
--  right operand, which has one interpretation compatible with that of L.
--  Return the type intersection of the two.
 
-   function Has_Compatible_Type (N : Node_Id; Typ : Entity_Id) return Boolean;
+   function Has_Compatible_Type
+ (N  : Node_Id;
+  Typ: Entity_Id;
+  For_Comparison : Boolean := False) return Boolean;
--  Verify that some interpretation of the node N has a type compatible with
--  Typ. If N is not overloaded, then its unique type must be compatible
--  with Typ. Otherwise iterate through the interpretations of N looking for
-   --  a compatible one.
+   --  a compatible one. If For_Comparison is true, the function is invoked for
+   --  a comparison (or equality) operator and also needs to verify the reverse
+   --  compatibility, because the implementation of type resolution for these
+   --  operators is not fully symmetrical.
 
function Hides_Op (F : Entity_Id; Op : Entity_Id) return Boolean;
--  A user-defined function hides a predefined operator if it matches the

[Ada] Use predefined equality for arrays inside records

2021-11-10 Thread Pierre-Marie de Rodat via Gcc-patches

The equality of arrays inside records always applies the predefined
equality, just like for elementary types. In Expand_Composite_Equality
we had some dedicated code for arrays inside records, which failed to
fully replicate the similar code in Expand_N_Op_Eq, e.g. it did not
apply validity checks.

This patch removes this dedicated and unnecessarily duplicated code.

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* exp_ch4.adb (Expand_Composite_Equality): Handle arrays inside
records just like scalars; only records inside records need
dedicated handling.

diff --git a/gcc/ada/exp_ch4.adb b/gcc/ada/exp_ch4.adb
--- a/gcc/ada/exp_ch4.adb
+++ b/gcc/ada/exp_ch4.adb
@@ -2475,75 +2475,9 @@ package body Exp_Ch4 is
  Full_Type := Underlying_Type (Full_Type);
   end if;
 
-  --  Case of array types
-
-  if Is_Array_Type (Full_Type) then
-
- --  If the operand is an elementary type other than a floating-point
- --  type, then we can simply use the built-in block bitwise equality,
- --  since the predefined equality operators always apply and bitwise
- --  equality is fine for all these cases.
-
- if Is_Elementary_Type (Component_Type (Full_Type))
-   and then not Is_Floating_Point_Type (Component_Type (Full_Type))
- then
-return Make_Op_Eq (Loc, Left_Opnd => Lhs, Right_Opnd => Rhs);
-
- --  For composite component types, and floating-point types, use the
- --  expansion. This deals with tagged component types (where we use
- --  the applicable equality routine) and floating-point (where we
- --  need to worry about negative zeroes), and also the case of any
- --  composite type recursively containing such fields.
-
- else
-declare
-   Comp_Typ : Entity_Id;
-   Hi   : Node_Id;
-   Indx : Node_Id;
-   Ityp : Entity_Id;
-   Lo   : Node_Id;
-
-begin
-   --  Do the comparison in the type (or its full view) and not in
-   --  its unconstrained base type, because the latter operation is
-   --  more complex and would also require an unchecked conversion.
-
-   if Is_Private_Type (Typ) then
-  Comp_Typ := Underlying_Type (Typ);
-   else
-  Comp_Typ := Typ;
-   end if;
-
-   --  Except for the case where the bounds of the type depend on a
-   --  discriminant, or else we would run into scoping issues.
-
-   Indx := First_Index (Comp_Typ);
-   while Present (Indx) loop
-  Ityp := Etype (Indx);
-
-  Lo := Type_Low_Bound (Ityp);
-  Hi := Type_High_Bound (Ityp);
-
-  if (Nkind (Lo) = N_Identifier
-   and then Ekind (Entity (Lo)) = E_Discriminant)
-or else
- (Nkind (Hi) = N_Identifier
-   and then Ekind (Entity (Hi)) = E_Discriminant)
-  then
- Comp_Typ := Full_Type;
- exit;
-  end if;
-
-  Next_Index (Indx);
-   end loop;
-
-   return Expand_Array_Equality (Nod, Lhs, Rhs, Bodies, Comp_Typ);
-end;
- end if;
-
   --  Case of tagged record types
 
-  elsif Is_Tagged_Type (Full_Type) then
+  if Is_Tagged_Type (Full_Type) then
  Eq_Op := Find_Primitive_Eq (Typ);
  pragma Assert (Present (Eq_Op));
 
@@ -2734,7 +2668,7 @@ package body Exp_Ch4 is
 return Expand_Record_Equality (Nod, Full_Type, Lhs, Rhs, Bodies);
  end if;
 
-  --  Non-composite types (always use predefined equality)
+  --  Case of non-record types (always use predefined equality)
 
   else
  return Make_Op_Eq (Loc, Left_Opnd => Lhs, Right_Opnd => Rhs);

[Ada] Don't carry action bodies for expansion of array equality

2021-11-10 Thread Pierre-Marie de Rodat via Gcc-patches

Expansion of array equality creates a function, which needs to be
inserted into the AST. The insertion point was carried from
Expand_N_Op_Eq to Expand_Record_Equality and Expand_Array_Equality,
which were mutually recursive (via Expand_Composite_Equality). Now these
routines are no longer recursive, so there is no need to carry the
insertion point between them.

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* exp_ch3.adb (Make_Eq_Body): Adapt call to
Expand_Record_Equality.
* exp_ch4.ads, exp_ch4.adb (Expand_Composite_Equality): Remove
Bodies parameter; adapt comment; fix style in body; adapt calls
to Expand_Record_Equality.
(Expand_Array_Equality): Adapt calls to
Expand_Composite_Equality.
(Expand_Record_Equality): Remove Bodies parameter; adapt
comment; adapt call to Expand_Composite_Equality.
* exp_ch8.adb (Build_Body_For_Renaming): Adapt call to
Expand_Record_Equality.

diff --git a/gcc/ada/exp_ch3.adb b/gcc/ada/exp_ch3.adb
--- a/gcc/ada/exp_ch3.adb
+++ b/gcc/ada/exp_ch3.adb
@@ -9864,10 +9864,9 @@ package body Exp_Ch3 is
  Expression =>
Expand_Record_Equality
  (Typ,
-  Typ=> Typ,
-  Lhs=> Make_Identifier (Loc, Name_X),
-  Rhs=> Make_Identifier (Loc, Name_Y),
-  Bodies => Declarations (Decl;
+  Typ => Typ,
+  Lhs => Make_Identifier (Loc, Name_X),
+  Rhs => Make_Identifier (Loc, Name_Y;
   end if;
 
   Set_Handled_Statement_Sequence


diff --git a/gcc/ada/exp_ch4.adb b/gcc/ada/exp_ch4.adb
--- a/gcc/ada/exp_ch4.adb
+++ b/gcc/ada/exp_ch4.adb
@@ -146,18 +146,14 @@ package body Exp_Ch4 is
--  where we allow comparison of "out of range" values.
 
function Expand_Composite_Equality
- (Nod: Node_Id;
-  Typ: Entity_Id;
-  Lhs: Node_Id;
-  Rhs: Node_Id;
-  Bodies : List_Id) return Node_Id;
+ (Nod : Node_Id;
+  Typ : Entity_Id;
+  Lhs : Node_Id;
+  Rhs : Node_Id) return Node_Id;
--  Local recursive function used to expand equality for nested composite
-   --  types. Used by Expand_Record/Array_Equality, Bodies is a list on which
-   --  to attach bodies of local functions that are created in the process. It
-   --  is the responsibility of the caller to insert those bodies at the right
-   --  place. Nod provides the Sloc value for generated code. Lhs and Rhs are
-   --  the left and right sides for the comparison, and Typ is the type of the
-   --  objects to compare.
+   --  types. Used by Expand_Record/Array_Equality. Nod provides the Sloc value
+   --  for generated code. Lhs and Rhs are the left and right sides for the
+   --  comparison, and Typ is the type of the objects to compare.
 
procedure Expand_Concatenate (Cnode : Node_Id; Opnds : List_Id);
--  Routine to expand concatenation of a sequence of two or more operands
@@ -1722,8 +1718,7 @@ package body Exp_Ch4 is
  Prefix  => Make_Identifier (Loc, Chars (B)),
  Expressions => Index_List2);
 
- Test := Expand_Composite_Equality
-   (Nod, Component_Type (Typ), L, R, Decls);
+ Test := Expand_Composite_Equality (Nod, Component_Type (Typ), L, R);
 
  --  If some (sub)component is an unchecked_union, the whole operation
  --  will raise program error.
@@ -2012,7 +2007,7 @@ package body Exp_Ch4 is
 Prefix  => New_Copy_Tree (New_Rhs),
 Expressions => New_List (New_Copy_Tree (Low_B)));
 
-TestL := Expand_Composite_Equality (Nod, Ctyp, L, R, Bodies);
+TestL := Expand_Composite_Equality (Nod, Ctyp, L, R);
 
 L :=
   Make_Indexed_Component (Loc,
@@ -2024,7 +2019,7 @@ package body Exp_Ch4 is
 Prefix  => New_Rhs,
 Expressions => New_List (New_Copy_Tree (High_B)));
 
-TestH := Expand_Composite_Equality (Nod, Ctyp, L, R, Bodies);
+TestH := Expand_Composite_Equality (Nod, Ctyp, L, R);
 
 return
   Make_And_Then (Loc, Left_Opnd => TestL, Right_Opnd => TestH);
@@ -2437,18 +2432,15 @@ package body Exp_Ch4 is
--  case because it is not possible to respect normal Ada visibility rules.
 
function Expand_Composite_Equality
- (Nod: Node_Id;
-  Typ: Entity_Id;
-  Lhs: Node_Id;
-  Rhs: Node_Id;
-  Bodies : List_Id) return Node_Id
+ (Nod : Node_Id;
+  Typ : Entity_Id;
+  Lhs : Node_Id;
+  Rhs : Node_Id) return Node_Id
is
   Loc   : constant Source_Ptr := Sloc (Nod);
   Full_Type : Entity_Id;
   Eq_Op : Entity_Id;
 
-   --  Start of processing for Expand_Composite_Equality
-
begin
   if Is_Private_Type (Typ) then
  Full_Type := Underlying_Type (Typ);
@@ -2665,7 +2657,7 @@ package body Exp_Ch4 i

[Ada] Prove double precision integer arithmetic unit

2021-11-10 Thread Pierre-Marie de Rodat via Gcc-patches

This unit is used to implement the runtime support for fixed-point
operations (conversions, multiplication, division and I/O).  Its
correctness is proved with GNATprove.

Proof is performed with GNATprove options --level=4 --prover=all

Proof requires use of the special Big_Integers_Ghost unit for spec and
proof. Mark also the units analyzed as being in SPARK with aspect
SPARK_Mode.

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* libgnat/a-nbnbig.ads: Mark the unit as Pure.
* libgnat/s-aridou.adb: Add contracts and ghost code for proof.
(Scaled_Divide): Reorder operations and use of temporaries in
two places to facilitate proof.
* libgnat/s-aridou.ads: Add full functional contracts.
* libgnat/s-arit64.adb: Mark in SPARK.
* libgnat/s-arit64.ads: Add contracts similar to those from
s-aridou.ads.
* rtsfind.ads: Document the limitation that runtime units
loading does not work for private with-clauses.

patch.diff.gz
Description: application/gzip

[Ada] Do not assume a priority value of zero is a valid priority

2021-11-10 Thread Pierre-Marie de Rodat via Gcc-patches

While a priority value of zero is typically valid on most systems, there
are some targets where zero may be reserved for OS purposes (for
example: RTEMS where zero is reserved for use by the idle thread and
should not be used by applications). This patch removes one occurrence
in GNARL where a Priority object was initialized to zero instead of
Priority'First, and adjusts the Priority type on RTEMS to prevent the
use of priority level zero. By contrast,
System.Tasking.Unspecified_Priority is hardcoded as -1 since it is used
in init.c, which does not have access to the Priority type.
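
The difference between initializing to zero and to Priority'First can be
sketched as follows. RTEMS_Like_Priority is a hypothetical stand-in that
mirrors the adjusted RTEMS range; it is not the real System.Priority from
system-rtems.ads, and the object names are assumptions made for the
example.

```ada
with System;

procedure Priority_Init is

   --  Hypothetical stand-in for the adjusted RTEMS range (1 .. 244)
   subtype RTEMS_Like_Priority is Integer range 1 .. 244;

   --  Portable initial value: the lowest valid priority, whatever it is
   Initial : constant RTEMS_Like_Priority := RTEMS_Like_Priority'First;
   Native  : constant System.Priority     := System.Priority'First;

begin
   --  Zero lies outside the stand-in range, so ":= 0" would fail a
   --  range check there, while 'First is always a member.
   if 0 in RTEMS_Like_Priority or else Initial /= 1 then
      raise Program_Error with "unexpected priority range";
   end if;

   --  On the host target, Priority'First is valid by construction
   if Native not in System.Any_Priority then
      raise Program_Error with "Priority'First out of range";
   end if;
end Priority_Init;
```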

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* libgnarl/s-taskin.adb (Initialize_ATCB): Initialize
T.Common.Current_Priority to Priority'First.
* libgnarl/s-taskin.ads (Unspecified_Priority): Redefined as -1.
* libgnat/system-rtems.ads: Start priority range from 1, as 0 is
reserved by the operating system.

diff --git a/gcc/ada/libgnarl/s-taskin.adb b/gcc/ada/libgnarl/s-taskin.adb
--- a/gcc/ada/libgnarl/s-taskin.adb
+++ b/gcc/ada/libgnarl/s-taskin.adb
@@ -127,7 +127,7 @@ package body System.Tasking is
   end if;
   pragma Assert (T.Common.Domain /= null);
 
-  T.Common.Current_Priority := 0;
+  T.Common.Current_Priority := Priority'First;
   T.Common.Protected_Action_Nesting := 0;
   T.Common.Call := null;
   T.Common.Task_Arg := Task_Arg;


diff --git a/gcc/ada/libgnarl/s-taskin.ads b/gcc/ada/libgnarl/s-taskin.ads
--- a/gcc/ada/libgnarl/s-taskin.ads
+++ b/gcc/ada/libgnarl/s-taskin.ads
@@ -773,7 +773,10 @@ package System.Tasking is
-- Priority info --
---
 
-   Unspecified_Priority : constant Integer := System.Priority'First - 1;
+   Unspecified_Priority : constant Integer := -1;
+   --  Indicates that a task has an unspecified priority. This is hardcoded as
+   --  -1 rather than System.Priority'First - 1 as the value needs to be used
+   --  in init.c to specify that the main task has no specified priority.
 
Priority_Not_Boosted : constant Integer := System.Priority'First - 1;
--  Definition of Priority actually has to come from the RTS configuration


diff --git a/gcc/ada/libgnat/system-rtems.ads b/gcc/ada/libgnat/system-rtems.ads
--- a/gcc/ada/libgnat/system-rtems.ads
+++ b/gcc/ada/libgnat/system-rtems.ads
@@ -109,15 +109,13 @@ package System is
-- hardware priority levels.  Protected Object ceilings can
-- override these values.
--  245is used by the Interrupt_Manager task
-   --  0  is reserved for the RTEMS IDLE task and really should not
-   -- be accessible from Ada but GNAT initializes
-   -- Current_Priority to 0 so it must be valid
+   --  0  is reserved for the RTEMS IDLE task
 
Max_Priority   : constant Positive := 244;
Max_Interrupt_Priority : constant Positive := 254;
 
-   subtype Any_Priority   is Integer  range   0 .. 254;
-   subtype Priority   is Any_Priority range   0 .. 244;
+   subtype Any_Priority   is Integer  range   1 .. 254;
+   subtype Priority   is Any_Priority range   1 .. 244;
subtype Interrupt_Priority is Any_Priority range 245 .. 254;
 
Default_Priority : constant Priority := 122;

[Ada] ACATS BDC1002 shall not error on arbitrary aspect

2021-11-10 Thread Pierre-Marie de Rodat via Gcc-patches

When an arbitrary aspect name is given, as in

pragma Restrictions (No_Specification_of_Aspect => Future_Aspect);

Future_Aspect shall not be rejected. Nevertheless a warning shall be
emitted. In case the unknown aspect might be a misspelling, a hint
should be emitted accordingly.

To ease this spell-checking, Aspect_Spell_Check and
Attribute_Spell_Check are introduced.  Introduce a Bad_Aspect function
similar to Bad_Attribute.

The expression `Get_Aspect_Id (N) /= No_Aspect` is used often enough to
warrant introducing the wrapper `Is_Aspect_Id`, as is done with
`Is_Attribute_Name`.

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* aspects.adb, aspects.ads (Is_Aspect_Id): New function.
* namet-sp.ads, namet-sp.adb (Aspect_Spell_Check,
Attribute_Spell_Check): New Functions.
* par-ch13.adb (Possible_Misspelled_Aspect): Removed.
(With_Present): Use Aspect_Spell_Check, use Is_Aspect_Id.
(Get_Aspect_Specifications): Use Aspect_Spell_Check,
Is_Aspect_Id, Bad_Aspect.
* par-sync.adb (Resync_Past_Malformed_Aspect): Use Is_Aspect_Id.
* sem_ch13.adb (Check_One_Attr): Use Is_Aspect_Id.
* sem_prag.adb (Process_Restrictions_Or_Restriction_Warnings):
Introduce the Process_No_Specification_Of_Aspect, emit a warning
instead of an error on unknown aspect, hint for typos.
Introduce Process_No_Use_Of_Attribute to add spell check for
attributes too.
(Set_Error_Msg_To_Profile_Name): Use Is_Aspect_Id.
* sem_util.adb (Bad_Attribute): Use Attribute_Spell_Check.
(Bad_Aspect): New function.
* sem_util.ads (Bad_Aspect): New function.

diff --git a/gcc/ada/aspects.adb b/gcc/ada/aspects.adb
--- a/gcc/ada/aspects.adb
+++ b/gcc/ada/aspects.adb
@@ -323,6 +323,16 @@ package body Aspects is
   return Present (Find_Aspect (Id, A, Class_Present => Class_Present));
end Has_Aspect;
 
+   --
+   -- Is_Aspect_Id --
+   --
+
+   function Is_Aspect_Id (Aspect : Name_Id) return Boolean is
+ (Get_Aspect_Id (Aspect) /= No_Aspect);
+
+   function Is_Aspect_Id (Aspect : Node_Id) return Boolean is
+ (Get_Aspect_Id (Aspect) /= No_Aspect);
+
--
-- Move_Aspects --
--


diff --git a/gcc/ada/aspects.ads b/gcc/ada/aspects.ads
--- a/gcc/ada/aspects.ads
+++ b/gcc/ada/aspects.ads
@@ -773,6 +773,14 @@ package Aspects is
--  Given an aspect specification, return the corresponding aspect_id value.
--  If the name does not match any aspect, return No_Aspect.
 
+   function Is_Aspect_Id (Aspect : Name_Id) return Boolean;
+   pragma Inline (Is_Aspect_Id);
+   --  Return True if a corresponding aspect id exists
+
+   function Is_Aspect_Id (Aspect : Node_Id) return Boolean;
+   pragma Inline (Is_Aspect_Id);
+   --  Return True if a corresponding aspect id exists
+

-- Delaying Evaluation of Aspects --



diff --git a/gcc/ada/namet-sp.adb b/gcc/ada/namet-sp.adb
--- a/gcc/ada/namet-sp.adb
+++ b/gcc/ada/namet-sp.adb
@@ -23,6 +23,8 @@
 --  --
 --
 
+with Aspects;
+with Snames;
 with System.WCh_Cnv; use System.WCh_Cnv;
 
 with GNAT.UTF_32_Spelling_Checker;
@@ -44,6 +46,44 @@ package body Namet.Sp is
--  either Name_Buffer or Name_Len. The result is in Result (1 .. Length).
--  The caller must ensure that the result buffer is long enough.
 
+   
+   -- Aspect_Spell_Check --
+   
+
+   function Aspect_Spell_Check (Name : Name_Id) return Boolean is
+ (Aspect_Spell_Check (Name) /= No_Name);
+
+   function Aspect_Spell_Check (Name : Name_Id) return Name_Id is
+  use Aspects;
+   begin
+  for J in Aspect_Id_Exclude_No_Aspect loop
+ if Is_Bad_Spelling_Of (Name, Aspect_Names (J)) then
+return Aspect_Names (J);
+ end if;
+  end loop;
+
+  return No_Name;
+   end Aspect_Spell_Check;
+
+   ---
+   -- Attribute_Spell_Check --
+   ---
+
+   function Attribute_Spell_Check (N : Name_Id) return Boolean is
+ (Attribute_Spell_Check (N) /= No_Name);
+
+   function Attribute_Spell_Check (N : Name_Id) return Name_Id is
+  use Snames;
+   begin
+  for J in First_Attribute_Name .. Last_Attribute_Name loop
+ if Is_Bad_Spelling_Of (N, J) then
+return J;
+ end if;
+  end loop;
+
+  return No_Name;
+   end Attribute_Spell_Check;
+

-- Get_Name_String_UTF_32 --



diff --git a/gcc/ada/namet-sp.ads b/gcc/ada/namet-sp.ads
--- a/gcc/ada/namet-sp.ads
+++ b/gcc/ada/namet-sp.ads
@@ -31,6 +31,20 @@
 
 package Namet.Sp is
 
+   function Aspect_Spell_Check (Name : Name_Id) return Boolean;
+   --  Returns

[Ada] Avoid warnings regarding rep clauses in generics

2021-11-10 Thread Pierre-Marie de Rodat via Gcc-patches

Representation-related node fields are not set for types in generic
units, so we should not warn based on the values of such fields. Also
avoid printing the values of such fields for -gnatR.

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* repinfo.adb (List_Common_Type_Info, List_Object_Info): Add
check for In_Generic_Scope.
(List_Component_Layout): Check for known static values.
* sem_ch13.adb (Check_Record_Representation_Clause): Add check
for In_Generic_Scope.

diff --git a/gcc/ada/repinfo.adb b/gcc/ada/repinfo.adb
--- a/gcc/ada/repinfo.adb
+++ b/gcc/ada/repinfo.adb
@@ -38,6 +38,7 @@ with Output; use Output;
 with Osint.C;use Osint.C;
 with Sem_Aux;use Sem_Aux;
 with Sem_Eval;   use Sem_Eval;
+with Sem_Util;
 with Sinfo;  use Sinfo;
 with Sinfo.Nodes;use Sinfo.Nodes;
 with Sinfo.Utils;use Sinfo.Utils;
@@ -426,11 +427,14 @@ package body Repinfo is
  end if;
 
   --  Alignment is not always set for task, protected, and class-wide
-  --  types.
+  --  types. Representation aspects are not computed for types in a
+  --  generic unit.
 
   else
  pragma Assert
-   (Is_Concurrent_Type (Ent) or else Is_Class_Wide_Type (Ent));
+   (Is_Concurrent_Type (Ent) or else
+  Is_Class_Wide_Type (Ent) or else
+  Sem_Util.In_Generic_Scope (Ent));
   end if;
end List_Common_Type_Info;
 
@@ -902,6 +906,13 @@ package body Repinfo is
 
procedure List_Object_Info (Ent : Entity_Id) is
begin
+  --  The information has not been computed in a generic unit, so don't try
+  --  to print it.
+
+  if Sem_Util.In_Generic_Scope (Ent) then
+ return;
+  end if;
+
   Write_Separator;
 
   if List_Representation_Info_To_JSON then
@@ -1176,13 +1187,17 @@ package body Repinfo is
 Write_Str (" range  ");
  end if;
 
- Sbit := Starting_First_Bit + Fbit;
+ if Known_Static_Normalized_First_Bit (Ent) then
+Sbit := Starting_First_Bit + Fbit;
 
- if Sbit >= SSU then
-Sbit := Sbit - SSU;
- end if;
+if Sbit >= SSU then
+   Sbit := Sbit - SSU;
+end if;
 
- UI_Write (Sbit, Decimal);
+UI_Write (Sbit, Decimal);
+ else
+Write_Unknown_Val;
+ end if;
 
  if List_Representation_Info_To_JSON then
 Write_Line (", ");


diff --git a/gcc/ada/sem_ch13.adb b/gcc/ada/sem_ch13.adb
--- a/gcc/ada/sem_ch13.adb
+++ b/gcc/ada/sem_ch13.adb
@@ -12618,9 +12618,11 @@ package body Sem_Ch13 is
   end if;
 
   --  Skip the following warnings if overlap was detected; programmer
-  --  should fix the errors first.
+  --  should fix the errors first. Also skip the warnings for types in
+  --  generics, because their representation information is not fully
+  --  computed.
 
-  if not Overlap_Detected then
+  if not Overlap_Detected and then not In_Generic_Scope (Rectype) then
  --  Check for record holes (gaps)
 
  if Warn_On_Record_Holes then

[Ada] Fix comments about expansion of array equality

2021-11-10 Thread Pierre-Marie de Rodat via Gcc-patches

Expansion of array equality involves two index variables.

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* exp_ch4.adb (Expand_Array_Equality): Fix inconsistent casing
in comment about the template for expansion of array equality;
now we use lower case for true/false/boolean.
(Handle_One_Dimension): Fix comment about the template for
expansion of array equality.

diff --git a/gcc/ada/exp_ch4.adb b/gcc/ada/exp_ch4.adb
--- a/gcc/ada/exp_ch4.adb
+++ b/gcc/ada/exp_ch4.adb
@@ -1541,14 +1541,14 @@ package body Exp_Ch4 is
--  and then
--(B'length (1) = 0 or else B'length (2) = 0)
-- then
-   --return True;-- RM 4.5.2(22)
+   --return true;-- RM 4.5.2(22)
-- end if;
 
-- if A'length (1) /= B'length (1)
--   or else
--   A'length (2) /= B'length (2)
-- then
-   --return False;   -- RM 4.5.2(23)
+   --return false;   -- RM 4.5.2(23)
-- end if;
 
-- declare
@@ -1638,6 +1638,7 @@ package body Exp_Ch4 is
   --  This procedure returns the following code
   --
   --declare
+  --   An : Index_T := A'First (N);
   --   Bn : Index_T := B'First (N);
   --begin
   --   loop

[Ada] Avoid warnings regarding rep clauses in generics -- follow-on

2021-11-10 Thread Pierre-Marie de Rodat via Gcc-patches

CodePeer complains about an uninitialized variable.  This patch fixes
that by initializing Sbit.

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* repinfo.adb (List_Component_Layout): Initialize Sbit.

diff --git a/gcc/ada/repinfo.adb b/gcc/ada/repinfo.adb
--- a/gcc/ada/repinfo.adb
+++ b/gcc/ada/repinfo.adb
@@ -1123,7 +1123,7 @@ package body Repinfo is
  Npos  : constant Uint := Normalized_Position (Ent);
  Fbit  : constant Uint := Normalized_First_Bit (Ent);
  Spos  : Uint;
- Sbit  : Uint;
+ Sbit  : Uint := No_Uint;
  Lbit  : Uint;
 
   begin

[Ada] Warn for bidirectional characters

2021-11-10 Thread Pierre-Marie de Rodat via Gcc-patches

Bidirectional characters can cause security vulnerabilities, as
explained in the paper mentioned in a comment in this patch.
Therefore, we warn if such characters appear in string_literals,
character_literals, or comments.
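
To make the check concrete, here is a minimal C++ sketch of the same test, assuming the input has already been decoded to UTF-32. The code points are the nine listed in the patch's Bidi_Character_Codes table; the function names is_bidi_control and count_bidi_controls are hypothetical and are not part of GNAT.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Returns true for the nine bidirectional control code points that the
// patch lists in Bidi_Character_Codes.
inline bool is_bidi_control(std::uint32_t cp) {
    switch (cp) {
        case 0x202A:  // LRE
        case 0x202B:  // RLE
        case 0x202C:  // PDF
        case 0x202D:  // LRO
        case 0x202E:  // RLO
        case 0x2066:  // LRI
        case 0x2067:  // RLI
        case 0x2068:  // FSI
        case 0x2069:  // PDI
            return true;
        default:
            return false;
    }
}

// Hypothetical helper: counts how many bidirectional controls appear in a
// piece of decoded text, e.g. the contents of a comment or string literal.
inline std::size_t count_bidi_controls(const std::u32string& text) {
    std::size_t n = 0;
    for (char32_t c : text) {
        if (is_bidi_control(static_cast<std::uint32_t>(c))) {
            ++n;
        }
    }
    return n;
}
```

A scanner would call something like count_bidi_controls on each literal or comment and warn when the result is non-zero; plain ASCII never matches, which is consistent with the ChangeLog's note that the check only runs for non-ASCII characters.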

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* scng.adb (Check_Bidi): New procedure to give warning. Note
that this is called only for non-ASCII characters, so should not
be an efficiency issue.
(Slit): Call Check_Bidi for wide characters in string_literals.
(Minus_Case): Call Check_Bidi for wide characters in comments.
(Char_Literal_Case): Call Check_Bidi for wide characters in
character_literals.  Move Accumulate_Checksum down, because
otherwise, if Err is True, the Code is uninitialized.
* errout.ads: Make the obsolete nature of "Insertion character
?" more prominent; one should not have to read several
paragraphs before finding out that it's obsolete.

diff --git a/gcc/ada/errout.ads b/gcc/ada/errout.ads
--- a/gcc/ada/errout.ads
+++ b/gcc/ada/errout.ads
@@ -275,7 +275,7 @@ package Errout is
--  contain subprograms to be inlined in the main program. It is also
--  used by the Compiler_Unit_Warning pragma for similar reasons.
 
-   --Insertion character ? (Question: warning message)
+   --Insertion character ? (Question: warning message -- OBSOLETE)
--  The character ? appearing anywhere in a message makes the message
--  warning instead of a normal error message, and the text of the
--  message will be preceded by "warning:" in the normal case. The
@@ -302,7 +302,7 @@ package Errout is
--  clear that the continuation is part of a warning message, but it is
--  not necessary to go through any computational effort to include it.
--
-   --  Note: this usage is obsolete, use ?? ?*? ?$? ?x? ?.x? ?_x? to
+   --  Note: this usage is obsolete; use ?? ?*? ?$? ?x? ?.x? ?_x? to
--  specify the string to be added when Warn_Doc_Switch is set to True.
--  If this switch is True, then for simple ? messages it has no effect.
--  This simple form is to ease transition and may be removed later


diff --git a/gcc/ada/scng.adb b/gcc/ada/scng.adb
--- a/gcc/ada/scng.adb
+++ b/gcc/ada/scng.adb
@@ -322,6 +322,49 @@ package body Scng is
   --  Returns True if the scan pointer is pointing to the start of a wide
   --  character sequence, does not modify the scan pointer in any case.
 
+  procedure Check_Bidi (Code : Char_Code);
+  --  Give a warning if Code is a bidirectional character, which can cause
+  --  security vulnerabilities. See the following article:
+  --
+  --  @article{boucher_trojansource_2021,
+  --  title = {Trojan {Source}: {Invisible} {Vulnerabilities}},
+  --  author = {Nicholas Boucher and Ross Anderson},
+  --  year = {2021},
+  --  journal = {Preprint},
+  --  eprint = {2111.00169},
+  --  archivePrefix = {arXiv},
+  --  primaryClass = {cs.CR},
+  --  url = {https://arxiv.org/abs/2111.00169}
+  --  }
+
+  
+  -- Check_Bidi --
+  
+
+  type Bidi_Characters is
+(LRE, RLE, LRO, RLO, LRI, RLI, FSI, PDF, PDI);
+  Bidi_Character_Codes : constant array (Bidi_Characters) of Char_Code :=
+(LRE => 16#202A#,
+ RLE => 16#202B#,
+ LRO => 16#202D#,
+ RLO => 16#202E#,
+ LRI => 16#2066#,
+ RLI => 16#2067#,
+ FSI => 16#2068#,
+ PDF => 16#202C#,
+ PDI => 16#2069#);
+  --  Above are the bidirectional characters, along with their Unicode code
+  --  points.
+
+  procedure Check_Bidi (Code : Char_Code) is
+  begin
+ for Bidi_Code of Bidi_Character_Codes loop
+if Code = Bidi_Code then
+   Error_Msg ("??bidirectional wide character", Wptr);
+end if;
+ end loop;
+  end Check_Bidi;
+
   ---
   -- Double_Char_Token --
   ---
@@ -1070,6 +1113,8 @@ package body Scng is
   if Err then
  Error_Illegal_Wide_Character;
  Code := Get_Char_Code (' ');
+  else
+ Check_Bidi (Code);
   end if;
 
   Accumulate_Checksum (Code);
@@ -1611,11 +1656,11 @@ package body Scng is
 
   elsif Start_Of_Wide_Character then
  declare
-Wptr : constant Source_Ptr := Scan_Ptr;
 Code : Char_Code;
 Err  : Boolean;
 
  begin
+Wptr := Scan_Ptr;
 Scan_Wide (Source, Scan_Ptr, Code, Err);
 
 --  If not well formed wide character, then just skip
@@ -1629,6 +1674,8 @@ package body Scng is

[Ada] Minor cleanup in translation of calls to subprograms

2021-11-10 Thread Pierre-Marie de Rodat via Gcc-patches

This gets rid of the DECL_STUBBED_P macro and adjusts Call_to_gnu.

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* gcc-interface/ada-tree.h (DECL_STUBBED_P): Delete.
* gcc-interface/decl.c (gnat_to_gnu_entity): Do not set it.
* gcc-interface/trans.c (Call_to_gnu): Use GNAT_NAME local variable
and adjust accordingly.  Replace test on DECL_STUBBED_P with direct
test on Convention and move it down in the processing.

diff --git a/gcc/ada/gcc-interface/ada-tree.h b/gcc/ada/gcc-interface/ada-tree.h
--- a/gcc/ada/gcc-interface/ada-tree.h
+++ b/gcc/ada/gcc-interface/ada-tree.h
@@ -410,10 +410,6 @@ do {		   \
 
 /* Flags added to decl nodes.  */
 
-/* Nonzero in a FUNCTION_DECL that represents a stubbed function
-   discriminant.  */
-#define DECL_STUBBED_P(NODE) DECL_LANG_FLAG_0 (FUNCTION_DECL_CHECK (NODE))
-
 /* Nonzero in a VAR_DECL if it is guaranteed to be constant after having
been elaborated and TREE_READONLY is not set on it.  */
 #define DECL_READONLY_ONCE_ELAB(NODE) DECL_LANG_FLAG_0 (VAR_DECL_CHECK (NODE))


diff --git a/gcc/ada/gcc-interface/decl.c b/gcc/ada/gcc-interface/decl.c
--- a/gcc/ada/gcc-interface/decl.c
+++ b/gcc/ada/gcc-interface/decl.c
@@ -4095,19 +4095,14 @@ gnat_to_gnu_entity (Entity_Id gnat_entity, tree gnu_expr, bool definition)
 	else if (extern_flag && gnu_ext_name == DECL_NAME (realloc_decl))
 	  gnu_decl = realloc_decl;
 	else
-	  {
-		gnu_decl
-		  = create_subprog_decl (gnu_entity_name, gnu_ext_name,
-	 gnu_type, gnu_param_list,
-	 inline_status, public_flag,
-	 extern_flag, artificial_p,
-	 debug_info_p,
-	 definition && imported_p, attr_list,
-	 gnat_entity);
-
-		DECL_STUBBED_P (gnu_decl)
-		  = (Convention (gnat_entity) == Convention_Stubbed);
-	  }
+	  gnu_decl
+		= create_subprog_decl (gnu_entity_name, gnu_ext_name,
+   gnu_type, gnu_param_list,
+   inline_status, public_flag,
+   extern_flag, artificial_p,
+   debug_info_p,
+   definition && imported_p, attr_list,
+   gnat_entity);
 	  }
   }
   break;


diff --git a/gcc/ada/gcc-interface/trans.c b/gcc/ada/gcc-interface/trans.c
--- a/gcc/ada/gcc-interface/trans.c
+++ b/gcc/ada/gcc-interface/trans.c
@@ -4453,13 +4453,14 @@ static tree
 Call_to_gnu (Node_Id gnat_node, tree *gnu_result_type_p, tree gnu_target,
 	 atomic_acces_t atomic_access, bool atomic_sync)
 {
+  const Node_Id gnat_name = Name (gnat_node);
   const bool function_call = (Nkind (gnat_node) == N_Function_Call);
   const bool returning_value = (function_call && !gnu_target);
   /* The GCC node corresponding to the GNAT subprogram name.  This can either
  be a FUNCTION_DECL node if we are dealing with a standard subprogram call,
  or an indirect reference expression (an INDIRECT_REF node) pointing to a
  subprogram.  */
-  tree gnu_subprog = gnat_to_gnu (Name (gnat_node));
+  tree gnu_subprog = gnat_to_gnu (gnat_name);
   /* The FUNCTION_TYPE node giving the GCC type of the subprogram.  */
   tree gnu_subprog_type = TREE_TYPE (gnu_subprog);
   /* The return type of the FUNCTION_TYPE.  */
@@ -4482,50 +4483,16 @@ Call_to_gnu (Node_Id gnat_node, tree *gnu_result_type_p, tree gnu_target,
   atomic_acces_t aa_type;
   bool aa_sync;
 
-  gcc_assert (FUNC_OR_METHOD_TYPE_P (gnu_subprog_type));
-
-  /* If we are calling a stubbed function, raise Program_Error, but Elaborate
- all our args first.  */
-  if (TREE_CODE (gnu_subprog) == FUNCTION_DECL && DECL_STUBBED_P (gnu_subprog))
-{
-  tree call_expr = build_call_raise (PE_Stubbed_Subprogram_Called,
-	 gnat_node, N_Raise_Program_Error);
-
-  for (gnat_actual = First_Actual (gnat_node);
-	   Present (gnat_actual);
-	   gnat_actual = Next_Actual (gnat_actual))
-	add_stmt (gnat_to_gnu (gnat_actual));
-
-  if (returning_value)
-	{
-	  *gnu_result_type_p = gnu_result_type;
-	  return build1 (NULL_EXPR, gnu_result_type, call_expr);
-	}
-
-  return call_expr;
-}
-
-  if (TREE_CODE (gnu_subprog) == FUNCTION_DECL)
-{
-  /* For a call to a nested function, check the inlining status.  */
-  if (decl_function_context (gnu_subprog))
-	check_inlining_for_nested_subprog (gnu_subprog);
-
-  /* For a recursive call, avoid explosion due to recursive inlining.  */
-  if (gnu_subprog == current_function_decl)
-	DECL_DISREGARD_INLINE_LIMITS (gnu_subprog) = 0;
-}
-
-  /* The only way we can be making a call via an access type is if Name is an
+  /* The only way we can make a call via an access type is if GNAT_NAME is an
  explicit dereference.  In that case, get the list of formal args from the
  type the access type is pointing to.  Otherwise, get the formals from the
  entity being called.  */
-  if (Nkind (Name (gnat_node)) == N_Explicit_Dereference)
+  if (Nkind (gnat_name) == N_Explicit_Dereference)
 {
   const Entity_Id gnat_prefix_type
-	= Underlying_Type (Etype (Prefix (Name (gnat_no

Re: [PATCH] Loop unswitching: support gswitch statements.

2021-11-10 Thread Richard Biener via Gcc-patches

On Tue, Nov 9, 2021 at 5:44 PM Martin Liška  wrote:
>
> On 11/9/21 14:37, Richard Biener wrote:
> > On Mon, Nov 8, 2021 at 8:45 PM Andrew MacLeod  wrote:
> >>
> >> On 11/8/21 10:05 AM, Martin Liška wrote:
> >>> On 9/28/21 22:39, Andrew MacLeod wrote:
>  In theory, modifying the IL should be fine; it already happens in
>  places, but it's not extensively tested under those conditions yet.
> >>>
> >>> Hello Andrew.
> >>>
> >>> I've just tried using a global gimple_ranger and it crashes when loop
> >>> unswitching duplicates
> >>> some BBs.
> >>>
> >>> Please try the attached patch for:
> >>
> >> hey Martin,
> >>
> >> try using this in your tree.  Since nothing else is using a growing BB
> >> right now, I'll let you work with it and see if everything works as
> >> expected before checking it in, just in case we need more tweaking.
> >> With this,
> >>
> >> make RUNTESTFLAGS=dg.exp=loop-unswitch*.c check-gcc
> >>
> >> runs clean.
> >>
> >>
> >> Basically, I tried to grow it by 10% of the current BB size when the
> >> growth is requested, or by double the needed extra size, or by 128,
> >> whichever value is largest.  That means it shouldn't be asking for
> >> too much each time, but also not for a minimal amount.
> >>
> >> I'm certainly open to suggestions on how much to grow it each time.
> >> Note the vector being grown is ONLY for the SSA_NAME being asked for, so
> >> it's really an on-demand thing just for specific names; in your case,
> >> mostly just the switch index.
> >>
> >> Let me know how this works for you, and if you have any other issues.
> >
> > So I think in the end we shouldn't need the growing.  Ideally we'd do all
> > the analysis before the first transform, but for that we'd need ranger to
> > be able to "simplify" conditions based on a known true/false predicate
> > that's not yet in the IL.  Consider
> >
> >   for (;;)
> > {
> >  if (invariant < 3) // A
> >{
> > ...
> >}
> >  if (invariant < 5) // B
> >{
> > ...
> >}
> > }
> >
> > unswitch analysis will run into the condition 'A' and determine the loop
> > can be unswitched with the condition 'invariant < 3'.  To be able to
> > perform cost assessment and to avoid redundant unswitching we
> > want to determine that if we unswitch with 'invariant < 3' being
> > true then the condition at 'B' is true as well before actually inserting
> > the if (invariant < 3) outside of the loop.
> >
> > So I'm thinking of assigning a gimple_uid to each condition we want to
> > unswitch on and have an array indexed by the uid with meta-data on
> > the unswitch opportunity, the "related" conditions could be marked with
> > the same uid (or some other), and the folding result recorded so that
> > at transform time we can just do the appropriate replacement without
> > invoking ranger again.
>
> Calculating all this before transformation is quite ambitious based on the
> code we have now.
>
> Note one can have in a loop:
>
> if (a > 100)
> ...
>
> switch (a)
> case 1000:
>   ...
> case 20:
>   ...
> case 200:
>   ...
>
> which means the first predicate effectively makes some cases unreachable.
> Moreover, one can have
>
> if (a > 100 && b < 300)
> ...
>
> and more complex conditions.

True - I guess we should do two things.

 1) keep simplify_using_entry_checks like code for symbolic conditions
 2) add integer ranges for unswitch conditions producing them, that
 includes all unswitching of switch stmts - we might be able to use
 the ranger queries (with global ranges) to simplify stmts with the
 known ranges as noted by Andrew

I do think that pre-computing the simplifications is what we should do
to be able to make the cost modeling sane.  What we can avoid
trying is evaluating multiple unswitch possibilities to pick the "best".

I think changing the code to do the analysis first should be done
before wiring in gcond support; even adding the additional 'range'
capability will be useful without that, since the current code
won't figure out a > 5 is true when we unswitch on a > 3.
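
As a rough illustration of the 'range' capability, the example quoted earlier in this thread (once 'invariant < 3' is known true, 'invariant < 5' folds to true) can be sketched with plain integer intervals. This is only a sketch under stated assumptions: it does not use ranger or any GCC API, and IntRange, Tri, range_if_less_than, range_if_not_less_than and fold_less_than are hypothetical names.

```cpp
#include <limits>

// Inclusive integer interval [lo, hi].
struct IntRange {
    long long lo;
    long long hi;
};

enum class Tri { False, True, Unknown };

// Range of x on the copy of the loop where 'x < bound' is true.
// Assumes bound is greater than the minimum representable value.
inline IntRange range_if_less_than(long long bound) {
    return {std::numeric_limits<long long>::min(), bound - 1};
}

// Range of x on the copy of the loop where 'x < bound' is false.
inline IntRange range_if_not_less_than(long long bound) {
    return {bound, std::numeric_limits<long long>::max()};
}

// Try to fold the condition 'x < bound' given that x lies in r.
inline Tri fold_less_than(const IntRange& r, long long bound) {
    if (r.hi < bound) {
        return Tri::True;   // every value in the range satisfies x < bound
    }
    if (r.lo >= bound) {
        return Tri::False;  // no value in the range satisfies x < bound
    }
    return Tri::Unknown;    // still an unswitching opportunity
}
```

With these helpers, the 'invariant < 3' == true copy folds 'invariant < 5' to true, while the 'invariant < 3' == false copy leaves it unknown, which matches the observation in the quoted text that the false copy still has an unswitching opportunity on 'invariant < 5'.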

> >
> > Now, but how do we arrange for the ranger analysis here?
>
> That's likely something we need support from ranger, yes.
>
> >
> > We might also somehow want to remember that on the
> > 'invariant < 3' == false copy of the loop there's still the
> > unswitching opportunity on 'invariant < 5', but not on the
> > 'invariant < 5' == true copy.
> >
> > Currently unswitching uses a custom simplify_using_entry_checks
> > which tries to do simplification only after the fact (and so costing
> > also is far from costing the true cost and ordering of the opportunities
> > to do the best first is not implemented either).
>
> I'm sending an updated version of the patch where I changed:
> - simplify_using_entry_checks is put back for the floating point expressions
> - all scans utilize scan-tree-dump-times
> - some new tests were added
> - global ranger is

[PATCH] x86: Update -mtune=alderlake

2021-11-10 Thread Cui,Lili via Gcc-patches

Hi Uros,

This patch is to update mtune for alderlake.

Bootstrap is ok, and no regressions for i386/x86-64 testsuite.

OK for master?

Update mtune for alderlake.  Alder Lake Intel Hybrid Technology will not support
Intel® AVX-512.  ISA features such as Intel® AVX, AVX-VNNI, Intel® AVX2, and
UMONITOR/UMWAIT/TPAUSE are supported.

gcc/ChangeLog

* config/i386/i386-options.c (m_CORE_AVX2): Remove Alderlake
from m_CORE_AVX2.
(processor_cost_table): Use alderlake_cost for Alderlake.
* config/i386/i386.c (ix86_sched_init_global): Handle Alderlake.
* config/i386/x86-tune-costs.h (struct processor_costs): Add alderlake
cost.
* config/i386/x86-tune-sched.c (ix86_issue_rate): Change Alderlake
issue rate to 4.
(ix86_adjust_cost): Handle Alderlake.
* config/i386/x86-tune.def (X86_TUNE_SCHEDULE): Enable for Alderlake.
(X86_TUNE_PARTIAL_REG_DEPENDENCY): Likewise.
(X86_TUNE_SSE_PARTIAL_REG_DEPENDENCY): Likewise.
(X86_TUNE_SSE_PARTIAL_REG_FP_CONVERTS_DEPENDENCY): Likewise.
(X86_TUNE_SSE_PARTIAL_REG_CONVERTS_DEPENDENCY): Likewise.
(X86_TUNE_MEMORY_MISMATCH_STALL): Likewise.
(X86_TUNE_USE_LEAVE): Likewise.
(X86_TUNE_PUSH_MEMORY): Likewise.
(X86_TUNE_USE_INCDEC): Likewise.
(X86_TUNE_INTEGER_DFMODE_MOVES): Likewise.
(X86_TUNE_MISALIGNED_MOVE_STRING_PRO_EPILOGUES): Likewise.
(X86_TUNE_USE_SAHF): Likewise.
(X86_TUNE_USE_BT): Likewise.
(X86_TUNE_AVOID_FALSE_DEP_FOR_BMI): Likewise.
(X86_TUNE_ONE_IF_CONV_INSN): Likewise.
(X86_TUNE_AVOID_MFENCE): Likewise.
(X86_TUNE_USE_SIMODE_FIOP): Likewise.
(X86_TUNE_EXT_80387_CONSTANTS): Likewise.
(X86_TUNE_SSE_UNALIGNED_LOAD_OPTIMAL): Likewise.
(X86_TUNE_SSE_UNALIGNED_STORE_OPTIMAL): Likewise.
(X86_TUNE_SSE_TYPELESS_STORES): Likewise.
(X86_TUNE_SSE_LOAD0_BY_PXOR): Likewise.
(X86_TUNE_AVOID_4BYTE_PREFIXES): Likewise.
(X86_TUNE_USE_GATHER): Disable for Alderlake.
(X86_TUNE_AVX256_MOVE_BY_PIECES): Likewise.
(X86_TUNE_AVX256_STORE_BY_PIECES): Likewise.
---
 gcc/config/i386/i386-options.c   |   4 +-
 gcc/config/i386/i386.c   |   1 +
 gcc/config/i386/x86-tune-costs.h | 120 +++
 gcc/config/i386/x86-tune-sched.c |   2 +
 gcc/config/i386/x86-tune.def |  58 +++
 5 files changed, 155 insertions(+), 30 deletions(-)

diff --git a/gcc/config/i386/i386-options.c b/gcc/config/i386/i386-options.c
index e7a3bd4aaea..a8cc0664f11 100644
--- a/gcc/config/i386/i386-options.c
+++ b/gcc/config/i386/i386-options.c
@@ -131,7 +131,7 @@ along with GCC; see the file COPYING3.  If not see
   | m_ICELAKE_CLIENT | m_ICELAKE_SERVER | m_CASCADELAKE \
   | m_TIGERLAKE | m_COOPERLAKE | m_SAPPHIRERAPIDS \
   | m_ROCKETLAKE)
-#define m_CORE_AVX2 (m_HASWELL | m_SKYLAKE | m_ALDERLAKE | m_CORE_AVX512)
+#define m_CORE_AVX2 (m_HASWELL | m_SKYLAKE | m_CORE_AVX512)
 #define m_CORE_ALL (m_CORE2 | m_NEHALEM  | m_SANDYBRIDGE | m_CORE_AVX2)
 #define m_GOLDMONT (HOST_WIDE_INT_1Uinteger and integer->SSE moves 
*/
+  6, 6,/* mask->integer and integer->mask 
moves */
+  {6, 6, 6},   /* cost of loading mask register
+  in QImode, HImode,

[PATCH] dwarf2out: Fix up field_byte_offset [PR101378]

2021-11-10 Thread Jakub Jelinek via Gcc-patches

Hi!

For PCC_BITFIELD_TYPE_MATTERS field_byte_offset has quite large code
to deal with it since many years ago (see it e.g. in GCC 3.2, although it
used to be on HOST_WIDE_INTs, then on double_ints, now on offset_ints).
But that code apparently isn't able to cope with members with empty class
types with [[no_unique_address]] attribute, because the empty classes have
non-zero type size but zero decl size and so one can end up from the
computation with negative offset or offset 1 byte smaller than it should be.
For !PCC_BITFIELD_TYPE_MATTERS, we just use
tree_result = byte_position (decl);
which seems exactly right even for the empty classes or anything which is
not a bitfield (and for which we don't add DW_AT_bit_offset attribute).
So, instead of trying to handle those no_unique_address members in the
current already very complicated code, this limits it to bitfields.
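
For readers less familiar with the C++ side, the following sketch shows the kind of member involved. It mirrors the struct from the new test; the helper all_members_distinct is hypothetical, and the exact offsets are ABI-specific, so the code only checks what the language guarantees: the empty class has non-zero size, and three members of the same type get distinct addresses.

```cpp
// An empty class: its type size is non-zero even though it has no data.
struct E {};

// Same shape as the struct in the new pr101378.C test.
struct S {
    [[no_unique_address]] E e, f, g;
};

// Hypothetical helper: objects of the same type must have distinct
// addresses, so e, f and g cannot all be placed at offset 0.
inline bool all_members_distinct(const S& s) {
    const void* pe = &s.e;
    const void* pf = &s.f;
    const void* pg = &s.g;
    return pe != pf && pf != pg && pe != pg;
}
```

On the targets mentioned below the test expects byte offsets 0, 1 and 2 for these members, which is what byte_position (decl) yields.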

stor-layout.c PCC_BITFIELD_TYPE_MATTERS handling also affects only
bitfields, twice it checks DECL_BIT_FIELD and once DECL_BIT_FIELD_TYPE.

The only thing I'm unsure about is whether DECL_BIT_FIELD or
DECL_BIT_FIELD_TYPE should be tested.  I thought it
doesn't matter, but it seems stor-layout.c in some cases clears
DECL_BIT_FIELD if their TYPE_MODE can express the type exactly, and
dwarf2out.c (gen_field_die) uses
  if (DECL_BIT_FIELD_TYPE (decl))
to decide if DW_AT_bit_offset etc. attributes should be added.
So maybe I should go with && DECL_BIT_FIELD_TYPE (decl) instead.
On
struct S { int e; int a : 1, b : 7, c : 8, d : 16; } s;
struct T { int a : 1, b : 7; long long c : 8; int d : 16; } t;
it doesn't make a difference though on x86_64, ppc64le nor ppc64...

I think Ada has bitfields of aggregate types, so CCing Eric, though
I'd hope it doesn't have bitfields where type size is smaller than
field decl size like C++ has.

Bootstrapped/regtested on x86_64-linux, i686-linux, powerpc64le-linux
and powerpc64-linux and Pedro has tested it on GDB testsuite.

I can bootstrap/regtest the
+  && DECL_BIT_FIELD_TYPE (decl)
version too.

2021-11-10  Jakub Jelinek  

PR debug/101378
* dwarf2out.c (field_byte_offset): Do the PCC_BITFIELD_TYPE_MATTERS
handling only for DECL_BIT_FIELD decls.

* g++.dg/debug/dwarf2/pr101378.C: New test.

--- gcc/dwarf2out.c.jj  2021-11-05 10:19:46.339457342 +0100
+++ gcc/dwarf2out.c 2021-11-09 15:01:51.425437717 +0100
@@ -19646,6 +19646,7 @@ field_byte_offset (const_tree decl, stru
  properly dynamic byte offsets only when PCC bitfield type doesn't
  matter.  */
   if (PCC_BITFIELD_TYPE_MATTERS
+  && DECL_BIT_FIELD (decl)
   && TREE_CODE (DECL_FIELD_OFFSET (decl)) == INTEGER_CST)
 {
   offset_int object_offset_in_bits;
--- gcc/testsuite/g++.dg/debug/dwarf2/pr101378.C.jj 2021-11-09 15:17:39.504975396 +0100
+++ gcc/testsuite/g++.dg/debug/dwarf2/pr101378.C 2021-11-09 15:17:28.067137556 +0100
@@ -0,0 +1,13 @@
+// PR debug/101378
+// { dg-do compile { target c++11 } }
+// { dg-options "-gdwarf-5 -dA" }
+// { dg-final { scan-assembler-times "0\[^0-9x\\r\\n\]* DW_AT_data_member_location" 1 } }
+// { dg-final { scan-assembler-times "1\[^0-9x\\r\\n\]* DW_AT_data_member_location" 1 } }
+// { dg-final { scan-assembler-times "2\[^0-9x\\r\\n\]* DW_AT_data_member_location" 1 } }
+// { dg-final { scan-assembler-not "-1\[^0-9x\\r\\n\]* DW_AT_data_member_location" } }
+
+struct E {};
+struct S
+{
+  [[no_unique_address]] E e, f, g;
+} s;

Jakub

Fix modref_tree::remap_params

2021-11-10 Thread Jan Hubicka via Gcc-patches

Hi,
this patch fixes a wrong comparison in remap_params, which triggers wrong
code with my follow-up patch.  This needs backporting to gcc11 as well,
which I plan to do tomorrow.

Bootstrapped/regtested x86_64-linux, committed.

gcc/ChangeLog:

* ipa-modref-tree.h (modref_tree::remap_params): Fix off-by-one error.

diff --git a/gcc/ipa-modref-tree.h b/gcc/ipa-modref-tree.h
index be5efcbb68f..3e213b23d79 100644
--- a/gcc/ipa-modref-tree.h
+++ b/gcc/ipa-modref-tree.h
@@ -1139,7 +1139,7 @@ struct GTY((user)) modref_tree
size_t k;
modref_access_node *access_node;
FOR_EACH_VEC_SAFE_ELT (ref_node->accesses, k, access_node)
- if (access_node->parm_index > 0)
+ if (access_node->parm_index >= 0)
{
  if (access_node->parm_index < (int)map->length ())
access_node->parm_index = (*map)[access_node->parm_index];
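
The effect of the off-by-one can be seen in a small stand-alone sketch. It is not the GCC code: remap_indices and kUnknownParm are hypothetical names, and mapping an out-of-range index to an "unknown" marker of -1 is an assumption made for illustration.

```cpp
#include <cstddef>
#include <vector>

// Assumed marker for "parameter not known"; negative so it is never remapped.
constexpr int kUnknownParm = -1;

// Remap every non-negative parameter index through 'map'.
// With the old test 'idx > 0', parameter 0 would be skipped and keep its
// stale index; 'idx >= 0' remaps it like every other parameter.
inline void remap_indices(std::vector<int>& indices,
                          const std::vector<int>& map) {
    for (int& idx : indices) {
        if (idx >= 0) {
            if (static_cast<std::size_t>(idx) < map.size()) {
                idx = map[static_cast<std::size_t>(idx)];
            } else {
                idx = kUnknownParm;  // assumption: no mapping available
            }
        }
    }
}
```

With a map of {2, 0}, an access through parameter 0 must end up referring to parameter 2; under the old comparison it would have stayed at 0.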

Re: [PATCH] dwarf2out: Fix up field_byte_offset [PR101378]

2021-11-10 Thread Eric Botcazou via Gcc-patches

> I think Ada has bitfields of aggregate types, so CCing Eric, though
> I'd hope it doesn't have bitfields where type size is smaller than
> field decl size like C++ has.

Assuming your sentence is written in the right sense :-) then, no, in Ada
bit-fields always have DECL_SIZE (bf) <= TYPE_SIZE (TREE_TYPE (bf)).

-- 
Eric Botcazou

Re: [PATH][_GLIBCXX_DEBUG] Fix unordered container merge

2021-11-10 Thread Jonathan Wakely via Gcc-patches

On Wed, 10 Nov 2021 at 05:47, François Dumont  wrote:

> On 09/11/21 5:25 pm, Jonathan Wakely wrote:
>
>
>
> On Mon, 8 Nov 2021 at 21:36, François Dumont  wrote:
>
>> Yet another version this time with only 1 guard implementation. The
>> predicate to invalidate the safe iterators has been externalized.
>>
>> Ok to commit ?
>>
>
> I like this version a lot - thanks for persisting with it.
>
> OK to commit, thanks.
>
>
> As an aside ...
>
> --- a/libstdc++-v3/testsuite/util/testsuite_abi.h
> +++ b/libstdc++-v3/testsuite/util/testsuite_abi.h
> @@ -24,7 +24,11 @@
>  #include 
>  #if __cplusplus >= 201103L
>  # include 
> +# ifdef _GLIBCXX_DEBUG
> +namespace unord = std::_GLIBCXX_STD_C;
> +# else
>  namespace unord = std;
> +# endif
>  #else
>  # include 
>  namespace unord = std::tr1;
>
>
> Several times I've been annoyed by the fact that we don't have a way to
> refer to std::_GLIBCXX_STD_C::vector etc. that is always valid, in normal
> mode and debug mode.
>
> Maybe we should add:
>
> namespace std { namespace _GLIBCXX_STD_C = ::std; }
>
> That way we can refer to std::_GLIBCXX_STD_C::foo in normal mode, and it
> will mean the same thing as in debug mode. So we don't need to use #if
> conditions like this.
>
>
> Good idea, I'll prepare it.
>

Alternatively we could do this:

namespace std
{
namespace __cxx1998 { }
#ifdef _GLIBCXX_DEBUG
namespace __cont = __cxx1998;
#else
namespace __cont = ::std;
#endif
}

And then define this so it's always the same name:
#define _GLIBCXX_STD_C __cont

Then we can refer to std::_GLIBCXX_STD_C::vector in any context, and it
refers to the right thing. And we could also stop using the SHOUTING macro,
and just refer to std::__cont::vector instead.
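
The alias idea can be tried outside libstdc++ with a toy library. In this sketch every name (mylib, impl, plain, vector_like, MYLIB_DEBUG) is hypothetical and stands in for std, __cxx1998, __cont, a container and _GLIBCXX_DEBUG respectively; it only demonstrates that one spelling can name the unchecked container in both modes.

```cpp
namespace mylib {

namespace impl {}  // always declared, so the alias below is always valid

#ifdef MYLIB_DEBUG
namespace impl {
// Debug mode: the unchecked container lives in the nested namespace.
struct vector_like {
    static constexpr bool checked = false;
};
}  // namespace impl

// Debug mode: the name users see is a checking wrapper.
struct vector_like {
    static constexpr bool checked = true;
};

namespace plain = impl;
#else
// Normal mode: the unchecked container is the one users see.
struct vector_like {
    static constexpr bool checked = false;
};

namespace plain = ::mylib;
#endif

}  // namespace mylib

// The same spelling works in both modes and always names the unchecked type.
using plain_vector = mylib::plain::vector_like;
```

Code that needs the unchecked container writes mylib::plain::vector_like once and needs no #if of its own, which is the property wanted for the testsuite header quoted above.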

We could also make this work as std::__cxx1998::vector, but maybe we should
move away from the "1998" name, because it doesn't make much sense for
forward_list and unordered_map which are not in C++98.

[PATCH] gimple-fold: Smarter optimization of _chk variants

2021-11-10 Thread Siddhesh Poyarekar

Instead of comparing LEN and SIZE only if they are constants, use their
ranges to decide if LEN will always be lower than or same as SIZE.
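
The range comparison boils down to checking the upper bound of LEN against the lower bound of SIZE. A minimal sketch with plain unsigned integers follows; URange and known_safe_sketch are hypothetical names, and the real patch works on wide_int ranges obtained via get_range rather than on this struct.

```cpp
#include <cstdint>

// Inclusive unsigned range [min, max], standing in for a value range.
struct URange {
    std::uint64_t min;
    std::uint64_t max;
};

// LEN is known to fit when even its largest possible value does not
// exceed the smallest possible value of SIZE.
inline bool known_safe_sketch(const URange& len, const URange& size) {
    return len.max <= size.min;
}
```

For example, a LEN in [0, 8] is safe against a SIZE in [8, 16], while a LEN in [0, 9] is not, even though both ranges are non-constant.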

This change ends up putting the stringop-overflow warning line number
against the strcpy implementation, so adjust the warning check to be
line number agnostic.

gcc/ChangeLog:

* gimple-fold.c (known_safe): New function.
(gimple_fold_builtin_memory_chk, gimple_fold_builtin_stxcpy_chk,
gimple_fold_builtin_stxncpy_chk,
gimple_fold_builtin_snprintf_chk,
gimple_fold_builtin_sprintf_chk): Use it.

gcc/testsuite/ChangeLog:

* gcc.dg/Wobjsize-1.c: Make warning change line agnostic.
* gcc.dg/builtin-chk-fold.c: New test.

Signed-off-by: Siddhesh Poyarekar 
---
Testing:

- gcc.dg shows no new regressions on x86_64, a full bootstrap and test
  run are in progress.

 gcc/gimple-fold.c   | 186 +---
 gcc/testsuite/gcc.dg/Wobjsize-1.c   |   5 +-
 gcc/testsuite/gcc.dg/builtin-chk-fold.c |  21 +++
 3 files changed, 94 insertions(+), 118 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/builtin-chk-fold.c

diff --git a/gcc/gimple-fold.c b/gcc/gimple-fold.c
index 6e25a7c05db..7399b49d00f 100644
--- a/gcc/gimple-fold.c
+++ b/gcc/gimple-fold.c
@@ -2987,6 +2987,25 @@ gimple_fold_builtin_fputs (gimple_stmt_iterator *gsi,
   return false;
 }
 
+/* Return true if LEN is known to be less than or equal to SIZE at compile time
+   and false otherwise.  Emit a warning if LEN is known to be greater than SIZE
+   at compile time.  */
+
+static bool
+known_safe (gimple *stmt, tree len, tree size)
+{
+  if (len == NULL_TREE)
+return false;
+
+  wide_int size_range[2];
+  wide_int len_range[2];
+  if (get_range (len, stmt, len_range) && get_range (size, stmt, size_range)
+  && wi::leu_p (len_range[1], size_range[0]))
+return true;
+
+  return false;
+}
+
 /* Fold a call to the __mem{cpy,pcpy,move,set}_chk builtin.
DEST, SRC, LEN, and SIZE are the arguments to the call.
IGNORE is true, if return value can be ignored.  FCODE is the BUILT_IN_*
@@ -3024,39 +3043,24 @@ gimple_fold_builtin_memory_chk (gimple_stmt_iterator *gsi,
}
 }
 
-  if (! tree_fits_uhwi_p (size))
-return false;
-
   tree maxlen = get_maxval_strlen (len, SRK_INT_VALUE);
-  if (! integer_all_onesp (size))
+  if (! integer_all_onesp (size)
+  && !known_safe (stmt, len, size) && !known_safe (stmt, maxlen, size))
 {
-  if (! tree_fits_uhwi_p (len))
+  /* MAXLEN and LEN both cannot be proved to be less than SIZE, at
+least try to optimize (void) __mempcpy_chk () into
+(void) __memcpy_chk () */
+  if (fcode == BUILT_IN_MEMPCPY_CHK && ignore)
{
- /* If LEN is not constant, try MAXLEN too.
-For MAXLEN only allow optimizing into non-_ocs function
-if SIZE is >= MAXLEN, never convert to __ocs_fail ().  */
- if (maxlen == NULL_TREE || ! tree_fits_uhwi_p (maxlen))
-   {
- if (fcode == BUILT_IN_MEMPCPY_CHK && ignore)
-   {
- /* (void) __mempcpy_chk () can be optimized into
-(void) __memcpy_chk ().  */
- fn = builtin_decl_explicit (BUILT_IN_MEMCPY_CHK);
- if (!fn)
-   return false;
+ fn = builtin_decl_explicit (BUILT_IN_MEMCPY_CHK);
+ if (!fn)
+   return false;
 
- gimple *repl = gimple_build_call (fn, 4, dest, src, len, 
size);
- replace_call_with_call_and_fold (gsi, repl);
- return true;
-   }
- return false;
-   }
+ gimple *repl = gimple_build_call (fn, 4, dest, src, len, size);
+ replace_call_with_call_and_fold (gsi, repl);
+ return true;
}
-  else
-   maxlen = len;
-
-  if (tree_int_cst_lt (size, maxlen))
-   return false;
+  return false;
 }
 
   fn = NULL_TREE;
@@ -3126,61 +3130,47 @@ gimple_fold_builtin_stxcpy_chk (gimple_stmt_iterator 
*gsi,
   return true;
 }
 
-  if (! tree_fits_uhwi_p (size))
-return false;
-
   tree maxlen = get_maxval_strlen (src, SRK_STRLENMAX);
   if (! integer_all_onesp (size))
 {
   len = c_strlen (src, 1);
-  if (! len || ! tree_fits_uhwi_p (len))
+  if (!known_safe (stmt, len, size) && !known_safe (stmt, maxlen, size))
{
- /* If LEN is not constant, try MAXLEN too.
-For MAXLEN only allow optimizing into non-_ocs function
-if SIZE is >= MAXLEN, never convert to __ocs_fail ().  */
- if (maxlen == NULL_TREE || ! tree_fits_uhwi_p (maxlen))
+ if (fcode == BUILT_IN_STPCPY_CHK)
{
- if (fcode == BUILT_IN_STPCPY_CHK)
-   {
- if (! ignore)
-   return false;
-
- /* If return value of __stpcpy_chk is ignored,
-optimize into __strcpy_chk.  */
-

Re: [PATCH] rs6000/doc: Rename future cpu with power10

2021-11-10 Thread Kewen.Lin via Gcc-patches

Hi Segher,

on 2021/11/10 下午4:52, Segher Boessenkool wrote:
> Hi!
> 
> On Wed, Nov 10, 2021 at 01:41:25PM +0800, Kewen.Lin wrote:
> >> Commit 5d9d0c94588 renamed future to power10 and ace60939fd2
>> updated the documentation for "future" renaming.  This patch
>> is to rename the remaining "future architecture" references in
>> documentation.
> 
> Good find :-)
> 
>> @@ -28613,7 +28613,7 @@ the offset with a symbol reference to a canary in 
>> the TLS block.
>>  @opindex mpcrel
>>  @opindex mno-pcrel
>>  Generate (do not generate) pc-relative addressing when the option
>> -@option{-mcpu=future} is used.  The @option{-mpcrel} option requires
>> +@option{-mcpu=power10} is used.  The @option{-mpcrel} option requires
>>  that the medium code model (@option{-mcmodel=medium}) and prefixed
>>  addressing (@option{-mprefixed}) options are enabled.
> 
> It still sounds strange, and factually incorrect really: the -mpcrel
> option says to use pc-relative processing, no matter if -mcpu=power10 is
> used or not.  For example, it will work fine with later CPUs as well.
> 

Good point!  The comment also applies to mma, prefixed and float128.

> So maybe this should just delete from after "addressing" to the end of
> that line?  It already says what the prerequisites are, on the very next
> line :-)
> 

Thanks for the suggestion.  The updated version is inlined below.
I'm not sure whether the update for float128 looks good enough to you.

Could you please have a look again?

BR,
Kewen
-
gcc/ChangeLog:

* doc/invoke.texi: Change references to "future cpu" to "power10",
"-mcpu=future" to "-mcpu=power10".  Adjust with "later cpu_type".

diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 2ea23d07c4c..aa0a20924bf 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -27597,7 +27597,7 @@ Supported values for @var{cpu_type} are @samp{401}, 
@samp{403},
 @samp{e6500}, @samp{ec603e}, @samp{G3}, @samp{G4}, @samp{G5},
 @samp{titan}, @samp{power3}, @samp{power4}, @samp{power5}, @samp{power5+},
 @samp{power6}, @samp{power6x}, @samp{power7}, @samp{power8},
-@samp{power9}, @samp{future}, @samp{powerpc}, @samp{powerpc64},
+@samp{power9}, @samp{power10}, @samp{powerpc}, @samp{powerpc64},
 @samp{powerpc64le}, @samp{rs64}, and @samp{native}.

 @option{-mcpu=powerpc}, @option{-mcpu=powerpc64}, and
@@ -27779,10 +27779,10 @@ Enable/disable the @var{__float128} keyword for IEEE 
128-bit floating point
 and use either software emulation for IEEE 128-bit floating point or
 hardware instructions.

-The VSX instruction set (@option{-mvsx}, @option{-mcpu=power7},
-@option{-mcpu=power8}), or @option{-mcpu=power9} must be enabled to
-use the IEEE 128-bit floating point support.  The IEEE 128-bit
-floating point support only works on PowerPC Linux systems.
+The VSX instruction set (@option{-mvsx}, @option{-mcpu=power7} (or later
+@var{cpu_type})) must be enabled to use the IEEE 128-bit floating point
+support.  The IEEE 128-bit floating point support only works on PowerPC
+Linux systems.

 The default for @option{-mfloat128} is enabled on PowerPC Linux
 systems using the VSX instruction set, and disabled on other systems.
@@ -28612,24 +28612,25 @@ the offset with a symbol reference to a canary in the 
TLS block.
 @itemx -mno-pcrel
 @opindex mpcrel
 @opindex mno-pcrel
-Generate (do not generate) pc-relative addressing when the option
-@option{-mcpu=future} is used.  The @option{-mpcrel} option requires
-that the medium code model (@option{-mcmodel=medium}) and prefixed
-addressing (@option{-mprefixed}) options are enabled.
+Generate (do not generate) pc-relative addressing.  The @option{-mpcrel}
+option requires that the medium code model (@option{-mcmodel=medium})
+and prefixed addressing (@option{-mprefixed}) options are enabled.

 @item -mprefixed
 @itemx -mno-prefixed
 @opindex mprefixed
 @opindex mno-prefixed
 Generate (do not generate) addressing modes using prefixed load and
-store instructions when the option @option{-mcpu=future} is used.
+store instructions.  The @option{-mprefixed} option requires that
+the option @option{-mcpu=power10} (or later @var{cpu_type}) is enabled.

 @item -mmma
 @itemx -mno-mma
 @opindex mmma
 @opindex mno-mma
-Generate (do not generate) the MMA instructions when the option
-@option{-mcpu=future} is used.
+Generate (do not generate) the MMA instructions.  The @option{-mma}
+option requires that the option @option{-mcpu=power10} (or later
+@var{cpu_type}) is enabled.

 @item -mrop-protect
 @itemx -mno-rop-protect

Re: [PATCH v2] tree-optimization/101186 - extend FRE with "equivalence map" for condition prediction

2021-11-10 Thread Richard Biener via Gcc-patches

On Sun, Oct 24, 2021 at 9:03 PM Di Zhao OS
 wrote:
>
> Hi,
>
> Attached is a new version of the patch, mainly for improving performance
> and simplifying the code.

The patch doesn't apply anymore, can you update it please?

I see the new ssa-fre-101.c test already passing without the patch.
Likewise ssa-fre-100.c and ssa-fre-102.c would PASS if you scan
the pass dump after fre1 which is evrp so it seems that evrp already
handles the equivalences (likely with the relation oracle) now?
I'm sure there are second order effects when eliminating conditions
in FRE but did you re-evaluate what made you improve VN to see
if the cases are handled as expected now without this change?

I will still look at and consider the change btw, but given the EVRP
improvements I'm also considering removing the predication
support from VN altogether.  At least in the non-iterating mode
it should be trivially easy to use ranger's relation oracle to simplify
predicates.  For the iterating mode it might not be 100% effective
since I'm not sure we can make it use the current SSA values and
how it would behave with those eventually changing to worse.

Andrew, how would one ask the relation oracle to simplify a
condition?  Do I have to do any bookkeeping to register
predicates on edges for it?

Thanks,
Richard.

> First, regarding the comments:
>
> > -Original Message-
> > From: Richard Biener 
> > Sent: Friday, October 1, 2021 9:00 PM
> > To: Di Zhao OS 
> > Cc: gcc-patches@gcc.gnu.org
> > Subject: Re: [PATCH v2] tree-optimization/101186 - extend FRE with
> > "equivalence map" for condition prediction
> >
> > On Thu, Sep 16, 2021 at 8:13 PM Di Zhao OS
> >  wrote:
> > >
> > > Sorry about updating on this after so long. It took me much time to
> > > work out a new plan and pass the tests.
> > >
> > > The new idea is to use one variable to represent a set of equal
> > > variables at some basic-block. This variable is called a "equivalence
> > > head" or "equiv-head" in the code. (There's no-longer a "equivalence
> > > map".)
> > >
> > > - Initially an SSA_NAME's "equivalence head" is its value number.
> > >   Temporary equivalence heads are recorded as unary NOP_EXPR results in
> > >   the vn_nary_op_t map. Besides, when inserting into vn_nary_op_t map,
> > >   make the new result at front of the vn_pval list, so that when
> > >   searching for a variable's equivalence head, the first result
> > >   represents the largest equivalence set at current location.
> > > - In vn_ssa_aux_t, maintain a list of references to valid_info->nary
> > >   entry. For recorded equivalences, the reference is result->entry; for
> > >   normal N-ary operations, the reference is operand->entry.
> > > - When recording equivalences, if one side A is constant or has more
> > >   refs, make it the new equivalence head of the other side B. Traverse
> > >   B's ref-list, if a variable C's previous equiv-head is B, update to A.
> > >   And re-insert B's n-ary operations by replacing B with A.
> > > - When inserting and looking for the results of n-ary operations, insert
> > >   and lookup by the operands' equiv-heads.
> > > ...
> > >
> > > Thanks,
> > > Di Zhao
> > >
> > > 
> > > Extend FRE with temporary equivalences.
> >
> > Comments on the patch:
> >
> > +  /* nary_ref count.  */
> > +  unsigned num_nary_ref;
> > +
> >
> > I think a unsigned short should be enough and that would nicely
> > pack after value_id together with the bitfield (maybe change that
> > to unsigned short :1 then).
>
> Changed num_nary_ref to unsigned short and moved after value_id.
>
> > @@ -7307,17 +7839,23 @@ process_bb (rpo_elim &avail, basic_block bb,
> > tree val = gimple_simplify (gimple_cond_code (last),
> > boolean_type_node, lhs, rhs,
> > NULL, vn_valueize);
> > +   vn_nary_op_t vnresult = NULL;
> > /* If the condition didn't simplfy see if we have recorded
> >an expression from sofar taken edges.  */
> > if (! val || TREE_CODE (val) != INTEGER_CST)
> >   {
> > -   vn_nary_op_t vnresult;
> >
> > looks like you don't need vnresult outside of the if()?
>
> vnresult is reused later to record equivalences generated by PHI nodes.
>
> > +/* Find predicated value of vn_nary_op by the operands' equivalences.  Return
> > + * NULL_TREE if no known result is found.  */
> > +
> > +static tree
> > +find_predicated_value_by_equivs (vn_nary_op_t vno, basic_block bb,
> > +vn_nary_op_t *vnresult)
> > +{
> > +  lookup_equiv_heads (vno->length, vno->op, vno->op, bb);
> > +  tree result
> > += simplify_nary_op (vno->length, vno->opcode, vno->op, vno->type);
> >
> > why is it necessary to simplify here?  It looks like the caller
> > already does this.
>
> In the new patch, changed the code a little to

Re: [PATCH] dwarf2out: Fix up field_byte_offset [PR101378]

2021-11-10 Thread Richard Biener via Gcc-patches

On Wed, 10 Nov 2021, Jakub Jelinek wrote:

> Hi!
> 
> For PCC_BITFIELD_TYPE_MATTERS field_byte_offset has quite large code
> to deal with it since many years ago (see it e.g. in GCC 3.2, although it
> used to be on HOST_WIDE_INTs, then on double_ints, now on offset_ints).
> But that code apparently isn't able to cope with members with empty class
> types with [[no_unique_address]] attribute, because the empty classes have
> non-zero type size but zero decl size and so one can end up from the
> computation with negative offset or offset 1 byte smaller than it should be.
> For !PCC_BITFIELD_TYPE_MATTERS, we just use
> tree_result = byte_position (decl);
> which seems exactly right even for the empty classes or anything which is
> not a bitfield (and for which we don't add DW_AT_bit_offset attribute).
> So, instead of trying to handle those no_unique_address members in the
> current already very complicated code, this limits it to bitfields.
> 
> stor-layout.c PCC_BITFIELD_TYPE_MATTERS handling also affects only
> bitfields, twice it checks DECL_BIT_FIELD and once DECL_BIT_FIELD_TYPE.
> 
> The only thing I'm unsure about is whether DECL_BIT_FIELD or
> DECL_BIT_FIELD_TYPE should be tested.  I thought it
> doesn't matter, but it seems stor-layout.c in some cases clears
> DECL_BIT_FIELD if their TYPE_MODE can express the type exactly, and
> dwarf2out.c (gen_field_die) uses
>   if (DECL_BIT_FIELD_TYPE (decl))
> to decide if DW_AT_bit_offset etc. attributes should be added.
> So maybe I should go with && DECL_BIT_FIELD_TYPE (decl) instead.

You need DECL_BIT_FIELD_TYPE if you want to know whether it is
a bitfield.  DECL_BIT_FIELD can only be used to test whether the
size is not a multiple of BITS_PER_UNIT.

So the question is whether the code makes a difference if
the bitfield is int a : 8; int b : 8; int c : 16; for example.
If so then DECL_BIT_FIELD_TYPE is needed, otherwise what you
test probably doesn't matter.
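
As a source-level illustration of the layout in question, here is a
minimal C++ sketch (the names bits, make_bits and check_bits are made up
for this example).  Every member is declared as a bit-field, yet each one
starts on a byte boundary and spans a whole number of bytes.  That is
presumably the kind of field for which DECL_BIT_FIELD may end up clear
while DECL_BIT_FIELD_TYPE stays set; that last part is an assumption based
on the description above, not something the sketch checks.

```cpp
#include <cassert>

// The bit-field layout from the question above.  All three members are
// declared as bit-fields, but each has a width that is a multiple of 8
// and begins at a bit position that is a multiple of 8.
struct bits
{
  int a : 8;
  int b : 8;
  int c : 16;
};

// Fill the members with small positive values that fit in the signed
// widths declared above.
static bits
make_bits ()
{
  bits s = {};
  s.a = 0x11;
  s.b = 0x22;
  s.c = 0x3333;
  return s;
}

// The stored values read back unchanged, as for ordinary members.
static void
check_bits ()
{
  bits s = make_bits ();
  assert (s.a == 0x11);
  assert (s.b == 0x22);
  assert (s.c == 0x3333);
}
```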

> On
> struct S { int e; int a : 1, b : 7, c : 8, d : 16; } s;
> struct T { int a : 1, b : 7; long long c : 8; int d : 16; } t;
> it doesn't make a difference though on x86_64, ppc64le nor ppc64...
> 
> I think Ada has bitfields of aggregate types, so CCing Eric, though
> I'd hope it doesn't have bitfields where type size is smaller than
> field decl size like C++ has.
> 
> Bootstrapped/regtested on x86_64-linux, i686-linux, powerpc64le-linux
> and powerpc64-linux and Pedro has tested it on GDB testsuite.
> 
> I can bootstrap/regtest the
> +  && DECL_BIT_FIELD_TYPE (decl)
> version too.
> 
> 2021-11-10  Jakub Jelinek  
> 
>   PR debug/101378
>   * dwarf2out.c (field_byte_offset): Do the PCC_BITFIELD_TYPE_MATTERS
>   handling only for DECL_BIT_FIELD decls.
> 
>   * g++.dg/debug/dwarf2/pr101378.C: New test.
> 
> --- gcc/dwarf2out.c.jj 2021-11-05 10:19:46.339457342 +0100
> +++ gcc/dwarf2out.c   2021-11-09 15:01:51.425437717 +0100
> @@ -19646,6 +19646,7 @@ field_byte_offset (const_tree decl, stru
>   properly dynamic byte offsets only when PCC bitfield type doesn't
>   matter.  */
>if (PCC_BITFIELD_TYPE_MATTERS
> +  && DECL_BIT_FIELD (decl)
>&& TREE_CODE (DECL_FIELD_OFFSET (decl)) == INTEGER_CST)

What's more interesting is the INTEGER_CST restriction - I'm sure
that Ada allows bitfields to follow other fields at variable positions.
Even C does:

void foo (int n)
{
  struct S { int a[n]; int b : 5;  int c : 3; } s;
}

runs into the code above and ends up not honoring 
PCC_BITFIELD_TYPE_MATTERS ...

Richard.

>  {
>offset_int object_offset_in_bits;
> --- gcc/testsuite/g++.dg/debug/dwarf2/pr101378.C.jj   2021-11-09 
> 15:17:39.504975396 +0100
> +++ gcc/testsuite/g++.dg/debug/dwarf2/pr101378.C  2021-11-09 
> 15:17:28.067137556 +0100
> @@ -0,0 +1,13 @@
> +// PR debug/101378
> +// { dg-do compile { target c++11 } }
> +// { dg-options "-gdwarf-5 -dA" }
> +// { dg-final { scan-assembler-times "0\[^0-9x\\r\\n\]* 
> DW_AT_data_member_location" 1 } }
> +// { dg-final { scan-assembler-times "1\[^0-9x\\r\\n\]* 
> DW_AT_data_member_location" 1 } }
> +// { dg-final { scan-assembler-times "2\[^0-9x\\r\\n\]* 
> DW_AT_data_member_location" 1 } }
> +// { dg-final { scan-assembler-not "-1\[^0-9x\\r\\n\]* 
> DW_AT_data_member_location" } }
> +
> +struct E {};
> +struct S
> +{
> +  [[no_unique_address]] E e, f, g;
> +} s;
> 
>   Jakub
> 
> 

-- 
Richard Biener 
SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
Germany; GF: Ivo Totev; HRB 36809 (AG Nuernberg)

Re: [PATCH] rs6000/doc: Rename future cpu with power10

2021-11-10 Thread Segher Boessenkool

Hi!

On Wed, Nov 10, 2021 at 05:39:27PM +0800, Kewen.Lin wrote:
> @@ -27779,10 +27779,10 @@ Enable/disable the @var{__float128} keyword for 
> IEEE 128-bit floating point
>  and use either software emulation for IEEE 128-bit floating point or
>  hardware instructions.
> 
> -The VSX instruction set (@option{-mvsx}, @option{-mcpu=power7},
> -@option{-mcpu=power8}), or @option{-mcpu=power9} must be enabled to
> -use the IEEE 128-bit floating point support.  The IEEE 128-bit
> -floating point support only works on PowerPC Linux systems.
> +The VSX instruction set (@option{-mvsx}, @option{-mcpu=power7} (or later
> +@var{cpu_type})) must be enabled to use the IEEE 128-bit floating point
> +support.  The IEEE 128-bit floating point support only works on PowerPC
> +Linux systems.

I'd just say -mvsx.  This is default on for -mcpu=power7 and later, and
cannot be enabled elsewhere, but that is beside the point.

If you say more than the essentials here it becomes harder to read
(simply because there is more to read then), harder to find what you
are looking for, and harder to keep it updated if things change (like
what this patch is for :-) )

The part about "works only on Linux" isn't quite true.  "Is only
supported on Linux" is a bit better.

>  Generate (do not generate) addressing modes using prefixed load and
> -store instructions when the option @option{-mcpu=future} is used.
> +store instructions.  The @option{-mprefixed} option requires that
> +the option @option{-mcpu=power10} (or later @var{cpu_type}) is enabled.

Just "or later" please.  The "CPU_TYPE" thing is local to the -mcpu=
description, let's not refer to it from elsewhere.

>  @item -mmma
>  @itemx -mno-mma
>  @opindex mmma
>  @opindex mno-mma
> -Generate (do not generate) the MMA instructions when the option
> -@option{-mcpu=future} is used.
> +Generate (do not generate) the MMA instructions.  The @option{-mma}
> +option requires that the option @option{-mcpu=power10} (or later
> +@var{cpu_type}) is enabled.

(once more)

Okay for trunk with those changes.  Thanks!


Segher

[PATCH] testsuite/102690 - XFAIL g++.dg/warn/Warray-bounds-16.C

2021-11-10 Thread Richard Biener via Gcc-patches

This XFAILs the bogus diagnostic test and rectifies the expectation
on the optimization.

Tested on x86_64-unknown-linux-gnu, pushed.

2021-11-10  Richard Biener  

PR testsuite/102690
* g++.dg/warn/Warray-bounds-16.C: XFAIL diagnostic part
and optimization.
---
 gcc/testsuite/g++.dg/warn/Warray-bounds-16.C | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/gcc/testsuite/g++.dg/warn/Warray-bounds-16.C 
b/gcc/testsuite/g++.dg/warn/Warray-bounds-16.C
index 17b4d0d194e..89cbadb91c7 100644
--- a/gcc/testsuite/g++.dg/warn/Warray-bounds-16.C
+++ b/gcc/testsuite/g++.dg/warn/Warray-bounds-16.C
@@ -19,11 +19,11 @@ struct S
 p = (int*) new unsigned char [sizeof (int) * m];
 
 for (int i = 0; i < m; i++)
-  new (p + i) int ();
+  new (p + i) int (); /* { dg-bogus "bounds" "pr102690" { xfail *-*-* } } 
*/
   }
 };
 
 S a (0);
 
-/* Verify the loop has been eliminated.
-   { dg-final { scan-tree-dump-not "goto" "optimized" } } */
+/* The loop cannot be eliminated since the global 'new' can change 'm'.  */
+/* { dg-final { scan-tree-dump-not "goto" "optimized" { xfail *-*-* } } } */
-- 
2.31.1

[wwwdocs, patch] gcc-12/changes.html: Update OpenMP status

2021-11-10 Thread Tobias Burnus


Cumulative update of the OpenMP 5.x changes in GCC 12.

I hope it covers all essential changes. Of course,
some others could be added, like 'omp target in_reduction',
which was missing before (an oversight), and possibly other
things that I have missed.

For the last bullet: the implementation-status documentation
is new – but even if it were not, linking to it makes sense.
I kept adding and removing a 'full' after the "The"; it sounds
better, but while the list in libgomp.texi is extensive, it does
not have "full" coverage of all changes.
Additionally, as it is new, a "now" could be added after
"can".

Suggestions, additions, wording changes?

Tobias
-
Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 
München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas 
Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht 
München, HRB 106955
gcc-12/changes.html: Update OpenMP status

diff --git a/htdocs/gcc-12/changes.html b/htdocs/gcc-12/changes.html
index 81f62fe3..bbb8f2ac 100644
--- a/htdocs/gcc-12/changes.html
+++ b/htdocs/gcc-12/changes.html
@@ -70,20 +70,36 @@ a work-in-progress.
 New Languages and Language specific improvements
 
 
-  OpenMP 5.0 support has been extended: The close map modifier
+  OpenMP
+  
+OpenMP 5.0 support has been extended: The close map modifier
   and the affinity clause are now supported and for Fortran
   additionally the following features which were available in C and C++
-  before:  depobj, mutexinoutset and
-   iterator can now also be used with the depend
-  clause, defaultmap has been updated for OpenMP 5.0, and the
-  loop directive and combined directives
-  involving master directive have been added. Additionally,
-  the following OpenMP 5.1 feature have been added: support for expressing
+  before: declare variant is now available,
+  depobj, mutexinoutset and iterator
+  can now also be used with the depend clause,
+  defaultmap has been updated for OpenMP 5.0, and the
+  loop directive and combined directives involving
+  master directive have been added.
+The following OpenMP 5.1 feature have been added: support for expressing
   OpenMP directives as C++ 11 attributes, the masked and
   scope construct, the nothing and
   error directives, and using primary with the
   proc_bind clause and OMP_PROC_BIND environment
-  variable.
+  variable, the reproducible and unconstrained
+  modifiers to the order clause, and, for C/C++ only, the
+  align- and allocate-modifiers to the allocate clause and
+  the atomic extensions are now available. The
+  OMP_PLACE environment variable supports the OpenMP 5.1
+  features and the OMP_NUM_TEAMS and
+  OMP_TEAMS_THREAD_LIMIT environement variables and their
+  associated API routines are now supported as well as the memory-allocation
+  routines added for Fortran and extended for C/C++ in OpenMP 5.1. In
+  Fortran code, strictly-structured blocks can be used.
+The https://gcc.gnu.org/onlinedocs/libgomp/OpenMP-Implementation-Status.html";
+  >OpenMP Implementation Status can be found in the libgomp manual.
+  
   
   The new warning flag -Wopenacc-parallelism was added for
   OpenACC. It warns about potentially suboptimal choices related to

[wwwdocs] projects/gomp/: Add OpenMP 5.2

2021-11-10 Thread Tobias Burnus


This is rather obvious, and unless there are comments, I intend to
commit it as obvious later today.

I think that's the only place to be updated on the webserver.

Tobias
-
projects/gomp/: Add OpenMP 5.2

diff --git a/htdocs/projects/gomp/index.html b/htdocs/projects/gomp/index.html
index ff14b2b2..59697c10 100644
--- a/htdocs/projects/gomp/index.html
+++ b/htdocs/projects/gomp/index.html
@@ -69,6 +69,10 @@ available.
 
 Status
 
+November 9, 2021
+https://www.openmp.org/wp-content/uploads/OpenMP-API-Specification-5-2.pdf";>OpenMP
+Version 5.2 has been released.
+
 July 15, 2021
 https://www.openmp.org/wp-content/uploads/openmp-TR10.pdf";>OpenMP
 Technical Report 10: Version 5.2 Public Comment Draft has been released.

[PATCH v2] gimple-fold: Smarter optimization of _chk variants

2021-11-10 Thread Siddhesh Poyarekar

Instead of comparing LEN and SIZE only if they are constants, use their
ranges to decide whether LEN will always be less than or equal to SIZE.

This change ends up putting the stringop-overflow warning line number
against the strcpy implementation, so adjust the warning check to be
line number agnostic.
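
The range test can be modelled outside the compiler as follows.  This is
only a sketch, assuming that both LEN and SIZE are described by unsigned
[min, max] ranges; urange and model_known_safe are hypothetical names for
illustration, not GCC APIs.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the [min, max] range of an unsigned value.
struct urange
{
  std::uint64_t min;
  std::uint64_t max;
};

// Model of the check: the call is provably safe only if the largest
// possible LEN does not exceed the smallest possible SIZE.
static bool
model_known_safe (const urange &len, const urange &size)
{
  // A range with min > max would be malformed input for this model.
  assert (len.min <= len.max && size.min <= size.max);
  return len.max <= size.min;
}
```

With LEN in [0, 8] and SIZE in [8, 16] the model reports the call as
safe; with LEN in [0, 9] it does not, because LEN could be 9 while SIZE
is only 8.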

gcc/ChangeLog:

* gimple-fold.c (known_safe): New function.
(gimple_fold_builtin_memory_chk, gimple_fold_builtin_stxcpy_chk,
gimple_fold_builtin_stxncpy_chk,
gimple_fold_builtin_snprintf_chk,
gimple_fold_builtin_sprintf_chk): Use it.

gcc/testsuite/ChangeLog:

* gcc.dg/Wobjsize-1.c: Make warning change line agnostic.
* gcc.dg/builtin-chk-fold.c: New test.

Signed-off-by: Siddhesh Poyarekar 
---
Changes from v1:
- Update comment that incorrectly said that known_safe emits a warning.
- Add tests for strncpy and snprintf too.

 gcc/gimple-fold.c   | 185 +---
 gcc/testsuite/gcc.dg/Wobjsize-1.c   |   5 +-
 gcc/testsuite/gcc.dg/builtin-chk-fold.c |  49 +++
 3 files changed, 121 insertions(+), 118 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/builtin-chk-fold.c

diff --git a/gcc/gimple-fold.c b/gcc/gimple-fold.c
index 6e25a7c05db..36b06218c88 100644
--- a/gcc/gimple-fold.c
+++ b/gcc/gimple-fold.c
@@ -2987,6 +2987,24 @@ gimple_fold_builtin_fputs (gimple_stmt_iterator *gsi,
   return false;
 }
 
+/* Return true if LEN is known to be less than or equal to SIZE at compile time
+   and false otherwise.  */
+
+static bool
+known_safe (gimple *stmt, tree len, tree size)
+{
+  if (len == NULL_TREE)
+return false;
+
+  wide_int size_range[2];
+  wide_int len_range[2];
+  if (get_range (len, stmt, len_range) && get_range (size, stmt, size_range)
+  && wi::leu_p (len_range[1], size_range[0]))
+return true;
+
+  return false;
+}
+
 /* Fold a call to the __mem{cpy,pcpy,move,set}_chk builtin.
DEST, SRC, LEN, and SIZE are the arguments to the call.
IGNORE is true, if return value can be ignored.  FCODE is the BUILT_IN_*
@@ -3024,39 +3042,24 @@ gimple_fold_builtin_memory_chk (gimple_stmt_iterator 
*gsi,
}
 }
 
-  if (! tree_fits_uhwi_p (size))
-return false;
-
   tree maxlen = get_maxval_strlen (len, SRK_INT_VALUE);
-  if (! integer_all_onesp (size))
+  if (! integer_all_onesp (size)
+  && !known_safe (stmt, len, size) && !known_safe (stmt, maxlen, size))
 {
-  if (! tree_fits_uhwi_p (len))
+  /* MAXLEN and LEN both cannot be proved to be less than SIZE, at
+least try to optimize (void) __mempcpy_chk () into
+(void) __memcpy_chk () */
+  if (fcode == BUILT_IN_MEMPCPY_CHK && ignore)
{
- /* If LEN is not constant, try MAXLEN too.
-For MAXLEN only allow optimizing into non-_ocs function
-if SIZE is >= MAXLEN, never convert to __ocs_fail ().  */
- if (maxlen == NULL_TREE || ! tree_fits_uhwi_p (maxlen))
-   {
- if (fcode == BUILT_IN_MEMPCPY_CHK && ignore)
-   {
- /* (void) __mempcpy_chk () can be optimized into
-(void) __memcpy_chk ().  */
- fn = builtin_decl_explicit (BUILT_IN_MEMCPY_CHK);
- if (!fn)
-   return false;
+ fn = builtin_decl_explicit (BUILT_IN_MEMCPY_CHK);
+ if (!fn)
+   return false;
 
- gimple *repl = gimple_build_call (fn, 4, dest, src, len, 
size);
- replace_call_with_call_and_fold (gsi, repl);
- return true;
-   }
- return false;
-   }
+ gimple *repl = gimple_build_call (fn, 4, dest, src, len, size);
+ replace_call_with_call_and_fold (gsi, repl);
+ return true;
}
-  else
-   maxlen = len;
-
-  if (tree_int_cst_lt (size, maxlen))
-   return false;
+  return false;
 }
 
   fn = NULL_TREE;
@@ -3126,61 +3129,47 @@ gimple_fold_builtin_stxcpy_chk (gimple_stmt_iterator 
*gsi,
   return true;
 }
 
-  if (! tree_fits_uhwi_p (size))
-return false;
-
   tree maxlen = get_maxval_strlen (src, SRK_STRLENMAX);
   if (! integer_all_onesp (size))
 {
   len = c_strlen (src, 1);
-  if (! len || ! tree_fits_uhwi_p (len))
+  if (!known_safe (stmt, len, size) && !known_safe (stmt, maxlen, size))
{
- /* If LEN is not constant, try MAXLEN too.
-For MAXLEN only allow optimizing into non-_ocs function
-if SIZE is >= MAXLEN, never convert to __ocs_fail ().  */
- if (maxlen == NULL_TREE || ! tree_fits_uhwi_p (maxlen))
+ if (fcode == BUILT_IN_STPCPY_CHK)
{
- if (fcode == BUILT_IN_STPCPY_CHK)
-   {
- if (! ignore)
-   return false;
-
- /* If return value of __stpcpy_chk is ignored,
-optimize into __strcpy_chk.  */
- fn = builtin_decl_explicit (BUILT_

[PATCH] lto-wrapper: fix memory corruption.

2021-11-10 Thread Martin Liška


Patch can bootstrap on x86_64-linux-gnu and survives regression tests.

Ready to be installed?

The first argument of merge_and_complain is actually the vector into
which we merge options, so it should be propagated to the caller properly.
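
The failure mode can be sketched with a deliberately simplified handle
type.  This assumes the vector handle is a thin wrapper around a pointer
to reallocatable storage, which is what the trace suggests; int_vec, push,
merge_by_value and merge_by_reference are hypothetical names, and unlike
the real vector this toy keeps the length in the handle itself.

```cpp
#include <cassert>
#include <cstdlib>

// Hypothetical, much simplified vector handle: it only points at heap
// storage that push may reallocate.
struct int_vec
{
  int *data;
  unsigned len;
  unsigned cap;
};

// Append X, growing (and possibly moving) the storage when it is full.
static void
push (int_vec &v, int x)
{
  if (v.len == v.cap)
    {
      v.cap = v.cap ? v.cap * 2 : 4;
      v.data = static_cast<int *> (std::realloc (v.data,
                                                 v.cap * sizeof (int)));
      assert (v.data != nullptr);
    }
  v.data[v.len++] = x;
}

// By value: only the local copy of the handle sees the new pointer and
// length.  The caller keeps the old ones, which may point at storage
// that realloc has already released.
static void
merge_by_value (int_vec v, int x)
{
  push (v, x);
}

// By reference: the caller's handle is updated together with the storage.
static void
merge_by_reference (int_vec &v, int x)
{
  push (v, x);
}
```

Calling merge_by_value on an empty handle leaves the caller's handle
empty (and leaks the new storage in this toy), while merge_by_reference
makes the pushed element visible to the caller.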

Fixes:

==6656== Invalid read of size 8
==6656==at 0x408056: merge_and_complain (lto-wrapper.c:335)
==6656==by 0x408056: find_and_merge_options(int, long, char const*, 
vec, vec*, char const*) (lto-wrapper.c:1139)
==6656==by 0x408AFC: run_gcc(unsigned int, char**) (lto-wrapper.c:1505)
==6656==by 0x4061A2: main (lto-wrapper.c:2138)
==6656==  Address 0x4e69b18 is 344 bytes inside a block of size 1,768 free'd
==6656==at 0x484339F: realloc (vg_replace_malloc.c:1192)
==6656==by 0x4993C0: xrealloc (xmalloc.c:181)
==6656==by 0x406A82: reserve (vec.h:290)
==6656==by 0x406A82: reserve (vec.h:1858)
==6656==by 0x406A82: vec::safe_push(cl_decoded_option const&) [clone .isra.0] (vec.h:1967)
==6656==by 0x4077E0: merge_and_complain (lto-wrapper.c:457)
==6656==by 0x4077E0: find_and_merge_options(int, long, char const*, 
vec, vec*, char const*) (lto-wrapper.c:1139)
==6656==by 0x408AFC: run_gcc(unsigned int, char**) (lto-wrapper.c:1505)
==6656==by 0x4061A2: main (lto-wrapper.c:2138)
==6656==  Block was alloc'd at
==6656==at 0x483E70F: malloc (vg_replace_malloc.c:380)
==6656==by 0x4993D7: xrealloc (xmalloc.c:179)
==6656==by 0x407476: reserve (vec.h:290)
==6656==by 0x407476: reserve (vec.h:1858)
==6656==by 0x407476: reserve_exact (vec.h:1878)
==6656==by 0x407476: create (vec.h:1893)
==6656==by 0x407476: get_options_from_collect_gcc_options(char const*, char 
const*) (lto-wrapper.c:163)
==6656==by 0x407674: find_and_merge_options(int, long, char const*, 
vec, vec*, char const*) (lto-wrapper.c:1132)
==6656==by 0x408AFC: run_gcc(unsigned int, char**) (lto-wrapper.c:1505)
==6656==by 0x4061A2: main (lto-wrapper.c:2138)

gcc/ChangeLog:

* lto-wrapper.c (merge_and_complain): Make the first argument
a reference type.
---
 gcc/lto-wrapper.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/lto-wrapper.c b/gcc/lto-wrapper.c
index 7b9e4883f38..54f642d7692 100644
--- a/gcc/lto-wrapper.c
+++ b/gcc/lto-wrapper.c
@@ -224,7 +224,7 @@ merge_flto_options (vec &decoded_options,
ontop of DECODED_OPTIONS.  */
 
 static void

-merge_and_complain (vec decoded_options,
+merge_and_complain (vec &decoded_options,
vec fdecoded_options,
vec decoded_cl_options)
 {
--
2.33.1

Re: [PATCH] dwarf2out: Fix up field_byte_offset [PR101378]

2021-11-10 Thread Jakub Jelinek via Gcc-patches

On Wed, Nov 10, 2021 at 10:52:42AM +0100, Richard Biener wrote:
> > The only thing I'm unsure about is whether DECL_BIT_FIELD or
> > DECL_BIT_FIELD_TYPE should be tested.  I thought it
> > doesn't matter, but it seems stor-layout.c in some cases clears
> > DECL_BIT_FIELD if their TYPE_MODE can express the type exactly, and
> > dwarf2out.c (gen_field_die) uses
> >   if (DECL_BIT_FIELD_TYPE (decl))
> > to decide if DW_AT_bit_offset etc. attributes should be added.
> > So maybe I should go with && DECL_BIT_FIELD_TYPE (decl) instead.
> 
> You need DECL_BIT_FIELD_TYPE if you want to know whether it is
> a bitfield.  DECL_BIT_FIELD can only be used to test whether the
> size is not a multiple of BITS_PER_UNIT.

Yeah, so I think DECL_BIT_FIELD_TYPE is the right test and I'll retest with
that.

> So the question is whether the code makes a difference if
> the bitfield is int a : 8; int b : 8; int c : 16; for example.
> If so then DECL_BIT_FIELD_TYPE is needed, otherwise what you
> test probably doesn't matter.

Apparently I made the mistake of trying those tests with -g -dA.
The problem is that for bitfields we then emit DW_AT_data_bit_offset
(which is already in DWARF4, but we chose to emit it only for DWARF5
as many consumers didn't handle it).
So, on
struct S { int e; int a : 1, b : 7, c : 8, d : 16; } s;
struct T { int a : 1, b : 7; long long c : 8; int d : 16; } t;

int
main ()
{
  s.c = 0x55;
  s.d = 0x;
  t.c = 0x55;
  t.d = 0x;
  s.e++;
}
the difference with the patch as posted and -gdwarf-4 -dA is:
.uleb128 0x4# (DIE (0x5f) DW_TAG_member)
.ascii "c\0"# DW_AT_name
.byte   0x1 # DW_AT_decl_file (hoho.C)
.byte   0x1 # DW_AT_decl_line
.byte   0x25# DW_AT_decl_column
.long   0x7c# DW_AT_type
.byte   0x4 # DW_AT_byte_size
.byte   0x8 # DW_AT_bit_size
-   .byte   0x10# DW_AT_bit_offset
-   .byte   0x4 # DW_AT_data_member_location
+   .byte   0x18# DW_AT_bit_offset
+   .byte   0x5 # DW_AT_data_member_location
.uleb128 0x4# (DIE (0x6d) DW_TAG_member)
.ascii "d\0"# DW_AT_name
.byte   0x1 # DW_AT_decl_file (hoho.C)
.byte   0x1 # DW_AT_decl_line
.byte   0x2c# DW_AT_decl_column
.long   0x7c# DW_AT_type
.byte   0x4 # DW_AT_byte_size
.byte   0x10# DW_AT_bit_size
-   .byte   0   # DW_AT_bit_offset
-   .byte   0x4 # DW_AT_data_member_location
+   .byte   0x10# DW_AT_bit_offset
+   .byte   0x6 # DW_AT_data_member_location
.byte   0   # end of children of DIE 0x2d
and
.uleb128 0x4# (DIE (0xbe) DW_TAG_member)
.ascii "c\0"# DW_AT_name
.byte   0x1 # DW_AT_decl_file (hoho.C)
.byte   0x2 # DW_AT_decl_line
.byte   0x28# DW_AT_decl_column
.long   0xdb# DW_AT_type
.byte   0x8 # DW_AT_byte_size
.byte   0x8 # DW_AT_bit_size
-   .byte   0x30# DW_AT_bit_offset
-   .byte   0   # DW_AT_data_member_location
+   .byte   0x38# DW_AT_bit_offset
+   .byte   0x1 # DW_AT_data_member_location
.uleb128 0x4# (DIE (0xcc) DW_TAG_member)
.ascii "d\0"# DW_AT_name
.byte   0x1 # DW_AT_decl_file (hoho.C)
.byte   0x2 # DW_AT_decl_line
.byte   0x33# DW_AT_decl_column
.long   0x7c# DW_AT_type
.byte   0x4 # DW_AT_byte_size
.byte   0x10# DW_AT_bit_size
-   .byte   0   # DW_AT_bit_offset
-   .byte   0   # DW_AT_data_member_location
+   .byte   0x10# DW_AT_bit_offset
+   .byte   0x2 # DW_AT_data_member_location
.byte   0   # end of children of DIE 0x97
but GDB handles both fine.
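
To check that the old and new attribute pairs above describe the same bits, here is a minimal sketch of the arithmetic. It assumes a little-endian target with 8-bit bytes and the usual x86-64 layout of S and T; the function name and parameters are made up for illustration and are not taken from dwarf2out.c.

```cpp
#include <cassert>

// DWARF 2/3 style DW_AT_bit_offset on a little-endian target: the number of
// bits between the most significant bit of the storage unit and the most
// significant bit of the bit-field.
//   byte_size  - DW_AT_byte_size of the storage unit
//   bit_size   - DW_AT_bit_size of the field
//   member_loc - DW_AT_data_member_location (byte offset of the storage unit)
//   bit_pos    - position of the field's lowest bit from the struct start
//                (assumed layout, e.g. 40 for S::c, 48 for S::d)
int
dwarf_bit_offset_le (int byte_size, int bit_size, int member_loc, int bit_pos)
{
  // Position of the field's lowest bit inside the chosen storage unit.
  int pos_in_unit = bit_pos - member_loc * 8;
  return byte_size * 8 - bit_size - pos_in_unit;
}
```

Under those assumptions both the removed and the added lines of each DIE come out of the same formula; only the storage unit chosen for the field differs.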

> >if (PCC_BITFIELD_TYPE_MATTERS
> > +  && DECL_BIT_FIELD (decl)
> >&& TREE_CODE (DECL_FIELD_OFFSET (decl)) == INTEGER_CST)
> 
> What's more interesting is the INTEGER_CST restriction - I'm sure
> that Ada allows bitfields to follow variable position other fields.
> Even C does:
> 
> void foo (int n)
> {
>   struct S { int a[n]; int b : 5;  int c : 3; } s;
> }
> 
> runs into the code above and ends up not honoring 
> PCC_BITFIELD_TYPE_MATTERS ...

True.  And, apparently we've been mishandling this forever.
The function doesn't have, and has never had, any way to signal to the caller
that it gave up and was unable to handle the field correctly.  Before the
dw_loc_descr_ref return was added, the function just returned HOST_WIDE_INT,
and it returned 0 both in cases where the field offset was really 0 and where
it gave up (e.g. because bit_position wasn't INTEGER_CST representable
in hwi).
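
The ambiguity can be shown with a small sketch. Both functions below are hypothetical stand-ins using plain integers, not the real field_byte_offset interface; the "checked" variant is only one possible way to report failure.

```cpp
#include <cassert>

// Legacy-style interface (hypothetical): 0 means both "the offset is 0"
// and "gave up", so the caller cannot tell the two cases apart.
long
field_offset_legacy (bool representable, long offset)
{
  return representable ? offset : 0;
}

// Explicit-status interface (hypothetical): the return value says whether
// *result was filled in, so a real offset of 0 is distinguishable.
bool
field_offset_checked (bool representable, long offset, long *result)
{
  if (!representable)
    return false;
  *result = offset;
  return true;
}
```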

I'm afraid I forgot the DECL_FIELD_OFFSET vs. DECL_FIELD_BIT_OFFSET stuff
enough that I'm not sure what is the right fix for that case, maybe it would
work if we dropped the && TREE_CODE (DECL_FIELD_OFFSET (decl)) == INTEGER_CST
check and instea

Re: [wwwdocs, patch] gcc-12/changes.html: Update OpenMP status

2021-11-10 Thread Jakub Jelinek via Gcc-patches

On Wed, Nov 10, 2021 at 11:33:09AM +0100, Tobias Burnus wrote:
> Cumulative update of the OpenMP 5.x changes in GCC 12.
> 
> I hope it covers all essential changes. Of course,
> some other could be added like 'omp target in_reduction',
> which was missing before (oversight) and possibly other
> things, which I have missed.
> 
> For the last bullet: the implementation-status documentation
> is new – but even if it were not, linking to it makes sense.
> I kept adding and removing a 'full' after the "The",
> which sounds better but while the list in libgomp.texi is
> extensive, it does not have "full" coverage of all changes.
> Additionally, as it is new, a "now" could be added after
> "can".
> 
> Suggestions, additions, wording changes?
> 
> Tobias
> -
> Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 
> München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas 
> Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht 
> München, HRB 106955

> gcc-12/changes.html: Update OpenMP status
> 
> diff --git a/htdocs/gcc-12/changes.html b/htdocs/gcc-12/changes.html
> index 81f62fe3..bbb8f2ac 100644
> --- a/htdocs/gcc-12/changes.html
> +++ b/htdocs/gcc-12/changes.html
> @@ -70,20 +70,36 @@ a work-in-progress.
>  New Languages and Language specific improvements
>  
>  
> -  OpenMP 5.0 support has been extended: The close map 
> modifier
> +  OpenMP
> +  
> +OpenMP 5.0 support has been extended: The close map 
> modifier
>and the affinity clause are now supported and for Fortran
>additionally the following features which were available in C and C++
> -  before:  depobj, mutexinoutset and
> -   iterator can now also be used with the 
> depend
> -  clause, defaultmap has been updated for OpenMP 5.0, and 
> the
> -  loop directive and combined directives
> -  involving master directive have been added. Additionally,
> -  the following OpenMP 5.1 feature have been added: support for 
> expressing
> +  before: declare variant is now available,
> +  depobj, mutexinoutset and 
> iterator
> +  can now also be used with the depend clause,
> +  defaultmap has been updated for OpenMP 5.0, and the
> +  loop directive and combined directives involving
> +  master directive have been added.
> +The following OpenMP 5.1 feature have been added: support for 
> expressing
>OpenMP directives as C++ 11 attributes, the masked and
>scope construct, the nothing and
>error directives, and using primary with the
>proc_bind clause and OMP_PROC_BIND 
> environment
> -  variable.
> +  variable, the reproducible and unconstrained
> +  modifiers to the order clause, and, for C/C++ only, the
> +  align- and allocate-modifiers to the allocate clause and
> +  the atomic extensions are now available. The
> +  OMP_PLACE environment variable supports the OpenMP 5.1
> +  features and the OMP_NUM_TEAMS and
> +  OMP_TEAMS_THREAD_LIMIT environement variables and their

environment

> +  associated API routines are now supported as well as the 
> memory-allocation
> +  routines added for Fortran and extended for C/C++ in OpenMP 5.1. In
> +  Fortran code, strictly-structured blocks can be used.
> +The  +  
> href="https://gcc.gnu.org/onlinedocs/libgomp/OpenMP-Implementation-Status.html";
> +  >OpenMP Implementation Status can be found in the libgomp 
> manual.
> +  
>
>The new warning flag -Wopenacc-parallelism was added for
>OpenACC. It warns about potentially suboptimal choices related to

Otherwise LGTM.

Jakub

[PATCH][GCC] aarch64: Add new vector mode V8DI

2021-11-10 Thread Przemyslaw Wirkus via Gcc-patches

Hi,
This patch adds a new V8DI mode, which will be used with the new Armv8.7-A
LS64 extension intrinsics.

Regtested on aarch64-elf with no issues.

OK for master?

gcc/ChangeLog:

2021-11-10  Przemyslaw Wirkus  

* config/aarch64/aarch64-modes.def (VECTOR_MODE): New V8DI mode.
* config/aarch64/aarch64.c (aarch64_hard_regno_mode_ok): Handle
V8DImode.
* config/aarch64/iterators.md (define_mode_attr nunits): Add entry
for V8DI.

Kind regards,
Przemyslaw Wirkus

--- 

diff --git a/gcc/config/aarch64/aarch64-modes.def b/gcc/config/aarch64/aarch64-modes.def
index ac97d222789c6701d858c014736f8c211512a4d9..62595b8af6e1eea8fc769885bba9fe54f0a9ec05 100644
--- a/gcc/config/aarch64/aarch64-modes.def
+++ b/gcc/config/aarch64/aarch64-modes.def
@@ -81,6 +81,11 @@ INT_MODE (OI, 32);
 INT_MODE (CI, 48);
 INT_MODE (XI, 64);
 
+/* V8DI mode.  */
+VECTOR_MODE_WITH_PREFIX (V, INT, DI, 8, 5); \
+  \
+  ADJUST_ALIGNMENT (V8DI, 8);
+
 /* Define Advanced SIMD modes for structures of 2, 3 and 4 d-registers.  */
 #define ADV_SIMD_D_REG_STRUCT_MODES(NVECS, VB, VH, VS, VD) \
   VECTOR_MODES_WITH_PREFIX (V##NVECS##x, INT, 8, 3); \
diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 69f08052ce808c140ed2933ab6b2e2617ca6f669..0e102a83a8dc34e715fafb58169897b12c9b3a20 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -3376,6 +3376,9 @@ aarch64_hard_regno_nregs (unsigned regno, machine_mode mode)
 static bool
 aarch64_hard_regno_mode_ok (unsigned regno, machine_mode mode)
 {
+  if (mode == V8DImode)
+return IN_RANGE (regno, R0_REGNUM, R23_REGNUM);
+
   if (GET_MODE_CLASS (mode) == MODE_CC)
 return regno == CC_REGNUM;
 
diff --git a/gcc/config/aarch64/iterators.md b/gcc/config/aarch64/iterators.md
index bdc8ba3576cf2c9b4ae96b45a382234e4e25b13f..cea277f3a03cfd20178e51e6abd7e256e206299f 100644
--- a/gcc/config/aarch64/iterators.md
+++ b/gcc/config/aarch64/iterators.md
@@ -1053,7 +1053,7 @@ (define_mode_attr vas [(DI "") (SI ".2s")])
 (define_mode_attr nunits [(V8QI "8") (V16QI "16")
  (V4HI "4") (V8HI "8")
  (V2SI "2") (V4SI "4")
-(V2DI "2")
+ (V2DI "2") (V8DI "8")
  (V4HF "4") (V8HF "8")
  (V4BF "4") (V8BF "8")
  (V2SF "2") (V4SF "4")

Re: [PATCH] lto-wrapper: fix memory corruption.

2021-11-10 Thread Richard Biener via Gcc-patches

On Wed, Nov 10, 2021 at 11:47 AM Martin Liška  wrote:
>
> Patch can bootstrap on x86_64-linux-gnu and survives regression tests.
>
> Ready to be installed?

OK.

Is this also latent on branches?


> The first argument of merge_and_complain is actually vector where
> we merge options and it should be propagated to caller properly.
>
> Fixes:
>
> ==6656== Invalid read of size 8
> ==6656==at 0x408056: merge_and_complain (lto-wrapper.c:335)
> ==6656==by 0x408056: find_and_merge_options(int, long, char const*, 
> vec, vec vl_ptr>*, char const*) (lto-wrapper.c:1139)
> ==6656==by 0x408AFC: run_gcc(unsigned int, char**) (lto-wrapper.c:1505)
> ==6656==by 0x4061A2: main (lto-wrapper.c:2138)
> ==6656==  Address 0x4e69b18 is 344 bytes inside a block of size 1,768 free'd
> ==6656==at 0x484339F: realloc (vg_replace_malloc.c:1192)
> ==6656==by 0x4993C0: xrealloc (xmalloc.c:181)
> ==6656==by 0x406A82: reserve (vec.h:290)
> ==6656==by 0x406A82: reserve (vec.h:1858)
> ==6656==by 0x406A82: vec vl_ptr>::safe_push(cl_decoded_option const&) [clone .isra.0] (vec.h:1967)
> ==6656==by 0x4077E0: merge_and_complain (lto-wrapper.c:457)
> ==6656==by 0x4077E0: find_and_merge_options(int, long, char const*, 
> vec, vec vl_ptr>*, char const*) (lto-wrapper.c:1139)
> ==6656==by 0x408AFC: run_gcc(unsigned int, char**) (lto-wrapper.c:1505)
> ==6656==by 0x4061A2: main (lto-wrapper.c:2138)
> ==6656==  Block was alloc'd at
> ==6656==at 0x483E70F: malloc (vg_replace_malloc.c:380)
> ==6656==by 0x4993D7: xrealloc (xmalloc.c:179)
> ==6656==by 0x407476: reserve (vec.h:290)
> ==6656==by 0x407476: reserve (vec.h:1858)
> ==6656==by 0x407476: reserve_exact (vec.h:1878)
> ==6656==by 0x407476: create (vec.h:1893)
> ==6656==by 0x407476: get_options_from_collect_gcc_options(char const*, 
> char const*) (lto-wrapper.c:163)
> ==6656==by 0x407674: find_and_merge_options(int, long, char const*, 
> vec, vec vl_ptr>*, char const*) (lto-wrapper.c:1132)
> ==6656==by 0x408AFC: run_gcc(unsigned int, char**) (lto-wrapper.c:1505)
> ==6656==by 0x4061A2: main (lto-wrapper.c:2138)
>
> gcc/ChangeLog:
>
> * lto-wrapper.c (merge_and_complain): Make the first argument
> a reference type.
> ---
>   gcc/lto-wrapper.c | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/gcc/lto-wrapper.c b/gcc/lto-wrapper.c
> index 7b9e4883f38..54f642d7692 100644
> --- a/gcc/lto-wrapper.c
> +++ b/gcc/lto-wrapper.c
> @@ -224,7 +224,7 @@ merge_flto_options (vec 
> &decoded_options,
>  ontop of DECODED_OPTIONS.  */
>
>   static void
> -merge_and_complain (vec decoded_options,
> +merge_and_complain (vec &decoded_options,
> vec fdecoded_options,
> vec decoded_cl_options)
>   {
> --
> 2.33.1
>
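
The hazard behind the valgrind report can be sketched with a simplified handle type. This is not GCC's vec: int_vec and the two merge functions are hypothetical, and the caller's handle starts empty so that the by-value case stays well-defined here instead of reading freed memory.

```cpp
#include <cassert>
#include <cstdlib>

// Simplified stand-in for a pointer-wrapper vector: copying the handle
// copies the pointer and the bookkeeping, not the heap block.
struct int_vec
{
  int *data = nullptr;
  unsigned len = 0, cap = 0;

  void safe_push (int v)
  {
    if (len == cap)
      {
        cap = cap ? cap * 2 : 4;
        // realloc may move the block and free the old one.
        data = static_cast<int *> (std::realloc (data, cap * sizeof (int)));
        if (!data)
          std::abort ();
      }
    data[len++] = v;
  }
};

// By value: the push lands in a private copy of the handle.  If the caller's
// handle already owned a block, a reallocation here could free it and leave
// the caller pointing at freed memory.
unsigned
merge_by_value (int_vec opts, int v)
{
  opts.safe_push (v);
  unsigned n = opts.len;
  std::free (opts.data);   // the copy's block is invisible to the caller
  return n;
}

// By reference: growth and the new length are visible to the caller.
void
merge_by_reference (int_vec &opts, int v)
{
  opts.safe_push (v);
}
```

With an empty caller-side handle, merge_by_value leaves the caller's length at 0, while merge_by_reference updates it.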

Re: [PATCH] dwarf2out: Fix up field_byte_offset [PR101378]

2021-11-10 Thread Richard Biener via Gcc-patches

On Wed, 10 Nov 2021, Jakub Jelinek wrote:

> On Wed, Nov 10, 2021 at 10:52:42AM +0100, Richard Biener wrote:
> > > The only thing I'm unsure about is whether the test should be
> > > DECL_BIT_FIELD or DECL_BIT_FIELD_TYPE should be tested.  I thought it
> > > doesn't matter, but it seems stor-layout.c in some cases clears
> > > DECL_BIT_FIELD if their TYPE_MODE can express the type exactly, and
> > > dwarf2out.c (gen_field_die) uses
> > >   if (DECL_BIT_FIELD_TYPE (decl))
> > > to decide if DW_AT_bit_offset etc. attributes should be added.
> > > So maybe I should go with && DECL_BIT_FIELD_TYPE (decl) instead.
> > 
> > You need DECL_BIT_FIELD_TYPE if you want to know whether it is
> > a bitfield.  DECL_BIT_FIELD can only be used to test whether the
> > size is not a multiple of BITS_PER_UNIT.
> 
> Yeah, so I think DECL_BIT_FIELD_TYPE is the right test and I'll retest with
> that.
> 
> > So the question is whether the code makes a difference if
> > the bitfield is int a : 8; int b : 8; int c : 16; for example.
> > If so then DECL_BIT_FIELD_TYPE is needed, otherwise what you
> > test probably doesn't matter.
> 
> Apparently I made a mistake of trying those tests with -g -dA.
> The problem is that for bitfields we then emit DW_AT_data_bit_offset
> (which is already in DWARF4, but we chose to emit it only for DWARF5
> as many consumers didn't handle it).
> So, on
> struct S { int e; int a : 1, b : 7, c : 8, d : 16; } s;
> struct T { int a : 1, b : 7; long long c : 8; int d : 16; } t;
> 
> int
> main ()
> {
>   s.c = 0x55;
>   s.d = 0x;
>   t.c = 0x55;
>   t.d = 0x;
>   s.e++;
> }
> the difference with the patch as posted and -gdwarf-4 -dA is:
> .uleb128 0x4# (DIE (0x5f) DW_TAG_member)
> .ascii "c\0"# DW_AT_name
> .byte   0x1 # DW_AT_decl_file (hoho.C)
> .byte   0x1 # DW_AT_decl_line
> .byte   0x25# DW_AT_decl_column
> .long   0x7c# DW_AT_type
> .byte   0x4 # DW_AT_byte_size
> .byte   0x8 # DW_AT_bit_size
> -   .byte   0x10# DW_AT_bit_offset
> -   .byte   0x4 # DW_AT_data_member_location
> +   .byte   0x18# DW_AT_bit_offset
> +   .byte   0x5 # DW_AT_data_member_location
> .uleb128 0x4# (DIE (0x6d) DW_TAG_member)
> .ascii "d\0"# DW_AT_name
> .byte   0x1 # DW_AT_decl_file (hoho.C)
> .byte   0x1 # DW_AT_decl_line
> .byte   0x2c# DW_AT_decl_column
> .long   0x7c# DW_AT_type
> .byte   0x4 # DW_AT_byte_size
> .byte   0x10# DW_AT_bit_size
> -   .byte   0   # DW_AT_bit_offset
> -   .byte   0x4 # DW_AT_data_member_location
> +   .byte   0x10# DW_AT_bit_offset
> +   .byte   0x6 # DW_AT_data_member_location
> .byte   0   # end of children of DIE 0x2d
> and
> .uleb128 0x4# (DIE (0xbe) DW_TAG_member)
> .ascii "c\0"# DW_AT_name
> .byte   0x1 # DW_AT_decl_file (hoho.C)
> .byte   0x2 # DW_AT_decl_line
> .byte   0x28# DW_AT_decl_column
> .long   0xdb# DW_AT_type
> .byte   0x8 # DW_AT_byte_size
> .byte   0x8 # DW_AT_bit_size
> -   .byte   0x30# DW_AT_bit_offset
> -   .byte   0   # DW_AT_data_member_location
> +   .byte   0x38# DW_AT_bit_offset
> +   .byte   0x1 # DW_AT_data_member_location
> .uleb128 0x4# (DIE (0xcc) DW_TAG_member)
> .ascii "d\0"# DW_AT_name
> .byte   0x1 # DW_AT_decl_file (hoho.C)
> .byte   0x2 # DW_AT_decl_line
> .byte   0x33# DW_AT_decl_column
> .long   0x7c# DW_AT_type
> .byte   0x4 # DW_AT_byte_size
> .byte   0x10# DW_AT_bit_size
> -   .byte   0   # DW_AT_bit_offset
> -   .byte   0   # DW_AT_data_member_location
> +   .byte   0x10# DW_AT_bit_offset
> +   .byte   0x2 # DW_AT_data_member_location
> .byte   0   # end of children of DIE 0x97
> but GDB handles both fine.
> 
> > >if (PCC_BITFIELD_TYPE_MATTERS
> > > +  && DECL_BIT_FIELD (decl)
> > >&& TREE_CODE (DECL_FIELD_OFFSET (decl)) == INTEGER_CST)
> > 
> > What's more interesting is the INTEGER_CST restriction - I'm sure
> > that Ada allows bitfields to follow variable position other fields.
> > Even C does:
> > 
> > void foo (int n)
> > {
> >   struct S { int a[n]; int b : 5;  int c : 3; } s;
> > }
> > 
> > runs into the code above and ends up not honoring 
> > PCC_BITFIELD_TYPE_MATTERS ...
> 
> True.  And, apparently we've been mishandling this forever.
> The function actually doesn't and has never had any way to signal to caller
> that it gave up and was unable to handle it correctly, before the addition
> of dw_loc_descr_ref return the function was just returning HOST_WIDE_INT
> and returned 0 both in cases where the field offset was really 0 and where
> it gave up (e.g. because bit_position wasn't INTEGER_

Re: [PATCH] dwarf2out: Fix up field_byte_offset [PR101378]

2021-11-10 Thread Jakub Jelinek via Gcc-patches

On Wed, Nov 10, 2021 at 12:36:04PM +0100, Richard Biener wrote:
> > I'm afraid I forgot the DECL_FIELD_OFFSET vs. DECL_FIELD_BIT_OFFSET stuff
> > enough that I'm not sure what is the right fix for that case, maybe it would
> > work if we dropped the && TREE_CODE (DECL_FIELD_OFFSET (decl)) == 
> > INTEGER_CST
> > check and instead used:
> > -  bitpos_int = wi::to_offset (DECL_FIELD_BIT_OFFSET (decl));
> > +  if (TREE_CODE (DECL_FIELD_OFFSET (decl)) == INTEGER_CST)
> > +   bitpos_int = wi::to_offset (bit_position (decl));
> > +  else
> > +   bitpos_int = wi::to_offset (DECL_FIELD_BIT_OFFSET (decl));
> > at the start and
> > -  if (ctx->variant_part_offset == NULL_TREE)
> > +  if (ctx->variant_part_offset == NULL_TREE
> > + && TREE_CODE (DECL_FIELD_OFFSET (decl)) == INTEGER_CST)
> >  {
> >*cst_offset = object_offset_in_bytes.to_shwi ();
> >return NULL;
> >  }
> >tree_result = wide_int_to_tree (sizetype, object_offset_in_bytes);
> > +  if (TREE_CODE (DECL_FIELD_OFFSET (decl)) == INTEGER_CST)
> > +   tree_result = size_binop (PLUS_EXPR, DECL_FIELD_OFFSET (decl),
> > + tree_result);
> 
> that needs multiplication by BITS_PER_UNIT.

I don't think so.
From bit_from_pos and byte_from_pos functions it is clear that
DECL_FIELD_OFFSET is in bytes and DECL_FIELD_BIT_OFFSET in bits.
And object_offset_in_bytes is in bytes too.
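
As a rough illustration of the units involved, the following sketch models the two helpers with plain integers instead of trees. The names carry a _sketch suffix because they are not the real stor-layout functions, and 8-bit units are assumed.

```cpp
#include <cassert>

// Assumption for this sketch: 8 bits per addressable unit.
constexpr long BITS_PER_UNIT_SKETCH = 8;

// Combined bit position from a byte offset (DECL_FIELD_OFFSET-like) and a
// bit offset (DECL_FIELD_BIT_OFFSET-like).
long
bit_from_pos_sketch (long byte_offset, long bit_offset)
{
  return byte_offset * BITS_PER_UNIT_SKETCH + bit_offset;
}

// Byte position: the bit offset is divided down to bytes before it is added
// to the byte offset, so no multiplication of the byte offset is needed.
long
byte_from_pos_sketch (long byte_offset, long bit_offset)
{
  return byte_offset + bit_offset / BITS_PER_UNIT_SKETCH;
}
```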

> 
> > or so.
> 
> Not sure if it's worth special-casing constant DECL_FIELD_OFFSET
> though, or whether the code can just work with DECL_FIELD_BIT_OFFSET
> and DECL_FIELD_OFFSET be always added.

It has certainly the advantage that it doesn't change anything for the
most common case (where DECL_FIELD_OFFSET is INTEGER_CST).

Jakub

[PATCH][committed]middle-end vect: remove unused variable in complex numbers detection code.

2021-11-10 Thread Tamar Christina via Gcc-patches

Hi All,

This removes an unused variable that Clang catches when
compiling GCC.

Bootstrapped and regtested on aarch64-none-linux-gnu with no issues.

Committed under the obvious rule.

Thanks,
Tamar

gcc/ChangeLog:

* tree-vect-slp-patterns.c (complex_mul_pattern::matches): Remove 
l1node.

--- inline copy of patch -- 
diff --git a/gcc/tree-vect-slp-patterns.c b/gcc/tree-vect-slp-patterns.c
index e08a15ebd92638ca32171b361412ec33f0116367..53fbe5185f543298b27d084a93c30a1c161d3d32 100644
--- a/gcc/tree-vect-slp-patterns.c
+++ b/gcc/tree-vect-slp-patterns.c
@@ -946,7 +946,6 @@ complex_mul_pattern::matches (complex_operation_t op,
 
   auto childs = *ops;
   auto l0node = SLP_TREE_CHILDREN (childs[0]);
-  auto l1node = SLP_TREE_CHILDREN (childs[1]);
 
   bool mul0 = vect_match_expression_p (l0node[0], MULT_EXPR);
   bool mul1 = vect_match_expression_p (l0node[1], MULT_EXPR);

[PATCH][committed]middle-end: Fix signbit tests when ran on ISA with support for masks.

2021-11-10 Thread Tamar Christina via Gcc-patches

Hi All,

These tests don't work on vector ISAs where the truth
type doesn't match the vector mode of the operation.

However, I still want the tests to run on these
architectures, just with the ISA modes that enable
masks turned off.

This patch therefore turns off SVE if it's on and turns off
AVX512 if it's on.

Regtested on aarch64-none-linux-gnu with SVE on,
and x86_64-pc-linux-gnu with AVX512 on, with no
issues.

Committed under the obvious rule.

Thanks,
Tamar

gcc/testsuite/ChangeLog:

* gcc.dg/signbit-2.c: Turn off masks.
* gcc.dg/signbit-5.c: Likewise.

--- inline copy of patch -- 
diff --git a/gcc/testsuite/gcc.dg/signbit-2.c b/gcc/testsuite/gcc.dg/signbit-2.c
index fc0157cbc5c7996b481f2998bc30176c96a669bb..d8501e9b7a2d82b511ad0b3a44c0121d635972c0 100644
--- a/gcc/testsuite/gcc.dg/signbit-2.c
+++ b/gcc/testsuite/gcc.dg/signbit-2.c
@@ -1,6 +1,10 @@
 /* { dg-do assemble } */
 /* { dg-options "-O3 --save-temps -fdump-tree-optimized" } */
 
+/* This test does not work when the truth type does not match vector type.  */
+/* { dg-additional-options "-mno-avx512f" { target { i?86-*-* x86_64-*-* } } } */
+/* { dg-additional-options "-march=armv8-a" { target aarch64_sve } } */
+
 #include 
 
 void fun1(int32_t *x, int n)
@@ -15,5 +19,5 @@ void fun2(int32_t *x, int n)
   x[i] = (-x[i]) >> 30;
 }
 
-/* { dg-final { scan-tree-dump-times {\s+>\s+\{ 0, 0, 0, 0 \}} 1 optimized } } */
+/* { dg-final { scan-tree-dump {\s+>\s+\{ 0, 0, 0(, 0)+ \}} optimized } } */
 /* { dg-final { scan-tree-dump-not {\s+>>\s+31} optimized } } */
diff --git a/gcc/testsuite/gcc.dg/signbit-5.c b/gcc/testsuite/gcc.dg/signbit-5.c
index 22a92704773e3282759524b74d35196a477d43dd..2b119cdfda7d2888f914633c809b0aa7da5244b7 100644
--- a/gcc/testsuite/gcc.dg/signbit-5.c
+++ b/gcc/testsuite/gcc.dg/signbit-5.c
@@ -1,6 +1,11 @@
 /* { dg-do run } */
 /* { dg-options "-O3" } */
 
+/* This test does not work when the truth type does not match vector type.  */
+/* { dg-additional-options "-mno-avx512f" { target { i?86-*-* x86_64-*-* } } } */
+/* { dg-additional-options "-march=armv8-a" { target aarch64_sve } } */
+
+
 #include 
 #include 
 #include
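
The relaxed scan pattern in signbit-2.c accepts a zero vector with four or more lanes, presumably so that wider vector modes still match (that motivation is an assumption, not stated in the patch). A small sketch of its behaviour, using std::regex rather than the Tcl regex engine the testsuite uses; the helper name and the sample dump lines in the test are made up:

```cpp
#include <cassert>
#include <regex>
#include <string>

// Returns true if the line contains "> { 0, 0, 0, 0 ... }" with at least
// four zero lanes, mirroring the pattern from the updated dg-final line.
bool
matches_zero_vector (const std::string &dump_line)
{
  static const std::regex pattern (R"(\s+>\s+\{ 0, 0, 0(, 0)+ \})");
  return std::regex_search (dump_line, pattern);
}
```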

Extend modref by side-effect analysis

2021-11-10 Thread Jan Hubicka via Gcc-patches

Hi,
this patch makes modref also collect info on whether a function has side
effects.  This allows pure/const function detection and also handling
functions which do store some memory in a similar way to how we handle
pure/const functions now.

The code is symmetric to what ipa-pure-const does.  Modref is actually more
capable of proving that a given function is pure/const (since it understands
that a non-pure function can be called when it only modifies data on the stack),
so we could retire ipa-pure-const's pure-const discovery at some point.

However, this patch only does the analysis; the consumers of this flag
will come next.

Bootstrapped/regtested x86_64-linux. I plan to commit it later today
if there are no complaints.
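
The stack-only case mentioned above can be pictured with a small example. The function names are invented for illustration, and whether a given GCC release actually classifies sum_seq this way is not claimed here.

```cpp
#include <cassert>

// fill is not pure: it stores through its pointer argument.
void
fill (int *buf, int n, int seed)
{
  for (int i = 0; i < n; i++)
    buf[i] = seed + i;
}

// sum_seq calls the non-pure fill, but the only memory written is the
// local array tmp, which does not outlive the call.  Seen from outside,
// the result depends only on the argument.
int
sum_seq (int seed)
{
  int tmp[4];
  fill (tmp, 4, seed);
  return tmp[0] + tmp[1] + tmp[2] + tmp[3];
}
```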

gcc/ChangeLog:

* ipa-modref.c: Include tree-eh.h.
(modref_summary::modref_summary): Initialize side_effects.
(struct modref_summary_lto): New bool field side_effects.
(modref_summary_lto::modref_summary_lto): Initialize side_effects.
(modref_summary::dump): Dump side_effects.
(modref_summary_lto::dump): Dump side_effects.
(merge_call_side_effects): Merge side effects.
(process_fnspec): Calls to non-const/pure or looping
function is a side effect.
(analyze_call): Self-recursion is a side-effect; handle
special builtins.
(analyze_load): Watch for volatile and throwing memory.
(analyze_store): Likewise.
(analyze_stmt): Watch for volatile asm.
(analyze_function): Handle side_effects.
(modref_summaries::duplicate): Duplicate side_effects.
(modref_summaries_lto::duplicate): Likewise.
(modref_write): Stream side_effects.
(read_section): Likewise.
(update_signature): Update.
(propagate_unknown_call): Handle side_effects.
(modref_propagate_in_scc): Likewise.
* ipa-modref.h (struct modref_summary): Add side_effects.
* ipa-pure-const.c (special_builtin_state): Rename to ...
(builtin_safe_for_const_function_p): ... this one.
(check_call): Update.
(finite_function_p): Break out from ...
(propagate_pure_const): ... here.
* ipa-utils.h (finite_function): Declare.

diff --git a/gcc/ipa-modref.c b/gcc/ipa-modref.c
index 22efc06c583..d14f9e52f62 100644
--- a/gcc/ipa-modref.c
+++ b/gcc/ipa-modref.c
@@ -87,6 +87,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "tree-ssanames.h"
 #include "attribs.h"
 #include "tree-cfg.h"
+#include "tree-eh.h"
 
 
 namespace {
@@ -273,7 +274,7 @@ static GTY(()) fast_function_summary 
 
 modref_summary::modref_summary ()
   : loads (NULL), stores (NULL), retslot_flags (0), static_chain_flags (0),
-writes_errno (false)
+writes_errno (false), side_effects (false)
 {
 }
 
@@ -371,6 +372,7 @@ struct GTY(()) modref_summary_lto
   eaf_flags_t retslot_flags;
   eaf_flags_t static_chain_flags;
   bool writes_errno;
+  bool side_effects;
 
   modref_summary_lto ();
   ~modref_summary_lto ();
@@ -382,7 +384,7 @@ struct GTY(()) modref_summary_lto
 
 modref_summary_lto::modref_summary_lto ()
   : loads (NULL), stores (NULL), retslot_flags (0), static_chain_flags (0),
-writes_errno (false)
+writes_errno (false), side_effects (false)
 {
 }
 
@@ -615,6 +617,8 @@ modref_summary::dump (FILE *out)
 }
   if (writes_errno)
 fprintf (out, "  Writes errno\n");
+  if (side_effects)
+fprintf (out, "  Side effects\n");
   if (arg_flags.length ())
 {
   for (unsigned int i = 0; i < arg_flags.length (); i++)
@@ -647,6 +651,8 @@ modref_summary_lto::dump (FILE *out)
   dump_lto_records (stores, out);
   if (writes_errno)
 fprintf (out, "  Writes errno\n");
+  if (side_effects)
+fprintf (out, "  Side effects\n");
   if (arg_flags.length ())
 {
   for (unsigned int i = 0; i < arg_flags.length (); i++)
@@ -980,6 +986,12 @@ merge_call_side_effects (modref_summary *cur_summary,
  changed = true;
}
 }
+  if (!cur_summary->side_effects
+  && callee_summary->side_effects)
+{
+  cur_summary->side_effects = true;
+  changed = true;
+}
   return changed;
 }
 
@@ -1075,6 +1087,18 @@ process_fnspec (modref_summary *cur_summary,
gcall *call, bool ignore_stores)
 {
   attr_fnspec fnspec = gimple_call_fnspec (call);
+  int flags = gimple_call_flags (call);
+
+  if (!(flags & (ECF_CONST | ECF_NOVOPS))
+  || (flags & ECF_LOOPING_CONST_OR_PURE)
+  || (cfun->can_throw_non_call_exceptions
+ && stmt_could_throw_p (cfun, call)))
+{
+  if (cur_summary)
+   cur_summary->side_effects = true;
+  if (cur_summary_lto)
+   cur_summary_lto->side_effects = true;
+}
   if (!fnspec.known_p ())
 {
   if (dump_file && gimple_call_builtin_p (call, BUILT_IN_NORMAL))
@@ -1212,6 +1236,10 @@ analyze_call (modref_summary *cur_summary, 
modref_summary_lto *cur_summary_lto,
   if (recursive_call_p (current_function_decl, callee))
 {
   recursive_calls->safe_pus

Re: [PATH][_GLIBCXX_DEBUG] Fix unordered container merge

2021-11-10 Thread Jonathan Wakely via Gcc-patches

On Tue, 9 Nov 2021 at 16:25, Jonathan Wakely  wrote:

>
>
> On Mon, 8 Nov 2021 at 21:36, François Dumont  wrote:
>
>> Yet another version this time with only 1 guard implementation. The
>> predicate to invalidate the safe iterators has been externalized.
>>
>> Ok to commit ?
>>
>
> I like this version a lot - thanks for persisting with it.
>
>

I'm seeing new failures with this:

make check RUNTESTFLAGS="conformance.exp=23_containers/*/invalidation/*
--target_board=unix/-D_GLIBCXX_DEBUG/-std=gnu++98"

FAIL: 23_containers/deque/debug/invalidation/1.cc (test for excess errors)
FAIL: 23_containers/list/debug/invalidation/1.cc (test for excess errors)
FAIL: 23_containers/map/debug/invalidation/1.cc (test for excess errors)
FAIL: 23_containers/multimap/debug/invalidation/1.cc (test for excess
errors)
FAIL: 23_containers/multiset/debug/invalidation/1.cc (test for excess
errors)
FAIL: 23_containers/set/debug/invalidation/1.cc (test for excess errors)
FAIL: 23_containers/vector/debug/invalidation/1.cc (test for excess errors)

[PATCH][committed][testsuite]: change vect_long to vect_long_long in complex tests.

2021-11-10 Thread Tamar Christina via Gcc-patches

Hi All,

These tests are still failing on SPARC and it looks like this is because I need
to use vect_long_long instead of vect_long.

Committed under the obvious rule.

Thanks,
Tamar

gcc/testsuite/ChangeLog:

PR testsuite/103042
* gcc.dg/vect/complex/bb-slp-complex-add-pattern-long.c: Use
vect_long_long instead of vect_long.
* gcc.dg/vect/complex/bb-slp-complex-add-pattern-unsigned-long.c:
Likewise.
* gcc.dg/vect/complex/vect-complex-add-pattern-long.c: Likewise.
* gcc.dg/vect/complex/vect-complex-add-pattern-unsigned-long.c:
Likewise.

--- inline copy of patch -- 
diff --git a/gcc/testsuite/gcc.dg/vect/complex/bb-slp-complex-add-pattern-long.c b/gcc/testsuite/gcc.dg/vect/complex/bb-slp-complex-add-pattern-long.c
index 6ed561a00416b5aef2e514820ab94489df7aa86b..0d21f57666ed9ff918dad343cbe53fbbdd271630 100644
--- a/gcc/testsuite/gcc.dg/vect/complex/bb-slp-complex-add-pattern-long.c
+++ b/gcc/testsuite/gcc.dg/vect/complex/bb-slp-complex-add-pattern-long.c
@@ -1,6 +1,5 @@
 /* { dg-do compile } */
 /* { dg-require-effective-target vect_complex_add_long } */
-/* { dg-require-effective-target vect_long } */
 /* { dg-require-effective-target stdint_types } */
 /* { dg-add-options arm_v8_3a_complex_neon } */
 
@@ -13,6 +12,6 @@
 
 /* { dg-final { scan-tree-dump-times "stmt.*COMPLEX_ADD_ROT90" 1 "slp1" { target { vect_complex_add_long } } } } */
 /* { dg-final { scan-tree-dump-times "stmt.*COMPLEX_ADD_ROT270" 1 "slp1" { target { vect_complex_add_long } && ! target { aarch64_sve2 } } } } */
-/* { dg-final { scan-tree-dump "Found COMPLEX_ADD_ROT270" "slp1" } } */
-/* { dg-final { scan-tree-dump "Found COMPLEX_ADD_ROT90" "slp1" } } */
-/* { dg-final { scan-tree-dump "Found COMPLEX_ADD_ROT90" "slp1" } } */
+/* { dg-final { scan-tree-dump "Found COMPLEX_ADD_ROT270" "slp1" { target { vect_long_long } } } } */
+/* { dg-final { scan-tree-dump "Found COMPLEX_ADD_ROT90" "slp1" { target { vect_long_long } } } } */
+/* { dg-final { scan-tree-dump "Found COMPLEX_ADD_ROT90" "slp1" { target { vect_long_long } } } } */
diff --git a/gcc/testsuite/gcc.dg/vect/complex/bb-slp-complex-add-pattern-unsigned-long.c b/gcc/testsuite/gcc.dg/vect/complex/bb-slp-complex-add-pattern-unsigned-long.c
index 4976bf5e50d493f0ec733d783dd4b4b7902d05ed..38aa9c0b9d51d38e5c28c1e81ef082e0c568c8f8 100644
--- a/gcc/testsuite/gcc.dg/vect/complex/bb-slp-complex-add-pattern-unsigned-long.c
+++ b/gcc/testsuite/gcc.dg/vect/complex/bb-slp-complex-add-pattern-unsigned-long.c
@@ -1,6 +1,5 @@
 /* { dg-do compile } */
 /* { dg-require-effective-target stdint_types } */
-/* { dg-require-effective-target vect_long } */
 /* { dg-additional-options "-fno-tree-loop-vectorize" } */
 /* { dg-add-options arm_v8_3a_complex_neon } */
 
@@ -13,5 +12,5 @@
 
 /* { dg-final { scan-tree-dump-times "stmt.*COMPLEX_ADD_ROT90" 1 "slp1" { 
target { vect_complex_add_long } } } } */
 /* { dg-final { scan-tree-dump-times "stmt.*COMPLEX_ADD_ROT270" 1 "slp1" { 
target { vect_complex_add_long } && ! target { aarch64_sve2 } } } } */
-/* { dg-final { scan-tree-dump "Found COMPLEX_ADD_ROT270" "slp1" } } */
-/* { dg-final { scan-tree-dump "Found COMPLEX_ADD_ROT90" "slp1" } } */
+/* { dg-final { scan-tree-dump "Found COMPLEX_ADD_ROT270" "slp1" { target { 
vect_long_long } } } } */
+/* { dg-final { scan-tree-dump "Found COMPLEX_ADD_ROT90" "slp1" { target { 
vect_long_long } } } } */
diff --git a/gcc/testsuite/gcc.dg/vect/complex/vect-complex-add-pattern-long.c 
b/gcc/testsuite/gcc.dg/vect/complex/vect-complex-add-pattern-long.c
index 
11a6f53ccebe685058fd486be06b9f3574531f86..70977155256bf6e414f2fa5a604b0e301496a2e6
 100644
--- a/gcc/testsuite/gcc.dg/vect/complex/vect-complex-add-pattern-long.c
+++ b/gcc/testsuite/gcc.dg/vect/complex/vect-complex-add-pattern-long.c
@@ -1,6 +1,5 @@
 /* { dg-do compile } */
 /* { dg-require-effective-target stdint_types } */
-/* { dg-require-effective-target vect_long } */
 /* { dg-add-options arm_v8_3a_complex_neon } */
 
 #define UNROLL
@@ -12,5 +11,5 @@
 
 /* { dg-final { scan-tree-dump-times "stmt.*COMPLEX_ADD_ROT90" 1 "vect" { 
target { vect_complex_add_long } } } } */
 /* { dg-final { scan-tree-dump-times "stmt.*COMPLEX_ADD_ROT270" 1 "vect" { 
target { vect_complex_add_long } } } } */
-/* { dg-final { scan-tree-dump "Found COMPLEX_ADD_ROT270" "vect" } } */
-/* { dg-final { scan-tree-dump "Found COMPLEX_ADD_ROT90" "vect" } } */
+/* { dg-final { scan-tree-dump "Found COMPLEX_ADD_ROT270" "vect" { target { 
vect_long_long } } } } */
+/* { dg-final { scan-tree-dump "Found COMPLEX_ADD_ROT90" "vect" { target { 
vect_long_long } } } } */
diff --git 
a/gcc/testsuite/gcc.dg/vect/complex/vect-complex-add-pattern-unsigned-long.c 
b/gcc/testsuite/gcc.dg/vect/complex/vect-complex-add-pattern-unsigned-long.c
index 
796dabdb9ba9fbb2b9f1242e0f8e0efe52614f8b..7708ac495b8b8a626d2974f3e7070892adc0d8b1
 100644
--- a/gcc/testsuite/gcc.dg/vect/complex/vect-complex-add-pattern-unsigned-long.c
+++ b/gcc/test

[PATCH] Apply TLC to control dependence compute

2021-11-10 Thread Richard Biener via Gcc-patches

This makes the control dependence compute avoid a find_edge
and optimizes allocation by embedding the bitmap head into the
vector of control dependences instead of allocating all of them.
It also uses a local bitmap obstack.
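
As a rough illustration of the allocation change described above, the following sketch keeps one embedded head per basic block in a single array rather than an array of separately allocated objects. It is only a toy model: `toy_bitmap`, `toy_deps` and the `toy_deps_*` helpers are hypothetical names, and a 64-bit word stands in for GCC's real `bitmap_head`.

```c
#include <stdint.h>
#include <stdlib.h>

/* Toy stand-in for a bitmap head; NOT GCC's bitmap_head.  */
struct toy_bitmap
{
  uint64_t bits;
};

/* One embedded head per block, all living in a single allocation.  */
struct toy_deps
{
  struct toy_bitmap *map;
  size_t nblocks;
};

/* Allocate and zero all heads at once; returns nonzero on success.  */
int
toy_deps_init (struct toy_deps *d, size_t nblocks)
{
  d->map = calloc (nblocks, sizeof *d->map);
  d->nblocks = d->map ? nblocks : 0;
  return d->map != NULL;
}

/* Mark EDGE (assumed < 64 in this toy model) as controlling BLOCK.  */
void
toy_deps_set (struct toy_deps *d, size_t block, unsigned edge)
{
  d->map[block].bits |= UINT64_C (1) << edge;
}

/* Return nonzero if EDGE is recorded for BLOCK.  */
int
toy_deps_test (const struct toy_deps *d, size_t block, unsigned edge)
{
  return (d->map[block].bits >> edge) & 1;
}

/* A single free releases every head.  */
void
toy_deps_release (struct toy_deps *d)
{
  free (d->map);
  d->map = NULL;
  d->nblocks = 0;
}
```

Callers address a head as `&d->map[i]`, which mirrors the `&control_dependence_map[bb->index]` adjustments in the patch below.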

The bitmap changes make it necessary to shuffle some includes.

Bootstrapped on x86_64-unknown-linux-gnu, testing in progress.

2021-11-10  Richard Biener  

* cfganal.h (control_dependences::control_dependence_map):
Embed bitmap_head.
(control_dependences::m_bitmaps): New.
* cfganal.c (control_dependences::set_control_dependence_map_bit):
Adjust.
(control_dependences::clear_control_dependence_bitmap):
Likewise.
(control_dependences::find_control_dependence): Do not
find_edge for the abnormal edge test.
(control_dependences::control_dependences): Instead do not
add abnormal edges to the edge list.  Adjust.
(control_dependences::~control_dependences): Likewise.
(control_dependences::get_edges_dependent_on): Likewise.
* function-tests.c: Include bitmap.h.

gcc/analyzer/
* supergraph.cc: Include bitmap.h.

gcc/c/
* gimple-parser.c: Shuffle bitmap.h include.
---
 gcc/analyzer/supergraph.cc |  1 +
 gcc/c/gimple-parser.c  |  2 +-
 gcc/cfganal.c  | 32 ++--
 gcc/cfganal.h  |  3 ++-
 gcc/function-tests.c   |  1 +
 5 files changed, 23 insertions(+), 16 deletions(-)

diff --git a/gcc/analyzer/supergraph.cc b/gcc/analyzer/supergraph.cc
index 85acf44d045..be8cec32327 100644
--- a/gcc/analyzer/supergraph.cc
+++ b/gcc/analyzer/supergraph.cc
@@ -41,6 +41,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "graphviz.h"
 #include "cgraph.h"
 #include "tree-dfa.h"
+#include "bitmap.h"
 #include "cfganal.h"
 #include "function.h"
 #include "json.h"
diff --git a/gcc/c/gimple-parser.c b/gcc/c/gimple-parser.c
index f3d99355a8e..32f22dbb8a7 100644
--- a/gcc/c/gimple-parser.c
+++ b/gcc/c/gimple-parser.c
@@ -56,13 +56,13 @@ along with GCC; see the file COPYING3.  If not see
 #include "internal-fn.h"
 #include "cfg.h"
 #include "cfghooks.h"
+#include "bitmap.h"
 #include "cfganal.h"
 #include "tree-cfg.h"
 #include "gimple-iterator.h"
 #include "cfgloop.h"
 #include "tree-phinodes.h"
 #include "tree-into-ssa.h"
-#include "bitmap.h"
 
 
 /* GIMPLE parser state.  */
diff --git a/gcc/cfganal.c b/gcc/cfganal.c
index cec5abe30f9..11ab23623ae 100644
--- a/gcc/cfganal.c
+++ b/gcc/cfganal.c
@@ -362,14 +362,14 @@ control_dependences::set_control_dependence_map_bit 
(basic_block bb,
   if (bb == ENTRY_BLOCK_PTR_FOR_FN (cfun))
 return;
   gcc_assert (bb != EXIT_BLOCK_PTR_FOR_FN (cfun));
-  bitmap_set_bit (control_dependence_map[bb->index], edge_index);
+  bitmap_set_bit (&control_dependence_map[bb->index], edge_index);
 }
 
 /* Clear all control dependences for block BB.  */
 void
 control_dependences::clear_control_dependence_bitmap (basic_block bb)
 {
-  bitmap_clear (control_dependence_map[bb->index]);
+  bitmap_clear (&control_dependence_map[bb->index]);
 }
 
 /* Find the immediate postdominator PDOM of the specified basic block BLOCK.
@@ -402,13 +402,6 @@ control_dependences::find_control_dependence (int 
edge_index)
 
   gcc_assert (get_edge_src (edge_index) != EXIT_BLOCK_PTR_FOR_FN (cfun));
 
-  /* For abnormal edges, we don't make current_block control
- dependent because instructions that throw are always necessary
- anyway.  */
-  edge e = find_edge (get_edge_src (edge_index), get_edge_dest (edge_index));
-  if (e->flags & EDGE_ABNORMAL)
-return;
-
   if (get_edge_src (edge_index) == ENTRY_BLOCK_PTR_FOR_FN (cfun))
 ending_block = single_succ (ENTRY_BLOCK_PTR_FOR_FN (cfun));
   else
@@ -440,11 +433,23 @@ control_dependences::control_dependences ()
   FOR_BB_BETWEEN (bb, ENTRY_BLOCK_PTR_FOR_FN (cfun),
  EXIT_BLOCK_PTR_FOR_FN (cfun), next_bb)
 FOR_EACH_EDGE (e, ei, bb->succs)
-  m_el.quick_push (std::make_pair (e->src->index, e->dest->index));
+  {
+   /* For abnormal edges, we don't make current_block control
+  dependent because instructions that throw are always necessary
+  anyway.  */
+   if (e->flags & EDGE_ABNORMAL)
+ {
+   num_edges--;
+   continue;
+ }
+   m_el.quick_push (std::make_pair (e->src->index, e->dest->index));
+  }
 
+  bitmap_obstack_initialize (&m_bitmaps);
   control_dependence_map.create (last_basic_block_for_fn (cfun));
+  control_dependence_map.quick_grow (last_basic_block_for_fn (cfun));
   for (int i = 0; i < last_basic_block_for_fn (cfun); ++i)
-control_dependence_map.quick_push (BITMAP_ALLOC (NULL));
+bitmap_initialize (&control_dependence_map[i], &m_bitmaps);
   for (int i = 0; i < num_edges; ++i)
 find_control_dependence (i);
 
@@ -455,10 +460,9 @@ control_dependences::control_dependences ()
 
 control_dependences::~control_dependences ()
 {
-  for (unsigned i = 0

[committed] libstdc++: Disable gthreads weak symbols for glibc 2.34 [PR103133]

2021-11-10 Thread Jonathan Wakely via Gcc-patches

Tested powerpc64le-linux, glibc 2.17 and 2.34, pushed to trunk.

We also want something like this for musl-based targets. And Florian has
suggested we should also disable the weak symbols for libstdc++.a, but
those need more work.

...

Since Glibc 2.34 all pthreads symbols are defined directly in libc not
libpthread, and since Glibc 2.32 we have used __libc_single_threaded to
avoid unnecessary locking in single-threaded programs. This means there
is no reason to avoid linking to libpthread now, and so no reason to use
weak symbols defined in gthr-posix.h for all the pthread_xxx functions.
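
For readers unfamiliar with the version check, here is a minimal sketch of the same kind of guard outside libstdc++. `SKETCH_GTHREAD_USE_WEAK` and `sketch_use_weak_pthread_refs` are hypothetical names used only for illustration, and the fallback value of 1 on other C libraries is an assumption of this sketch. The nested `#if` is needed because `__GLIBC_PREREQ` is not defined at all on non-glibc systems.

```c
/* Including a libc header makes the glibc feature macros visible when
   building against glibc; on other C libraries they stay undefined.  */
#include <limits.h>

#if defined (__GLIBC__) && defined (__GLIBC_PREREQ)
# if __GLIBC_PREREQ (2, 34)
/* pthread functions live in libc itself, so weak references
   are not needed (hypothetical macro, for illustration only).  */
#  define SKETCH_GTHREAD_USE_WEAK 0
# endif
#endif

#ifndef SKETCH_GTHREAD_USE_WEAK
/* Older glibc or another libc: keep the weak references (assumed default).  */
# define SKETCH_GTHREAD_USE_WEAK 1
#endif

/* Report which mode the preprocessor selected.  */
int
sketch_use_weak_pthread_refs (void)
{
  return SKETCH_GTHREAD_USE_WEAK;
}
```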

libstdc++-v3/ChangeLog:

PR libstdc++/100748
PR libstdc++/103133
* config/os/gnu-linux/os_defines.h (_GLIBCXX_GTHREAD_USE_WEAK):
Define for glibc 2.34 and later.
---
 libstdc++-v3/config/os/gnu-linux/os_defines.h | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/libstdc++-v3/config/os/gnu-linux/os_defines.h 
b/libstdc++-v3/config/os/gnu-linux/os_defines.h
index d5bb2a1886e..3a053454195 100644
--- a/libstdc++-v3/config/os/gnu-linux/os_defines.h
+++ b/libstdc++-v3/config/os/gnu-linux/os_defines.h
@@ -61,4 +61,10 @@
   (__gthread_active_p() ? __gthread_self() : (__gthread_t)1)
 #endif
 
+#if __GLIBC_PREREQ(2, 34)
+// Since glibc 2.34 all pthreads functions are usable without linking to
+// libpthread.
+# define _GLIBCXX_GTHREAD_USE_WEAK 0
+#endif
+
 #endif
-- 
2.31.1


[committed] libstdc++: Fix test for libstdc++ not including <unistd.h> [PR100117]

2021-11-10 Thread Jonathan Wakely via Gcc-patches

Tested powerpc64le-linux, pushed to trunk.


The <cxxx> headers for the C library are not under our control, so we
can't prevent them from including <unistd.h>. Change the PR 49745 test
to only include the C++ library headers, not the <cxxx> ones.

To ensure <bits/stdc++.h> isn't included automatically we need to use
no_pch to disable PCH.

libstdc++-v3/ChangeLog:

PR libstdc++/100117
* testsuite/17_intro/headers/c++1998/49745.cc: Explicitly list
all C++ headers instead of including 
---
 .../17_intro/headers/c++1998/49745.cc | 113 +-
 1 file changed, 112 insertions(+), 1 deletion(-)

diff --git a/libstdc++-v3/testsuite/17_intro/headers/c++1998/49745.cc 
b/libstdc++-v3/testsuite/17_intro/headers/c++1998/49745.cc
index 204975e0316..f98fa3b4fb9 100644
--- a/libstdc++-v3/testsuite/17_intro/headers/c++1998/49745.cc
+++ b/libstdc++-v3/testsuite/17_intro/headers/c++1998/49745.cc
@@ -1,4 +1,5 @@
 // { dg-do compile { target *-*-linux* *-*-gnu* } }
+// { dg-add-options no_pch }
 
 // Copyright (C) 2011-2021 Free Software Foundation, Inc.
 //
@@ -18,7 +19,117 @@
 // .
 
 // libstdc++/49745
-#include 
+// error: 'int truncate' redeclared as different kind of symbol
+
+// This tests that no libstdc++ headers include .
+// However, as discussed in PR libstdc++/100117 we cannot guarantee that
+// no libc headers include it indirectly, so only include the C++ headers.
+// That means we can't use  because that includes  etc.
+// so we list the C++ headers explicitly here. This list is unfortunately
+// doomed to get out of date as new headers are added to the library.
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#if __cplusplus >= 201103L
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#endif
+
+#if __cplusplus >= 201402L
+#include 
+#endif
+
+#if __cplusplus >= 201703L
+#include 
+#include 
+// #include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#endif
+
+#if __cplusplus >= 202002L
+#include 
+#include 
+#include 
+#include 
+#if __cpp_impl_coroutine
+# include 
+#endif
+#if __has_include ()
+# include 
+#endif
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#endif
+
+#if __cplusplus > 202002L
+#if __has_include()
+# include 
+#endif
+#if __has_include()
+# include 
+#endif
+#endif
+
 int truncate = 0;
 
 // { dg-xfail-if "PR libstdc++/5" { c++20 } }
-- 
2.31.1

[PATCH]AArch64 Remove shuffle pattern for rounding variant.

2021-11-10 Thread Tamar Christina via Gcc-patches

Hi All,

This removes the patterns that optimize the rounding shift and narrow.
The optimization is valid only for the truncating shift and narrow;
for the rounding shift and narrow we need a different pattern that I will submit
separately.
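
To see why the shuffle is only valid for the truncating form, here is a small scalar model of one 16-bit lane narrowed by 8 bits. The `model_*` functions are hypothetical helpers for illustration, not the intrinsics themselves; the sketch assumes the rounding variant adds half of the discarded range before shifting and that `n` is between 1 and 8.

```c
#include <stdint.h>

/* Truncating shift-right-narrow of one 16-bit lane.  */
uint8_t
model_shrn_u16 (uint16_t a, unsigned n)
{
  return (uint8_t) (a >> n);
}

/* Rounding shift-right-narrow: add 2^(n-1) first, then shift and narrow.  */
uint8_t
model_rshrn_u16 (uint16_t a, unsigned n)
{
  return (uint8_t) (((uint32_t) a + (1u << (n - 1))) >> n);
}

/* What selecting the top byte of the lane (as a uzp2 on bytes would) yields.  */
uint8_t
model_top_byte (uint16_t a)
{
  return (uint8_t) (a >> 8);
}
```

For the lane value 0x00FF the top byte and the truncating result are both 0, while the rounding result is 1, so replacing the rounding pair with a plain shuffle changes the answer.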

This wasn't noticed before because the benchmarks did not run conformance
checks as part of the run. We now do, and with this patch they pass again.

Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.

Ok for master?

Thanks,
Tamar

gcc/ChangeLog:

* config/aarch64/aarch64-simd.md (*aarch64_topbits_shuffle_le
,*aarch64_topbits_shuffle_be): Remove.

gcc/testsuite/ChangeLog:

* gcc.target/aarch64/shrn-combine-8.c: Update.
* gcc.target/aarch64/shrn-combine-9.c: Update.

--- inline copy of patch -- 
diff --git a/gcc/config/aarch64/aarch64-simd.md 
b/gcc/config/aarch64/aarch64-simd.md
index 
bff76e4b6e97db2613ab0ce1d721bf1828f0671b..c71658e2bf52b26bf9fc9fa702dd5446447f4d43
 100644
--- a/gcc/config/aarch64/aarch64-simd.md
+++ b/gcc/config/aarch64/aarch64-simd.md
@@ -1887,22 +1887,6 @@ (define_insn "*aarch64_topbits_shuffle_le"
   [(set_attr "type" "neon_permute")]
 )
 
-(define_insn "*aarch64_topbits_shuffle_le"
-  [(set (match_operand: 0 "register_operand" "=w")
-   (vec_concat:
-  (unspec: [
-  (match_operand:VQN 1 "register_operand" "w")
- (match_operand:VQN 2 "aarch64_simd_shift_imm_vec_exact_top")
-] UNSPEC_RSHRN)
- (unspec: [
- (match_operand:VQN 3 "register_operand" "w")
- (match_dup 2)
-] UNSPEC_RSHRN)))]
-  "TARGET_SIMD && !BYTES_BIG_ENDIAN"
-  "uzp2\\t%0., %1., %3."
-  [(set_attr "type" "neon_permute")]
-)
-
 (define_insn "*aarch64_topbits_shuffle_be"
   [(set (match_operand: 0 "register_operand" "=w")
(vec_concat:
@@ -1917,22 +1901,6 @@ (define_insn "*aarch64_topbits_shuffle_be"
   [(set_attr "type" "neon_permute")]
 )
 
-(define_insn "*aarch64_topbits_shuffle_be"
-  [(set (match_operand: 0 "register_operand" "=w")
-   (vec_concat:
- (unspec: [
- (match_operand:VQN 3 "register_operand" "w")
- (match_operand:VQN 2 "aarch64_simd_shift_imm_vec_exact_top")
-] UNSPEC_RSHRN)
-  (unspec: [
-  (match_operand:VQN 1 "register_operand" "w")
- (match_dup 2)
-] UNSPEC_RSHRN)))]
-  "TARGET_SIMD && BYTES_BIG_ENDIAN"
-  "uzp2\\t%0., %1., %3."
-  [(set_attr "type" "neon_permute")]
-)
-
 (define_expand "aarch64_shrn"
   [(set (match_operand: 0 "register_operand")
(truncate:
diff --git a/gcc/testsuite/gcc.target/aarch64/shrn-combine-8.c 
b/gcc/testsuite/gcc.target/aarch64/shrn-combine-8.c
index 
6a47f3cdaee399e603c57a1c6a0c09c6cfd21abb..c93c179632156c07f05e6067e63804db35cc436b
 100644
--- a/gcc/testsuite/gcc.target/aarch64/shrn-combine-8.c
+++ b/gcc/testsuite/gcc.target/aarch64/shrn-combine-8.c
@@ -6,7 +6,7 @@
 
 uint8x16_t foo (uint16x8_t a, uint16x8_t b)
 {
-  return vrshrn_high_n_u16 (vrshrn_n_u16 (a, 8), b, 8);
+  return vshrn_high_n_u16 (vshrn_n_u16 (a, 8), b, 8);
 }
 
 /* { dg-final { scan-assembler-times {\tuzp2\t} 1 } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/shrn-combine-9.c 
b/gcc/testsuite/gcc.target/aarch64/shrn-combine-9.c
index 
929a55c5c338844e6a5c5ad249af482286ab9c61..bdb3c13e5a2f89d62b6a24c2abe3535656399cac
 100644
--- a/gcc/testsuite/gcc.target/aarch64/shrn-combine-9.c
+++ b/gcc/testsuite/gcc.target/aarch64/shrn-combine-9.c
@@ -6,7 +6,7 @@
 
 uint16x8_t foo (uint32x4_t a, uint32x4_t b)
 {
-  return vrshrn_high_n_u32 (vrshrn_n_u32 (a, 16), b, 16);
+  return vshrn_high_n_u32 (vshrn_n_u32 (a, 16), b, 16);
 }
 
 /* { dg-final { scan-assembler-times {\tuzp2\t} 1 } } */



Re: [PATCH v2] gimple-fold: Smarter optimization of _chk variants

2021-11-10 Thread Siddhesh Poyarekar


On 11/10/21 16:15, Siddhesh Poyarekar wrote:

Instead of comparing LEN and SIZE only if they are constants, use their
ranges to decide if LEN will always be lower than or same as SIZE.

This change ends up putting the stringop-overflow warning line number
against the strcpy implementation, so adjust the warning check to be
line number agnostic.
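
A minimal sketch of the kind of call this targets, assuming GCC or a compatible compiler that provides `__builtin___memcpy_chk` and `__builtin_object_size`. Here LEN is not a constant, but the clamp bounds its range to at most the object size, so a range-aware fold could in principle treat the check as always passing. `sketch_buf`, `sketch_copy_bounded` and `sketch_buffer` are hypothetical names.

```c
#include <stddef.h>

static char sketch_buf[16];

/* Copy at most sizeof sketch_buf bytes from SRC and return the number
   copied.  LEN is variable, but its range after the clamp is [0, 16],
   which never exceeds the destination object size.  */
size_t
sketch_copy_bounded (const char *src, size_t len)
{
  if (len > sizeof sketch_buf)
    len = sizeof sketch_buf;
  __builtin___memcpy_chk (sketch_buf, src, len,
			  __builtin_object_size (sketch_buf, 0));
  return len;
}

/* Accessor so callers can inspect the destination buffer.  */
const char *
sketch_buffer (void)
{
  return sketch_buf;
}
```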

gcc/ChangeLog:

* gimple-fold.c (known_safe): New function.
(gimple_fold_builtin_memory_chk, gimple_fold_builtin_stxcpy_chk,
gimple_fold_builtin_stxncpy_chk,
gimple_fold_builtin_snprintf_chk,
gimple_fold_builtin_sprintf_chk): Use it.

gcc/testsuite/ChangeLog:

* gcc.dg/Wobjsize-1.c: Make warning change line agnostic.
* gcc.dg/builtin-chk-fold.c: New test.

Signed-off-by: Siddhesh Poyarekar 
---
Changes from v1:
- Update comment that incorrectly said that known_safe emits a warning.
- Add tests for strncpy and snprintf too.


Sorry, this is failing some torture tests.  I'll fix up and send another 
version.


Siddhesh

Re: [PATCH] fixincludes: don't assume getcwd() can handle NULL argument

2021-11-10 Thread Xi Ruoyao via Gcc-patches

On Wed, 2021-11-10 at 00:02 +, Joseph Myers wrote:
> On Tue, 9 Nov 2021, Xi Ruoyao via Gcc-patches wrote:
> 
> > POSIX says:
> > 
> >     On some implementations, if buf is a null pointer, getcwd() may
> > obtain
> >     size bytes of memory using malloc(). In this case, the pointer
> > returned
> >     by getcwd() may be used as the argument in a subsequent call to
> > free().
> >     Invoking getcwd() with buf as a null pointer is not recommended
> > in
> >     conforming applications.
> > 
> > This produces an error building GCC with --enable-werror-always:
> > 
> >     ../../../fixincludes/fixincl.c: In function ‘process’:
> >     ../../../fixincludes/fixincl.c:1356:7: error: argument 1 is null
> > but
> >     the corresponding size argument 2 value is 4096 [-
> > Werror=nonnull]
> 
> Isn't this warning actually a glibc bug 
> ?

However we can't assume the libc we are using is Glibc.  Even if the
libc supports getcwd() with NULL argument, we are still leaking memory.
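
For illustration, a portable pattern that never passes NULL to getcwd() could look like the sketch below. This is not the code from the patch; `sketch_getcwd` is a hypothetical helper, and the starting size of 256 is an arbitrary choice. The caller owns the returned buffer and frees it.

```c
#include <errno.h>
#include <stdlib.h>
#include <unistd.h>

/* Return the current directory in a malloc'd buffer that the caller
   must free, or NULL on failure.  Grows the buffer while getcwd()
   reports ERANGE.  */
char *
sketch_getcwd (void)
{
  size_t size = 256;

  for (;;)
    {
      char *buf = malloc (size);
      if (buf == NULL)
	return NULL;
      if (getcwd (buf, size) != NULL)
	return buf;
      free (buf);
      if (errno != ERANGE)
	return NULL;
      size *= 2;
    }
}
```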

-- 
Xi Ruoyao 
School of Aerospace Science and Technology, Xidian University

Re: [PATCH] x86: Update -mtune=alderlake

2021-11-10 Thread Uros Bizjak via Gcc-patches

On Wed, Nov 10, 2021 at 10:09 AM Cui,Lili  wrote:
>
> Hi Uros,
>
> This patch is to update mtune for alderlake.
>
> Bootstrap is ok, and no regressions for i386/x86-64 testsuite.
>
> OK for master?
>
> Update mtune for alderlake, Alder Lake Intel Hybrid Technology will not 
> support
> Intel® AVX-512. ISA features such as Intel® AVX, AVX-VNNI, Intel® AVX2, and
> UMONITOR/UMWAIT/TPAUSE are supported.
>
> gcc/ChangeLog
>
> * config/i386/i386-options.c (m_CORE_AVX2): Remove Alderlake
> from m_CORE_AVX2.
> (processor_cost_table): Use alderlake_cost for Alderlake.
> * config/i386/i386.c (ix86_sched_init_global): Handle Alderlake.
> * config/i386/x86-tune-costs.h (struct processor_costs): Add alderlake
> cost.
> * config/i386/x86-tune-sched.c (ix86_issue_rate): Change Alderlake
> issue rate to 4.
> (ix86_adjust_cost): Handle Alderlake.
> * config/i386/x86-tune.def (X86_TUNE_SCHEDULE): Enable for Alderlake.
> (X86_TUNE_PARTIAL_REG_DEPENDENCY): Likewise.
> (X86_TUNE_SSE_PARTIAL_REG_DEPENDENCY): Likewise.
> (X86_TUNE_SSE_PARTIAL_REG_FP_CONVERTS_DEPENDENCY): Likewise.
> (X86_TUNE_SSE_PARTIAL_REG_CONVERTS_DEPENDENCY): Likewise.
> (X86_TUNE_MEMORY_MISMATCH_STALL): Likewise.
> (X86_TUNE_USE_LEAVE): Likewise.
> (X86_TUNE_PUSH_MEMORY): Likewise.
> (X86_TUNE_USE_INCDEC): Likewise.
> (X86_TUNE_INTEGER_DFMODE_MOVES): Likewise.
> (X86_TUNE_MISALIGNED_MOVE_STRING_PRO_EPILOGUES): Likewise.
> (X86_TUNE_USE_SAHF): Likewise.
> (X86_TUNE_USE_BT): Likewise.
> (X86_TUNE_AVOID_FALSE_DEP_FOR_BMI): Likewise.
> (X86_TUNE_ONE_IF_CONV_INSN): Likewise.
> (X86_TUNE_AVOID_MFENCE): Likewise.
> (X86_TUNE_USE_SIMODE_FIOP): Likewise.
> (X86_TUNE_EXT_80387_CONSTANTS): Likewise.
> (X86_TUNE_SSE_UNALIGNED_LOAD_OPTIMAL): Likewise.
> (X86_TUNE_SSE_UNALIGNED_STORE_OPTIMAL): Likewise.
> (X86_TUNE_SSE_TYPELESS_STORES): Likewise.
> (X86_TUNE_SSE_LOAD0_BY_PXOR): Likewise.
> (X86_TUNE_AVOID_4BYTE_PREFIXES): Likewise.
> (X86_TUNE_USE_GATHER): Disable for Alderlake.
> (X86_TUNE_AVX256_MOVE_BY_PIECES): Likewise.
> (X86_TUNE_AVX256_STORE_BY_PIECES): Likewise.

OK.

Thanks,
Uros.

> ---
>  gcc/config/i386/i386-options.c   |   4 +-
>  gcc/config/i386/i386.c   |   1 +
>  gcc/config/i386/x86-tune-costs.h | 120 +++
>  gcc/config/i386/x86-tune-sched.c |   2 +
>  gcc/config/i386/x86-tune.def |  58 +++
>  5 files changed, 155 insertions(+), 30 deletions(-)
>
> diff --git a/gcc/config/i386/i386-options.c b/gcc/config/i386/i386-options.c
> index e7a3bd4aaea..a8cc0664f11 100644
> --- a/gcc/config/i386/i386-options.c
> +++ b/gcc/config/i386/i386-options.c
> @@ -131,7 +131,7 @@ along with GCC; see the file COPYING3.  If not see
>| m_ICELAKE_CLIENT | m_ICELAKE_SERVER | m_CASCADELAKE \
>| m_TIGERLAKE | m_COOPERLAKE | m_SAPPHIRERAPIDS \
>| m_ROCKETLAKE)
> -#define m_CORE_AVX2 (m_HASWELL | m_SKYLAKE | m_ALDERLAKE | m_CORE_AVX512)
> +#define m_CORE_AVX2 (m_HASWELL | m_SKYLAKE | m_CORE_AVX512)
>  #define m_CORE_ALL (m_CORE2 | m_NEHALEM  | m_SANDYBRIDGE | m_CORE_AVX2)
>  #define m_GOLDMONT (HOST_WIDE_INT_1U<<PROCESSOR_GOLDMONT)
>  #define m_GOLDMONT_PLUS (HOST_WIDE_INT_1U<<PROCESSOR_GOLDMONT_PLUS)
> @@ -736,7 +736,7 @@ static const struct processor_costs 
> *processor_cost_table[] =
>&icelake_cost,
>&skylake_cost,
>&icelake_cost,
> -  &icelake_cost,
> +  &alderlake_cost,
>&icelake_cost,
>&intel_cost,
>&geode_cost,
> diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
> index e94efdf39fb..73c4d5115bb 100644
> --- a/gcc/config/i386/i386.c
> +++ b/gcc/config/i386/i386.c
> @@ -17014,6 +17014,7 @@ ix86_sched_init_global (FILE *, int, int)
>  case PROCESSOR_SANDYBRIDGE:
>  case PROCESSOR_HASWELL:
>  case PROCESSOR_TREMONT:
> +case PROCESSOR_ALDERLAKE:
>  case PROCESSOR_GENERIC:
>/* Do not perform multipass scheduling for pre-reload schedule
>   to save compile time.  */
> diff --git a/gcc/config/i386/x86-tune-costs.h 
> b/gcc/config/i386/x86-tune-costs.h
> index 93644be9cb3..dd5563d2e64 100644
> --- a/gcc/config/i386/x86-tune-costs.h
> +++ b/gcc/config/i386/x86-tune-costs.h
> @@ -2070,6 +2070,126 @@ struct processor_costs icelake_cost = {
>"16",/* Func alignment.  */
>  };
>
> +/* alderlake_cost should produce code tuned for alderlake family of CPUs.  */
> +static stringop_algs alderlake_memcpy[2] = {
> +  {libcall,
> +   {{256, rep_prefix_1_byte, true},
> +{256, loop, false},
> +{-1, libcall, false}}},
> +  {libcall,
> +   {{256, rep_prefix_1_byte, true},
> +{256, loop, false},
> +{-1, libcall, false}}}};
> +static stringop_algs alderlake_memset[2] = {
> +  {libcall,
> +   {{256, rep_prefix_1_byte, true},
> +

RE: [PATCH]AArch64 Remove shuffle pattern for rounding variant.

2021-11-10 Thread Kyrylo Tkachov via Gcc-patches



> -Original Message-
> From: Tamar Christina 
> Sent: Wednesday, November 10, 2021 12:09 PM
> To: gcc-patches@gcc.gnu.org
> Cc: nd ; Richard Earnshaw ;
> Marcus Shawcroft ; Kyrylo Tkachov
> ; Richard Sandiford
> 
> Subject: [PATCH]AArch64 Remove shuffle pattern for rounding variant.
> 
> Hi All,
> 
> This removed the patterns to optimize the rounding shift and narrow.
> The optimization is valid only for the truncating rounding shift and narrow,
> for the rounding shift and narrow we need a different pattern that I will
> submit
> separately.
> 
> This wasn't noticed before as the benchmarks did not run conformance as
> part of
> the run, which we now do and this now passes again.
> 
> Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.
> 
> Ok for master?

Ok.
Thanks,
Kyrill

> 
> Thanks,
> Tamar
> 
> gcc/ChangeLog:
> 
>   * config/aarch64/aarch64-simd.md
> (*aarch64_topbits_shuffle_le
>   ,*aarch64_topbits_shuffle_be): Remove.
> 
> gcc/testsuite/ChangeLog:
> 
>   * gcc.target/aarch64/shrn-combine-8.c: Update.
>   * gcc.target/aarch64/shrn-combine-9.c: Update.
> 
> --- inline copy of patch --
> diff --git a/gcc/config/aarch64/aarch64-simd.md
> b/gcc/config/aarch64/aarch64-simd.md
> index
> bff76e4b6e97db2613ab0ce1d721bf1828f0671b..c71658e2bf52b26bf9fc9fa70
> 2dd5446447f4d43 100644
> --- a/gcc/config/aarch64/aarch64-simd.md
> +++ b/gcc/config/aarch64/aarch64-simd.md
> @@ -1887,22 +1887,6 @@ (define_insn
> "*aarch64_topbits_shuffle_le"
>[(set_attr "type" "neon_permute")]
>  )
> 
> -(define_insn "*aarch64_topbits_shuffle_le"
> -  [(set (match_operand: 0 "register_operand" "=w")
> - (vec_concat:
> -  (unspec: [
> -  (match_operand:VQN 1 "register_operand" "w")
> -   (match_operand:VQN 2
> "aarch64_simd_shift_imm_vec_exact_top")
> -  ] UNSPEC_RSHRN)
> -   (unspec: [
> -   (match_operand:VQN 3 "register_operand" "w")
> -   (match_dup 2)
> -  ] UNSPEC_RSHRN)))]
> -  "TARGET_SIMD && !BYTES_BIG_ENDIAN"
> -  "uzp2\\t%0., %1., %3."
> -  [(set_attr "type" "neon_permute")]
> -)
> -
>  (define_insn "*aarch64_topbits_shuffle_be"
>[(set (match_operand: 0 "register_operand" "=w")
>   (vec_concat:
> @@ -1917,22 +1901,6 @@ (define_insn
> "*aarch64_topbits_shuffle_be"
>[(set_attr "type" "neon_permute")]
>  )
> 
> -(define_insn "*aarch64_topbits_shuffle_be"
> -  [(set (match_operand: 0 "register_operand" "=w")
> - (vec_concat:
> -   (unspec: [
> -   (match_operand:VQN 3 "register_operand" "w")
> -   (match_operand:VQN 2
> "aarch64_simd_shift_imm_vec_exact_top")
> -  ] UNSPEC_RSHRN)
> -  (unspec: [
> -  (match_operand:VQN 1 "register_operand" "w")
> -   (match_dup 2)
> -  ] UNSPEC_RSHRN)))]
> -  "TARGET_SIMD && BYTES_BIG_ENDIAN"
> -  "uzp2\\t%0., %1., %3."
> -  [(set_attr "type" "neon_permute")]
> -)
> -
>  (define_expand "aarch64_shrn"
>[(set (match_operand: 0 "register_operand")
>   (truncate:
> diff --git a/gcc/testsuite/gcc.target/aarch64/shrn-combine-8.c
> b/gcc/testsuite/gcc.target/aarch64/shrn-combine-8.c
> index
> 6a47f3cdaee399e603c57a1c6a0c09c6cfd21abb..c93c179632156c07f05e6067e
> 63804db35cc436b 100644
> --- a/gcc/testsuite/gcc.target/aarch64/shrn-combine-8.c
> +++ b/gcc/testsuite/gcc.target/aarch64/shrn-combine-8.c
> @@ -6,7 +6,7 @@
> 
>  uint8x16_t foo (uint16x8_t a, uint16x8_t b)
>  {
> -  return vrshrn_high_n_u16 (vrshrn_n_u16 (a, 8), b, 8);
> +  return vshrn_high_n_u16 (vshrn_n_u16 (a, 8), b, 8);
>  }
> 
>  /* { dg-final { scan-assembler-times {\tuzp2\t} 1 } } */
> diff --git a/gcc/testsuite/gcc.target/aarch64/shrn-combine-9.c
> b/gcc/testsuite/gcc.target/aarch64/shrn-combine-9.c
> index
> 929a55c5c338844e6a5c5ad249af482286ab9c61..bdb3c13e5a2f89d62b6a24c2
> abe3535656399cac 100644
> --- a/gcc/testsuite/gcc.target/aarch64/shrn-combine-9.c
> +++ b/gcc/testsuite/gcc.target/aarch64/shrn-combine-9.c
> @@ -6,7 +6,7 @@
> 
>  uint16x8_t foo (uint32x4_t a, uint32x4_t b)
>  {
> -  return vrshrn_high_n_u32 (vrshrn_n_u32 (a, 16), b, 16);
> +  return vshrn_high_n_u32 (vshrn_n_u32 (a, 16), b, 16);
>  }
> 
>  /* { dg-final { scan-assembler-times {\tuzp2\t} 1 } } */
> 
> 
> --

[committed] aarch64: Tweak FMAX/FMIN iterators

2021-11-10 Thread Richard Sandiford via Gcc-patches

There was some duplication between the maxmin_uns (uns for unspec
rather than unsigned) int attribute and the optab int attribute.
The difficulty for FMAXNM and FMINNM is that the instructions
really correspond to two things: the smax/smin optabs for floats
(used only for fast-math-like flags) and the fmax/fmin optabs
(used for built-in functions).  The optab attribute was
consistently for the former but maxmin_uns had a mixture of both.

This patch renames maxmin_uns to fmaxmin and only uses it
for the fmax and fmin optabs.  The reductions that previously
used the maxmin_uns attribute now use the optab attribute instead.

FMAX and FMIN are awkward in that they don't correspond to any
optab.  It's nevertheless useful to define them alongside the
“real” optabs.  Previously they were known as “smax_nan” and
“smin_nan”, but the problem with those names is that smax and
smin are only used for floats if NaNs don't matter.  This patch
therefore uses fmax_nan and fmin_nan instead.
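
The NaN distinction behind the naming can be shown with a scalar sketch. Both functions are hypothetical models covering only the NaN handling, not the instruction semantics in full (signed zeros are ignored here): the first ignores a single NaN operand, as C's fmax does, and the second propagates it.

```c
#include <math.h>

/* NaN-ignoring maximum, modelled on the C fmax behaviour:
   if exactly one operand is a NaN, return the other one.  */
double
sketch_max_ignore_nan (double a, double b)
{
  if (a != a)
    return b;
  if (b != b)
    return a;
  return a > b ? a : b;
}

/* NaN-propagating maximum: any NaN operand gives a NaN result.  */
double
sketch_max_propagate_nan (double a, double b)
{
  if (a != a || b != b)
    return NAN;
  return a > b ? a : b;
}
```

With one NaN operand and 1.0, the first returns 1.0 and the second returns a NaN; for ordinary operands the two agree.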

There is still some inconsistency, in that the optab attribute
handles UNSPEC_COND_FMAX but the fmaxmin attribute handles
UNSPEC_FMAX.  This is because the SVE FP instructions, being
predicated, have to use unspecs in cases where the Advanced
SIMD ones could use rtl codes.

At least there are no duplicate entries though, so this seemed
like the best compromise for now.

Tested on aarch64-linux-gnu & applied.

Richard


gcc/
* config/aarch64/iterators.md (optab): Use fmax_nan instead of
smax_nan and fmin_nan instead of smin_nan.
(maxmin_uns): Rename to...
(fmaxmin): ...this and make the same changes.  Remove entries
unrelated to fmax* and fmin*.
* config/aarch64/aarch64.md (3): Rename to...
(3): ...this.
* config/aarch64/aarch64-simd.md (aarch64_p):
Rename to...
(aarch64_p): ...this.
(3): Rename to...
(3): ...this.
(reduc__scal_): Rename to...
(reduc__scal_): ...this and update gen* call.
(aarch64_reduc__internal): Rename to...
(aarch64_reduc__internal): ...this.
(aarch64_reduc__internalv2si): Rename to...
(aarch64_reduc__internalv2si): ...this.
* config/aarch64/aarch64-sve.md (3): Rename to...
(3): ...this.
* config/aarch64/aarch64-simd-builtins.def (smax_nan, smin_nan)
Rename to...
(fmax_nan, fmin_nan): ...this.
* config/aarch64/arm_neon.h (vmax_f32, vmax_f64, vmaxq_f32, vmaxq_f64)
(vmin_f32, vmin_f64, vminq_f32, vminq_f64, vmax_f16, vmaxq_f16)
(vmin_f16, vminq_f16): Update accordingly.
---
 gcc/config/aarch64/aarch64-simd-builtins.def | 12 -
 gcc/config/aarch64/aarch64-simd.md   | 24 +-
 gcc/config/aarch64/aarch64-sve.md|  2 +-
 gcc/config/aarch64/aarch64.md|  2 +-
 gcc/config/aarch64/arm_neon.h| 24 +-
 gcc/config/aarch64/iterators.md  | 26 ++--
 6 files changed, 39 insertions(+), 51 deletions(-)

diff --git a/gcc/config/aarch64/aarch64-simd-builtins.def 
b/gcc/config/aarch64/aarch64-simd-builtins.def
index 4a7e2cf4125..9b0a6eceafe 100644
--- a/gcc/config/aarch64/aarch64-simd-builtins.def
+++ b/gcc/config/aarch64/aarch64-simd-builtins.def
@@ -502,21 +502,19 @@
   BUILTIN_VHSDF (UNOP, reduc_smax_nan_scal_, 10, NONE)
   BUILTIN_VHSDF (UNOP, reduc_smin_nan_scal_, 10, NONE)
 
-  /* Implemented by 3.
- smax variants map to fmaxnm,
- smax_nan variants map to fmax.  */
+  /* Implemented by 3.  */
   BUILTIN_VDQ_BHSI (BINOP, smax, 3, NONE)
   BUILTIN_VDQ_BHSI (BINOP, smin, 3, NONE)
   BUILTIN_VDQ_BHSI (BINOP, umax, 3, NONE)
   BUILTIN_VDQ_BHSI (BINOP, umin, 3, NONE)
-  BUILTIN_VHSDF_DF (BINOP, smax_nan, 3, NONE)
-  BUILTIN_VHSDF_DF (BINOP, smin_nan, 3, NONE)
 
-  /* Implemented by 3.  */
+  /* Implemented by 3.  */
   BUILTIN_VHSDF_HSDF (BINOP, fmax, 3, FP)
   BUILTIN_VHSDF_HSDF (BINOP, fmin, 3, FP)
+  BUILTIN_VHSDF_DF (BINOP, fmax_nan, 3, FP)
+  BUILTIN_VHSDF_DF (BINOP, fmin_nan, 3, FP)
 
-  /* Implemented by aarch64_p.  */
+  /* Implemented by aarch64_p.  */
   BUILTIN_VDQ_BHSI (BINOP, smaxp, 0, NONE)
   BUILTIN_VDQ_BHSI (BINOP, sminp, 0, NONE)
   BUILTIN_VDQ_BHSI (BINOP, umaxp, 0, NONE)
diff --git a/gcc/config/aarch64/aarch64-simd.md 
b/gcc/config/aarch64/aarch64-simd.md
index bff76e4b6e9..35d55a3e51e 100644
--- a/gcc/config/aarch64/aarch64-simd.md
+++ b/gcc/config/aarch64/aarch64-simd.md
@@ -1553,7 +1553,7 @@ (define_expand "v2di3"
 })
 
 ;; Pairwise Integer Max/Min operations.
-(define_insn "aarch64_p"
+(define_insn "aarch64_p"
  [(set (match_operand:VDQ_BHSI 0 "register_operand" "=w")
(unspec:VDQ_BHSI [(match_operand:VDQ_BHSI 1 "register_operand" "w")
 (match_operand:VDQ_BHSI 2 "register_operand" "w")]
@@ -1564,7 +1564,7 @@ (define_insn "aarch64_p"
 )
 
 ;; Pairwise FP Max/Min operations.
-(define_insn "aarch64_p"
+(define_insn "aarch64_p"
  [(set (match_operand:VHSDF 0 "reg

Re: [PATCH] lto-wrapper: fix memory corruption.

2021-11-10 Thread Martin Liška


On 11/10/21 12:31, Richard Biener wrote:

Is this also latent on branches?


No, I made the refactoring early in this stage 1 in
r12-741-g227a2ecf663d69972b851f51f1934d18927b62cd

Martin

Use modref summary to DSE calls to non-pure functions

2021-11-10 Thread Jan Hubicka via Gcc-patches

Hi,
this patch implements DSE using modref summaries: if a function has no side
effects besides storing to memory pointed to by its argument, and if we can
prove those stores to be dead, we can optimize the call out.  So we handle,
for example:

volatile int *ptr;
struct a {
int a,b,c;
} a;
__attribute__((noinline))
static int init (struct a*a)
{
a->a=0;
a->b=1;
}
__attribute__((noinline))
static int use (struct a*a)
{
if (a->c != 3)
*ptr=5;
}

void
main(void)
{
struct a a;
init (&a);
a.c=3;
use (&a);
}

and optimize out the call to init (&a).

We work quite hard to inline such constructors and this patch is only
effective if inlining did not happen (for whatever reason).  Still, we
optimize about 26 calls building tramp3d and about 70 calls during
bootstrap (mostly ctors of poly_int). During bootstrap most removal
happens early and we would inline the ctors unless we decide to optimize
for size. 1 call per cc1* binary is removed late during LTO build.

This is more frequent in codebases with higher abstraction penalty, with
-Os or with profile feedback in sections optimized for size. I also hope
we will be able to CSE such calls and that would make DSE more
important.
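
To make the condition above concrete, here is a minimal toy model of the
check.  It is not the code from the patch: byte_set, later_access and
call_stores_are_dead are made-up names, and the model assumes the object is a
non-escaping local, so bytes that are never read again are dead once it goes
out of scope.

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>

// Toy model only: one bit per byte of a small object, e.g.
// struct a { int a, b, c; } with 4-byte int (an assumption).
constexpr std::size_t object_bytes = 12;
using byte_set = std::bitset<object_bytes>;

// Return the set of bytes [offset, offset + size).
inline byte_set
byte_range (std::size_t offset, std::size_t size)
{
  byte_set set;
  for (std::size_t i = offset; i < offset + size && i < object_bytes; ++i)
    set.set (i);
  return set;
}

// One access to the object that happens after the call, in program order.
struct later_access
{
  bool is_store;
  byte_set bytes;
};

// Return true if none of the bytes stored by the call can be read before
// being overwritten.  STORED_BY_CALL plays the role of the callee summary.
bool
call_stores_are_dead (byte_set stored_by_call,
                      const later_access *accesses, std::size_t n)
{
  byte_set live = stored_by_call;
  for (std::size_t i = 0; i < n && live.any (); ++i)
    {
      if (accesses[i].is_store)
        live &= ~accesses[i].bytes;  // Overwritten bytes no longer matter.
      else if ((live & accesses[i].bytes).any ())
        return false;                // A byte written by the call is read.
    }
  // Bytes still tracked here are never read: the object is assumed to be
  // a local that dies at the end of the function.
  return true;
}
```

In this model the example corresponds to a store to bytes 8..11 (a.c=3)
followed by a load of the same bytes (the read in use), with init having
stored bytes 0..7.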

Bootstrapped/regtested x86_64-linux, OK?

gcc/ChangeLog:

* tree-ssa-alias.c (ao_ref_init_from_ptr_and_range): Export.
* tree-ssa-alias.h (ao_ref_init_from_ptr_and_range): Declare.
* tree-ssa-dse.c (dse_optimize_stmt): Rename to ...
(dse_optimize_store): ... this.
(dse_optimize_call): New function.
(pass_dse::execute): Use dse_optimize_call and update
call to dse_optimize_store.

gcc/testsuite/ChangeLog:

* gcc.dg/tree-ssa/modref-dse-1.c: New test.
* gcc.dg/tree-ssa/modref-dse-2.c: New test.

diff --git a/gcc/testsuite/gcc.dg/tree-ssa/modref-dse-1.c b/gcc/testsuite/gcc.dg/tree-ssa/modref-dse-1.c
new file mode 100644
index 000..e78693b349a
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/modref-dse-1.c
@@ -0,0 +1,28 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-dse1"  } */
+volatile int *ptr;
+struct a {
+   int a,b,c;
+} a;
+__attribute__((noinline))
+static int init (struct a*a)
+{
+   a->a=0;
+   a->b=1;
+}
+__attribute__((noinline))
+static int use (struct a*a)
+{
+   if (a->c != 3)
+   *ptr=5;
+}
+
+void
+main(void)
+{
+   struct a a;
+   init (&a);
+   a.c=3;
+   use (&a);
+}
+/* { dg-final { scan-tree-dump "Deleted dead store: init" "dse1" } } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/modref-dse-2.c b/gcc/testsuite/gcc.dg/tree-ssa/modref-dse-2.c
new file mode 100644
index 000..99c8ceb8127
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/modref-dse-2.c
@@ -0,0 +1,31 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-dse2 -fno-ipa-sra -fno-ipa-cp"  } */
+volatile int *ptr;
+struct a {
+   int a,b,c;
+} a;
+__attribute__((noinline))
+static int init (struct a*a)
+{
+   a->a=0;
+   a->b=1;
+   a->c=1;
+}
+__attribute__((noinline))
+static int use (struct a*a)
+{
+   if (a->c != 3)
+   *ptr=5;
+}
+
+void
+main(void)
+{
+   struct a a;
+   init (&a);
+   a.c=3;
+   use (&a);
+}
+/* Only DSE2 is tracking live bytes needed to figure out that store to c is
+   also dead above.  */
+/* { dg-final { scan-tree-dump "Deleted dead store: init" "dse2" } } */
diff --git a/gcc/tree-ssa-alias.c b/gcc/tree-ssa-alias.c
index eabf6805f2b..affb5d40d4b 100644
--- a/gcc/tree-ssa-alias.c
+++ b/gcc/tree-ssa-alias.c
@@ -782,7 +782,7 @@ ao_ref_alias_ptr_type (ao_ref *ref)
The access is assumed to be only to or after of the pointer target adjusted
by the offset, not before it (even in the case RANGE_KNOWN is false).  */
 
-static void
+void
 ao_ref_init_from_ptr_and_range (ao_ref *ref, tree ptr,
bool range_known,
poly_int64 offset,
diff --git a/gcc/tree-ssa-alias.h b/gcc/tree-ssa-alias.h
index 275dea10397..c2e28a74999 100644
--- a/gcc/tree-ssa-alias.h
+++ b/gcc/tree-ssa-alias.h
@@ -111,6 +111,8 @@ ao_ref::max_size_known_p () const
 /* In tree-ssa-alias.c  */
 extern void ao_ref_init (ao_ref *, tree);
 extern void ao_ref_init_from_ptr_and_size (ao_ref *, tree, tree);
+void ao_ref_init_from_ptr_and_range (ao_ref *, tree, bool,
+poly_int64, poly_int64, poly_int64);
 extern tree ao_ref_base (ao_ref *);
 extern alias_set_type ao_ref_alias_set (ao_ref *);
 extern alias_set_type ao_ref_base_alias_set (ao_ref *);
diff --git a/gcc/tree-ssa-dse.c b/gcc/tree-ssa-dse.c
index 27287fe88ee..1fec9100011 100644
--- a/gcc/tree-ssa-dse.c
+++ b/gcc/tree-ssa-dse.c
@@ -40,6 +40,9 @@ along with GCC; see the file COPYING3.  If not see
 #include "gimplify.h"
 #include "tree-eh.h"
 #include "cfganal.h"
+#include "cgraph.h"
+#include "ipa-modref-tree.h"
+#include "ipa-modref.h"
 
 /* This file implements dead store eli

[PATCH 1/5] Add IFN_COND_FMIN/FMAX functions

2021-11-10 Thread Richard Sandiford via Gcc-patches

This patch adds conditional forms of FMAX and FMIN, following
the pattern for existing conditional binary functions.
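
As a rough illustration of what "conditional form" means here, the sketch
below models the operation element by element: active lanes get the fmax
result and inactive lanes take a fallback value.  The name cond_fmax_model
and the array-based interface are assumptions made for the example, not part
of the patch.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>

// Element-wise model of a conditional fmax: where COND is true the result
// is fmax (a, b); elsewhere it is the corresponding FALLBACK element.
void
cond_fmax_model (const bool *cond, const double *a, const double *b,
                 const double *fallback, double *res, std::size_t n)
{
  for (std::size_t i = 0; i < n; ++i)
    res[i] = cond[i] ? std::fmax (a[i], b[i]) : fallback[i];
}
```

A conditional fmin would have the same shape with std::fmin in place of
std::fmax.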

Tested on aarch64-linux-gnu and x86_64-linux-gnu.  OK to install?

Richard


gcc/
* doc/md.texi (cond_fmin@var{mode}, cond_fmax@var{mode}): Document.
* optabs.def (cond_fmin_optab, cond_fmax_optab): New optabs.
* internal-fn.def (COND_FMIN, COND_FMAX): New functions.
* internal-fn.c (first_commutative_argument): Handle them.
(FOR_EACH_COND_FN_PAIR): Likewise.
* match.pd (UNCOND_BINARY, COND_BINARY): Likewise.
* config/aarch64/aarch64-sve.md (cond_): New
pattern.

gcc/testsuite/
* gcc.target/aarch64/sve/cond_fmaxnm_5.c: New test.
* gcc.target/aarch64/sve/cond_fmaxnm_5_run.c: Likewise.
* gcc.target/aarch64/sve/cond_fmaxnm_6.c: Likewise.
* gcc.target/aarch64/sve/cond_fmaxnm_6_run.c: Likewise.
* gcc.target/aarch64/sve/cond_fmaxnm_7.c: Likewise.
* gcc.target/aarch64/sve/cond_fmaxnm_7_run.c: Likewise.
* gcc.target/aarch64/sve/cond_fmaxnm_8.c: Likewise.
* gcc.target/aarch64/sve/cond_fmaxnm_8_run.c: Likewise.
* gcc.target/aarch64/sve/cond_fminnm_5.c: Likewise.
* gcc.target/aarch64/sve/cond_fminnm_5_run.c: Likewise.
* gcc.target/aarch64/sve/cond_fminnm_6.c: Likewise.
* gcc.target/aarch64/sve/cond_fminnm_6_run.c: Likewise.
* gcc.target/aarch64/sve/cond_fminnm_7.c: Likewise.
* gcc.target/aarch64/sve/cond_fminnm_7_run.c: Likewise.
* gcc.target/aarch64/sve/cond_fminnm_8.c: Likewise.
* gcc.target/aarch64/sve/cond_fminnm_8_run.c: Likewise.
---
 gcc/config/aarch64/aarch64-sve.md | 19 +++-
 gcc/doc/md.texi   |  4 +++
 gcc/internal-fn.c |  4 +++
 gcc/internal-fn.def   |  2 ++
 gcc/match.pd  |  2 ++
 gcc/optabs.def|  2 ++
 .../gcc.target/aarch64/sve/cond_fmaxnm_5.c| 28 ++
 .../aarch64/sve/cond_fmaxnm_5_run.c   |  4 +++
 .../gcc.target/aarch64/sve/cond_fmaxnm_6.c| 22 ++
 .../aarch64/sve/cond_fmaxnm_6_run.c   |  4 +++
 .../gcc.target/aarch64/sve/cond_fmaxnm_7.c| 27 +
 .../aarch64/sve/cond_fmaxnm_7_run.c   |  4 +++
 .../gcc.target/aarch64/sve/cond_fmaxnm_8.c| 26 +
 .../aarch64/sve/cond_fmaxnm_8_run.c   |  4 +++
 .../gcc.target/aarch64/sve/cond_fminnm_5.c| 29 +++
 .../aarch64/sve/cond_fminnm_5_run.c   |  4 +++
 .../gcc.target/aarch64/sve/cond_fminnm_6.c| 23 +++
 .../aarch64/sve/cond_fminnm_6_run.c   |  4 +++
 .../gcc.target/aarch64/sve/cond_fminnm_7.c| 28 ++
 .../aarch64/sve/cond_fminnm_7_run.c   |  4 +++
 .../gcc.target/aarch64/sve/cond_fminnm_8.c| 27 +
 .../aarch64/sve/cond_fminnm_8_run.c   |  4 +++
 22 files changed, 274 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/cond_fmaxnm_5.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/cond_fmaxnm_5_run.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/cond_fmaxnm_6.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/cond_fmaxnm_6_run.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/cond_fmaxnm_7.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/cond_fmaxnm_7_run.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/cond_fmaxnm_8.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/cond_fmaxnm_8_run.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/cond_fminnm_5.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/cond_fminnm_5_run.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/cond_fminnm_6.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/cond_fminnm_6_run.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/cond_fminnm_7.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/cond_fminnm_7_run.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/cond_fminnm_8.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/cond_fminnm_8_run.c

diff --git a/gcc/config/aarch64/aarch64-sve.md b/gcc/config/aarch64/aarch64-sve.md
index 5de479e141a..0f5bf5ea8cb 100644
--- a/gcc/config/aarch64/aarch64-sve.md
+++ b/gcc/config/aarch64/aarch64-sve.md
@@ -6287,7 +6287,7 @@ (define_expand "xorsign3"
 ;; -
 
 ;; Unpredicated fmax/fmin (the libm functions).  The optabs for the
-;; smin/smax rtx codes are handled in the generic section above.
+;; smax/smin rtx codes are handled in the generic section above.
 (define_expand "3"
   [(set (match_operand:SVE_FULL_F 0 "register_operand")
(unspec:SVE_FULL_F
@@ -6302,6 +6302,23 @@ (define_expand "3"
   }
 )
 
+;; Predicated fmax/fmin (the libm functions).  T

[PATCH 2/5] gimple-match: Add a gimple_extract_op function

2021-11-10 Thread Richard Sandiford via Gcc-patches

code_helper and gimple_match_op seem like generally useful ways
of summing up a gimple_assign or gimple_call (or gimple_cond).
This patch adds a gimple_extract_op function that can be used
for that.

Tested on aarch64-linux-gnu and x86_64-linux-gnu.  OK to install?

Richard


gcc/
* gimple-match.h (gimple_extract_op): Declare.
* gimple-match-head.c (gimple_extract): New function, extracted from...
(gimple_simplify): ...here.
(gimple_extract_op): New function.
---
 gcc/gimple-match-head.c | 261 +++-
 gcc/gimple-match.h  |   1 +
 2 files changed, 149 insertions(+), 113 deletions(-)

diff --git a/gcc/gimple-match-head.c b/gcc/gimple-match-head.c
index 9d88b2f8551..4c6e0883ba4 100644
--- a/gcc/gimple-match-head.c
+++ b/gcc/gimple-match-head.c
@@ -890,12 +890,29 @@ try_conditional_simplification (internal_fn ifn, gimple_match_op *res_op,
   return true;
 }
 
-/* The main STMT based simplification entry.  It is used by the fold_stmt
-   and the fold_stmt_to_constant APIs.  */
+/* Common subroutine of gimple_extract_op and gimple_simplify.  Try to
+   describe STMT in RES_OP.  Return:
 
-bool
-gimple_simplify (gimple *stmt, gimple_match_op *res_op, gimple_seq *seq,
-tree (*valueize)(tree), tree (*top_valueize)(tree))
+   - -1 if extraction failed
+   - otherwise, 0 if no simplification should take place
+   - otherwise, the number of operands for a GIMPLE_ASSIGN or GIMPLE_COND
+   - otherwise, -2 for a GIMPLE_CALL
+
+   Before recording an operand, call:
+
+   - VALUEIZE_CONDITION for a COND_EXPR condition
+   - VALUEIZE_NAME if the rhs of a GIMPLE_ASSIGN is an SSA_NAME
+   - VALUEIZE_OP for every other top-level operand
+
+   Each routine takes a tree argument and returns a tree.  */
+
+template<typename ValueizeOp, typename ValueizeCondition,
+        typename ValueizeName>
+inline int
+gimple_extract (gimple *stmt, gimple_match_op *res_op,
+   ValueizeOp valueize_op,
+   ValueizeCondition valueize_condition,
+   ValueizeName valueize_name)
 {
   switch (gimple_code (stmt))
 {
@@ -911,100 +928,53 @@ gimple_simplify (gimple *stmt, gimple_match_op *res_op, gimple_seq *seq,
|| code == VIEW_CONVERT_EXPR)
  {
tree op0 = TREE_OPERAND (gimple_assign_rhs1 (stmt), 0);
-   bool valueized = false;
-   op0 = do_valueize (op0, top_valueize, valueized);
-   res_op->set_op (code, type, op0);
-   return (gimple_resimplify1 (seq, res_op, valueize)
-   || valueized);
+   res_op->set_op (code, type, valueize_op (op0));
+   return 1;
  }
else if (code == BIT_FIELD_REF)
  {
tree rhs1 = gimple_assign_rhs1 (stmt);
-   tree op0 = TREE_OPERAND (rhs1, 0);
-   bool valueized = false;
-   op0 = do_valueize (op0, top_valueize, valueized);
+   tree op0 = valueize_op (TREE_OPERAND (rhs1, 0));
res_op->set_op (code, type, op0,
TREE_OPERAND (rhs1, 1),
TREE_OPERAND (rhs1, 2),
REF_REVERSE_STORAGE_ORDER (rhs1));
-   if (res_op->reverse)
- return valueized;
-   return (gimple_resimplify3 (seq, res_op, valueize)
-   || valueized);
+   return res_op->reverse ? 0 : 3;
  }
-   else if (code == SSA_NAME
-&& top_valueize)
+   else if (code == SSA_NAME)
  {
tree op0 = gimple_assign_rhs1 (stmt);
-   tree valueized = top_valueize (op0);
+   tree valueized = valueize_name (op0);
if (!valueized || op0 == valueized)
- return false;
+ return -1;
res_op->set_op (TREE_CODE (op0), type, valueized);
-   return true;
+   return 0;
  }
break;
  case GIMPLE_UNARY_RHS:
{
  tree rhs1 = gimple_assign_rhs1 (stmt);
- bool valueized = false;
- rhs1 = do_valueize (rhs1, top_valueize, valueized);
- res_op->set_op (code, type, rhs1);
- return (gimple_resimplify1 (seq, res_op, valueize)
- || valueized);
+ res_op->set_op (code, type, valueize_op (rhs1));
+ return 1;
}
  case GIMPLE_BINARY_RHS:
{
- tree rhs1 = gimple_assign_rhs1 (stmt);
- tree rhs2 = gimple_assign_rhs2 (stmt);
- bool valueized = false;
- rhs1 = do_valueize (rhs1, top_valueize, valueized);
- rhs2 = do_valueize (rhs2, top_valueize, valueized);
+ tree rhs1 = valueize_op (gimple_assign_rhs1 (stmt));
+ tree rhs2 = valueize_op (gimple_assign_rhs2 (stmt));
  res_op->set_op (code, type, rhs1, rhs2);
-

[PATCH 3/5] gimple-match: Make code_helper conversions explicit

2021-11-10 Thread Richard Sandiford via Gcc-patches

code_helper provides conversions to tree_code and combined_fn.
Now that the codebase is C++11, we can mark these conversions as
explicit.  This avoids accidentally using code_helpers with
functions that take tree_codes, which would previously entail
a hidden unchecked conversion.
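
The effect can be sketched with a stand-alone class.  Everything below
(code_helper_sketch, tree_code_sketch, fn_code_sketch and the integer
encoding) is made up for illustration and is not the definition in
gimple-match.h; the point is only that, with explicit conversion operators,
a call such as is_additive (code) no longer compiles and the conversion has
to be written out.

```cpp
#include <cassert>

// Stand-ins for the two kinds of code (hypothetical names).
enum tree_code_sketch { PLUS_SKETCH, MINUS_SKETCH, MULT_SKETCH };
enum class fn_code_sketch { FMAX_SKETCH, FMIN_SKETCH };

class code_helper_sketch
{
public:
  // Assumed encoding: tree codes are >= 0, function codes are < 0.
  code_helper_sketch (tree_code_sketch code) : rep (static_cast<int> (code)) {}
  code_helper_sketch (fn_code_sketch fn) : rep (-static_cast<int> (fn) - 1) {}

  bool is_tree_code () const { return rep >= 0; }
  bool is_fn_code () const { return rep < 0; }

  // Explicit: callers must spell out the conversion they want.
  explicit operator tree_code_sketch () const
  { return static_cast<tree_code_sketch> (rep); }
  explicit operator fn_code_sketch () const
  { return static_cast<fn_code_sketch> (-rep - 1); }

  // With the conversions explicit, comparisons need their own overloads.
  bool operator== (const code_helper_sketch &other) const
  { return rep == other.rep; }
  bool operator!= (const code_helper_sketch &other) const
  { return rep != other.rep; }

private:
  int rep;
};

// A function that only makes sense for tree codes.
bool
is_additive (tree_code_sketch code)
{
  return code == PLUS_SKETCH || code == MINUS_SKETCH;
}

bool
helper_is_additive (code_helper_sketch code)
{
  // "is_additive (code)" would be rejected here; check the kind first and
  // then convert explicitly.
  return code.is_tree_code () && is_additive (tree_code_sketch (code));
}
```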

Tested on aarch64-linux-gnu and x86_64-linux-gnu.  OK to install?

Richard


gcc/
* gimple-match.h (code_helper): Provide == and != overloads.
(code_helper::operator tree_code): Make explicit.
(code_helper::operator combined_fn): Likewise.
* gimple-match-head.c (convert_conditional_op): Use explicit
conversions where necessary.
(gimple_resimplify1, gimple_resimplify2, gimple_resimplify3): Likewise.
(maybe_push_res_to_seq, gimple_simplify): Likewise.
* gimple-fold.c (replace_stmt_with_simplification): Likewise.
---
 gcc/gimple-fold.c   | 18 ---
 gcc/gimple-match-head.c | 51 ++---
 gcc/gimple-match.h  |  9 ++--
 3 files changed, 45 insertions(+), 33 deletions(-)

diff --git a/gcc/gimple-fold.c b/gcc/gimple-fold.c
index 6e25a7c05db..9daf2cc590c 100644
--- a/gcc/gimple-fold.c
+++ b/gcc/gimple-fold.c
@@ -5828,18 +5828,19 @@ replace_stmt_with_simplification (gimple_stmt_iterator *gsi,
   if (gcond *cond_stmt = dyn_cast <gcond *> (stmt))
 {
   gcc_assert (res_op->code.is_tree_code ());
-  if (TREE_CODE_CLASS ((enum tree_code) res_op->code) == tcc_comparison
+  auto code = tree_code (res_op->code);
+  if (TREE_CODE_CLASS (code) == tcc_comparison
  /* GIMPLE_CONDs condition may not throw.  */
  && (!flag_exceptions
  || !cfun->can_throw_non_call_exceptions
- || !operation_could_trap_p (res_op->code,
+ || !operation_could_trap_p (code,
  FLOAT_TYPE_P (TREE_TYPE (ops[0])),
  false, NULL_TREE)))
-   gimple_cond_set_condition (cond_stmt, res_op->code, ops[0], ops[1]);
-  else if (res_op->code == SSA_NAME)
+   gimple_cond_set_condition (cond_stmt, code, ops[0], ops[1]);
+  else if (code == SSA_NAME)
gimple_cond_set_condition (cond_stmt, NE_EXPR, ops[0],
   build_zero_cst (TREE_TYPE (ops[0])));
-  else if (res_op->code == INTEGER_CST)
+  else if (code == INTEGER_CST)
{
  if (integer_zerop (ops[0]))
gimple_cond_make_false (cond_stmt);
@@ -5870,11 +5871,12 @@ replace_stmt_with_simplification (gimple_stmt_iterator *gsi,
   else if (is_gimple_assign (stmt)
   && res_op->code.is_tree_code ())
 {
+  auto code = tree_code (res_op->code);
   if (!inplace
- || gimple_num_ops (stmt) > get_gimple_rhs_num_ops (res_op->code))
+ || gimple_num_ops (stmt) > get_gimple_rhs_num_ops (code))
{
  maybe_build_generic_op (res_op);
- gimple_assign_set_rhs_with_ops (gsi, res_op->code,
+ gimple_assign_set_rhs_with_ops (gsi, code,
  res_op->op_or_null (0),
  res_op->op_or_null (1),
  res_op->op_or_null (2));
@@ -5891,7 +5893,7 @@ replace_stmt_with_simplification (gimple_stmt_iterator *gsi,
}
 }
   else if (res_op->code.is_fn_code ()
-  && gimple_call_combined_fn (stmt) == res_op->code)
+  && gimple_call_combined_fn (stmt) == combined_fn (res_op->code))
 {
   gcc_assert (num_ops == gimple_call_num_args (stmt));
   for (unsigned int i = 0; i < num_ops; ++i)
diff --git a/gcc/gimple-match-head.c b/gcc/gimple-match-head.c
index 4c6e0883ba4..d4d7d767075 100644
--- a/gcc/gimple-match-head.c
+++ b/gcc/gimple-match-head.c
@@ -96,7 +96,7 @@ convert_conditional_op (gimple_match_op *orig_op,
 ifn = get_conditional_internal_fn ((tree_code) orig_op->code);
   else
 {
-  combined_fn cfn = orig_op->code;
+  auto cfn = combined_fn (orig_op->code);
   if (!internal_fn_p (cfn))
return false;
   ifn = get_conditional_internal_fn (as_internal_fn (cfn));
@@ -206,10 +206,10 @@ gimple_resimplify1 (gimple_seq *seq, gimple_match_op *res_op,
   tree tem = NULL_TREE;
   if (res_op->code.is_tree_code ())
{
- tree_code code = res_op->code;
+ auto code = tree_code (res_op->code);
  if (IS_EXPR_CODE_CLASS (TREE_CODE_CLASS (code))
  && TREE_CODE_LENGTH (code) == 1)
-   tem = const_unop (res_op->code, res_op->type, res_op->ops[0]);
+   tem = const_unop (code, res_op->type, res_op->ops[0]);
}
   else
tem = fold_const_call (combined_fn (res_op->code), res_op->type,
@@ -272,10 +272,10 @@ gimple_resimplify2 (gimple_seq *seq, gimple_match_op *res_op,
   tree tem = NULL_TREE;
   if (res_op->code.is_tree_code ())
{
- tree_code code = res_op->code;
+ auto code = tree_code (res_op->c

[PATCH 4/5] vect: Make reduction code handle calls

2021-11-10 Thread Richard Sandiford via Gcc-patches

This patch extends the reduction code to handle calls.  So far
it's a structural change only; a later patch adds support for
specific function reductions.

Most of the patch consists of using code_helper and gimple_match_op
to describe the reduction operations.  The other main change is that
vectorizable_call now needs to handle fully-predicated reductions.

Tested on aarch64-linux-gnu and x86_64-linux-gnu.  OK to install?

Richard


gcc/
* builtins.h (associated_internal_fn): Declare overload that
takes a (combined_fn, return type) pair.
* builtins.c (associated_internal_fn): Split new overload out
of original fndecl version.  Also provide an overload that takes
a (combined_fn, return type) pair.
* internal-fn.h (commutative_binary_fn_p): Declare.
(associative_binary_fn_p): Likewise.
* internal-fn.c (commutative_binary_fn_p): New function,
split out from...
(first_commutative_argument): ...here.
(associative_binary_fn_p): New function.
* gimple-match.h (code_helper): Add a constructor that takes
internal functions.
(commutative_binary_op_p): Declare.
(associative_binary_op_p): Likewise.
(canonicalize_code): Likewise.
(directly_supported_p): Likewise.
(get_conditional_internal_fn): Likewise.
(gimple_build): New overload that takes a code_helper.
* gimple-fold.c (gimple_build): Likewise.
* gimple-match-head.c (commutative_binary_op_p): New function.
(associative_binary_op_p): Likewise.
(canonicalize_code): Likewise.
(directly_supported_p): Likewise.
(get_conditional_internal_fn): Likewise.
* tree-vectorizer.h: Include gimple-match.h.
(neutral_op_for_reduction): Take a code_helper instead of a tree_code.
(needs_fold_left_reduction_p): Likewise.
(reduction_fn_for_scalar_code): Likewise.
(vect_can_vectorize_without_simd_p): Declare a new overload that takes
a code_helper.
* tree-vect-loop.c: Include case-cfn-macros.h.
(fold_left_reduction_fn): Take a code_helper instead of a tree_code.
(reduction_fn_for_scalar_code): Likewise.
(neutral_op_for_reduction): Likewise.
(needs_fold_left_reduction_p): Likewise.
(use_mask_by_cond_expr_p): Likewise.
(build_vect_cond_expr): Likewise.
(vect_create_partial_epilog): Likewise.  Use gimple_build rather
than gimple_build_assign.
(check_reduction_path): Handle calls and operate on code_helpers
rather than tree_codes.
(vect_is_simple_reduction): Likewise.
(vect_model_reduction_cost): Likewise.
(vect_find_reusable_accumulator): Likewise.
(vect_create_epilog_for_reduction): Likewise.
(vect_transform_cycle_phi): Likewise.
(vectorizable_reduction): Likewise.  Make more use of
lane_reduc_code_p.
(vect_transform_reduction): Use gimple_extract_op but expect
a tree_code for now.
(vect_can_vectorize_without_simd_p): New overload that takes
a code_helper.
* tree-vect-stmts.c (vectorizable_call): Handle reductions in
fully-masked loops.
* tree-vect-patterns.c (vect_mark_pattern_stmts): Use
gimple_extract_op when updating STMT_VINFO_REDUC_IDX.
---
 gcc/builtins.c   |  46 -
 gcc/builtins.h   |   1 +
 gcc/gimple-fold.c|   9 +
 gcc/gimple-match-head.c  |  70 +++
 gcc/gimple-match.h   |  20 ++
 gcc/internal-fn.c|  46 -
 gcc/internal-fn.h|   2 +
 gcc/tree-vect-loop.c | 420 +++
 gcc/tree-vect-patterns.c |  23 ++-
 gcc/tree-vect-stmts.c|  66 --
 gcc/tree-vectorizer.h|  10 +-
 11 files changed, 455 insertions(+), 258 deletions(-)

diff --git a/gcc/builtins.c b/gcc/builtins.c
index 384864bfb3a..03829c03a5a 100644
--- a/gcc/builtins.c
+++ b/gcc/builtins.c
@@ -2139,17 +2139,17 @@ mathfn_built_in_type (combined_fn fn)
 #undef SEQ_OF_CASE_MATHFN
 }
 
-/* If BUILT_IN_NORMAL function FNDECL has an associated internal function,
-   return its code, otherwise return IFN_LAST.  Note that this function
-   only tests whether the function is defined in internals.def, not whether
-   it is actually available on the target.  */
+/* Check whether there is an internal function associated with function FN
+   and return type RETURN_TYPE.  Return the function if so, otherwise return
+   IFN_LAST.
 
-internal_fn
-associated_internal_fn (tree fndecl)
+   Note that this function only tests whether the function is defined in
+   internals.def, not whether it is actually available on the target.  */
+
+static internal_fn
+associated_internal_fn (built_in_function fn, tree return_type)
 {
-  gcc_checking_assert (DECL_BUILT_IN_CLASS (fndecl) == BUILT_IN_NORMAL);
-  tree return_type = TREE_TYPE (TREE_TYPE (fndecl));
-  switch (DECL_FUNCTION_CODE (fndecl))
+  switch (

[PATCH 5/5] vect: Add support for fmax and fmin reductions

2021-11-10 Thread Richard Sandiford via Gcc-patches

This patch adds support for reductions involving calls to fmax*()
and fmin*(), without the -ffast-math flags that allow them to be
converted to MAX_EXPR and MIN_EXPR.
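
For reference, the kind of loop this is about looks like the sketch below
(fmax_reduce and its interface are illustrative names, not taken from the
new tests).  Because fmax returns the non-NaN operand when exactly one
operand is a NaN, the loop cannot simply be treated as a MAX_EXPR reduction
without -ffast-math.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>

// Scalar fmax reduction: fold every element of X into RES with fmax,
// starting from INIT.
double
fmax_reduce (const double *x, std::size_t n, double init)
{
  double res = init;
  for (std::size_t i = 0; i < n; ++i)
    res = std::fmax (res, x[i]);
  return res;
}
```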

Tested on aarch64-linux-gnu and x86_64-linux-gnu.  OK to install?

Richard


gcc/
* doc/md.texi (reduc_fmin_scal_@var{m}): Document.
(reduc_fmax_scal_@var{m}): Likewise.
* optabs.def (reduc_fmax_scal_optab): New optab.
(reduc_fmin_scal_optab): Likewise.
* internal-fn.def (REDUC_FMAX, REDUC_FMIN): New functions.
* tree-vect-loop.c (reduction_fn_for_scalar_code): Handle
CASE_CFN_FMAX and CASE_CFN_FMIN.
(neutral_op_for_reduction): Likewise.
(needs_fold_left_reduction_p): Likewise.
* config/aarch64/iterators.md (FMAXMINV): New iterator.
(fmaxmin): Handle UNSPEC_FMAXNMV and UNSPEC_FMINNMV.
* config/aarch64/aarch64-simd.md (reduc__scal_): Fix
unspec mode.
(reduc__scal_): New pattern.
* config/aarch64/aarch64-sve.md (reduc__scal_):
Likewise.

gcc/testsuite/
* gcc.dg/vect/vect-fmax-1.c: New test.
* gcc.dg/vect/vect-fmax-2.c: Likewise.
* gcc.dg/vect/vect-fmax-3.c: Likewise.
* gcc.dg/vect/vect-fmin-1.c: New test.
* gcc.dg/vect/vect-fmin-2.c: Likewise.
* gcc.dg/vect/vect-fmin-3.c: Likewise.
* gcc.target/aarch64/fmaxnm_1.c: Likewise.
* gcc.target/aarch64/fmaxnm_2.c: Likewise.
* gcc.target/aarch64/fminnm_1.c: Likewise.
* gcc.target/aarch64/fminnm_2.c: Likewise.
* gcc.target/aarch64/sve/fmaxnm_1.c: Likewise.
* gcc.target/aarch64/sve/fmaxnm_2.c: Likewise.
* gcc.target/aarch64/sve/fminnm_1.c: Likewise.
* gcc.target/aarch64/sve/fminnm_2.c: Likewise.
---
 gcc/config/aarch64/aarch64-simd.md| 15 +++-
 gcc/config/aarch64/aarch64-sve.md | 11 +++
 gcc/config/aarch64/iterators.md   |  4 +
 gcc/doc/md.texi   |  8 ++
 gcc/internal-fn.def   |  4 +
 gcc/optabs.def|  2 +
 gcc/testsuite/gcc.dg/vect/vect-fmax-1.c   | 83 ++
 gcc/testsuite/gcc.dg/vect/vect-fmax-2.c   |  7 ++
 gcc/testsuite/gcc.dg/vect/vect-fmax-3.c   | 83 ++
 gcc/testsuite/gcc.dg/vect/vect-fmin-1.c   | 86 +++
 gcc/testsuite/gcc.dg/vect/vect-fmin-2.c   |  9 ++
 gcc/testsuite/gcc.dg/vect/vect-fmin-3.c   | 83 ++
 gcc/testsuite/gcc.target/aarch64/fmaxnm_1.c   | 24 ++
 gcc/testsuite/gcc.target/aarch64/fmaxnm_2.c   | 20 +
 gcc/testsuite/gcc.target/aarch64/fminnm_1.c   | 24 ++
 gcc/testsuite/gcc.target/aarch64/fminnm_2.c   | 20 +
 .../gcc.target/aarch64/sve/fmaxnm_2.c | 22 +
 .../gcc.target/aarch64/sve/fmaxnm_3.c | 18 
 .../gcc.target/aarch64/sve/fminnm_2.c | 22 +
 .../gcc.target/aarch64/sve/fminnm_3.c | 18 
 gcc/tree-vect-loop.c  | 45 --
 21 files changed, 599 insertions(+), 9 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/vect/vect-fmax-1.c
 create mode 100644 gcc/testsuite/gcc.dg/vect/vect-fmax-2.c
 create mode 100644 gcc/testsuite/gcc.dg/vect/vect-fmax-3.c
 create mode 100644 gcc/testsuite/gcc.dg/vect/vect-fmin-1.c
 create mode 100644 gcc/testsuite/gcc.dg/vect/vect-fmin-2.c
 create mode 100644 gcc/testsuite/gcc.dg/vect/vect-fmin-3.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/fmaxnm_1.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/fmaxnm_2.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/fminnm_1.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/fminnm_2.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/fmaxnm_2.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/fmaxnm_3.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/fminnm_2.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/fminnm_3.c

diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md
index 35d55a3e51e..8e7d783f7f3 100644
--- a/gcc/config/aarch64/aarch64-simd.md
+++ b/gcc/config/aarch64/aarch64-simd.md
@@ -3624,8 +3624,8 @@ (define_insn "popcount2"
 ;; gimple_fold'd to the IFN_REDUC_(MAX|MIN) function.  (This is FP smax/smin).
 (define_expand "reduc__scal_"
   [(match_operand: 0 "register_operand")
-   (unspec:VHSDF [(match_operand:VHSDF 1 "register_operand")]
- FMAXMINV)]
+   (unspec: [(match_operand:VHSDF 1 "register_operand")]
+FMAXMINV)]
   "TARGET_SIMD"
   {
 rtx elt = aarch64_endian_lane_rtx (mode, 0);
@@ -3637,6 +3637,17 @@ (define_expand "reduc__scal_"
   }
 )
 
+(define_expand "reduc__scal_"
+  [(match_operand: 0 "register_operand")
+   (unspec: [(match_operand:VHSDF 1 "register_operand")]
+FMAXMINNMV)]
+  "TARGET_SIMD"
+  {
+emit_insn (gen_reduc__scal_ (operands[0], operands[1]));
+DONE;
+  }
+)
+
 ;; Likewise for integer cases, signed and unsigned.

[PATCH 1/4] Canonicalize argument order for commutative functions

2021-11-10 Thread Richard Sandiford via Gcc-patches

This patch uses information about internal functions to canonicalize
the argument order of calls.
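
The idea can be sketched with a toy representation of calls.  The types and
the "smaller id first" ordering below are assumptions made for the example;
the patch itself relies on first_commutative_argument and
tree_swap_operands_p.  Once both calls use the same argument order they
compare equal, which is what lets later passes treat f (x, y) and f (y, x)
as the same computation.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Toy call: a function name plus two operands identified by integer ids.
struct call_sketch
{
  std::string fn;
  int arg0;
  int arg1;
};

// Hypothetical table of commutative functions.
bool
commutative_p (const std::string &fn)
{
  return fn == "fmax" || fn == "fmin";
}

// Put the operands of a commutative call into a fixed order (smaller id
// first).  Return true if the call was changed.
bool
canonicalize_args (call_sketch &call)
{
  if (commutative_p (call.fn) && call.arg0 > call.arg1)
    {
      std::swap (call.arg0, call.arg1);
      return true;
    }
  return false;
}

// Two calls are the same computation if name and operands match exactly.
bool
same_call_p (const call_sketch &a, const call_sketch &b)
{
  return a.fn == b.fn && a.arg0 == b.arg0 && a.arg1 == b.arg1;
}
```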

Tested on aarch64-linux-gnu and x86_64-linux-gnu.  OK to install?

Richard


gcc/
* gimple-fold.c: Include internal-fn.h.
(fold_stmt_1): If a function maps to an internal one, use
first_commutative_argument to canonicalize the order of
commutative arguments.

gcc/testsuite/
* gcc.dg/fmax-fmin-1.c: New test.
---
 gcc/gimple-fold.c  | 25 ++---
 gcc/testsuite/gcc.dg/fmax-fmin-1.c | 18 ++
 2 files changed, 40 insertions(+), 3 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/fmax-fmin-1.c

diff --git a/gcc/gimple-fold.c b/gcc/gimple-fold.c
index a937f130815..6a7d4507c89 100644
--- a/gcc/gimple-fold.c
+++ b/gcc/gimple-fold.c
@@ -69,6 +69,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "varasm.h"
 #include "memmodel.h"
 #include "optabs.h"
+#include "internal-fn.h"
 
 enum strlen_range_kind {
   /* Compute the exact constant string length.  */
@@ -6140,18 +6141,36 @@ fold_stmt_1 (gimple_stmt_iterator *gsi, bool inplace, tree (*valueize) (tree))
   break;
 case GIMPLE_CALL:
   {
-   for (i = 0; i < gimple_call_num_args (stmt); ++i)
+   gcall *call = as_a <gcall *> (stmt);
+   for (i = 0; i < gimple_call_num_args (call); ++i)
  {
-   tree *arg = gimple_call_arg_ptr (stmt, i);
+   tree *arg = gimple_call_arg_ptr (call, i);
if (REFERENCE_CLASS_P (*arg)
&& maybe_canonicalize_mem_ref_addr (arg))
  changed = true;
  }
-   tree *lhs = gimple_call_lhs_ptr (stmt);
+   tree *lhs = gimple_call_lhs_ptr (call);
if (*lhs
&& REFERENCE_CLASS_P (*lhs)
&& maybe_canonicalize_mem_ref_addr (lhs))
  changed = true;
+   if (*lhs)
+ {
+   combined_fn cfn = gimple_call_combined_fn (call);
+   internal_fn ifn = associated_internal_fn (cfn, TREE_TYPE (*lhs));
+   int opno = first_commutative_argument (ifn);
+   if (opno >= 0)
+ {
+   tree arg1 = gimple_call_arg (call, opno);
+   tree arg2 = gimple_call_arg (call, opno + 1);
+   if (tree_swap_operands_p (arg1, arg2))
+ {
+   gimple_call_set_arg (call, opno, arg2);
+   gimple_call_set_arg (call, opno + 1, arg1);
+   changed = true;
+ }
+ }
+ }
break;
   }
 case GIMPLE_ASM:
diff --git a/gcc/testsuite/gcc.dg/fmax-fmin-1.c b/gcc/testsuite/gcc.dg/fmax-fmin-1.c
new file mode 100644
index 000..e7e0518d8bb
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/fmax-fmin-1.c
@@ -0,0 +1,18 @@
+/* { dg-options "-O -fdump-tree-optimized" } */
+
+void
+f1 (double *res, double x, double y)
+{
+  res[0] = __builtin_fmax (x, y);
+  res[1] = __builtin_fmax (y, x);
+}
+
+void
+f2 (double *res, double x, double y)
+{
+  res[0] = __builtin_fmin (x, y);
+  res[1] = __builtin_fmin (y, x);
+}
+
+/* { dg-final { scan-tree-dump-times {__builtin_fmax} 1 "optimized" } } */
+/* { dg-final { scan-tree-dump-times {__builtin_fmin} 1 "optimized" } } */
-- 
2.25.1

[PATCH 2/4] Mark IFN_COMPLEX_MUL as commutative

2021-11-10 Thread Richard Sandiford via Gcc-patches

Mark IFN_COMPLEX_MUL as commutative.

Tested on aarch64-linux-gnu and x86_64-linux-gnu.  OK to install?

Richard


gcc/
* internal-fn.c (commutative_binary_fn_p): Handle IFN_COMPLEX_MUL.

gcc/testsuite/
* gcc.target/aarch64/sve/complex_mul_1.c: New test.
---
 gcc/internal-fn.c|  1 +
 .../gcc.target/aarch64/sve/complex_mul_1.c   | 16 
 2 files changed, 17 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/complex_mul_1.c

diff --git a/gcc/internal-fn.c b/gcc/internal-fn.c
index 7b13db6dfe3..ff7d43f1801 100644
--- a/gcc/internal-fn.c
+++ b/gcc/internal-fn.c
@@ -3829,6 +3829,7 @@ commutative_binary_fn_p (internal_fn fn)
 case IFN_MULHRS:
 case IFN_FMIN:
 case IFN_FMAX:
+case IFN_COMPLEX_MUL:
   return true;
 
 default:
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/complex_mul_1.c b/gcc/testsuite/gcc.target/aarch64/sve/complex_mul_1.c
new file mode 100644
index 000..d197e7d0d8e
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sve/complex_mul_1.c
@@ -0,0 +1,16 @@
+/* { dg-options "-O2 -fgimple -fdump-tree-optimized" } */
+
+void __GIMPLE
+foo (__SVFloat64_t x, __SVFloat64_t y, __SVFloat64_t *res1,
+ __SVFloat64_t *res2)
+{
+  __SVFloat64_t a1;
+  __SVFloat64_t a2;
+
+  a1 = .COMPLEX_MUL (x, y);
+  a2 = .COMPLEX_MUL (y, x);
+  __MEM<__SVFloat64_t> (res1) = a1;
+  __MEM<__SVFloat64_t> (res2) = a2;
+}
+
+/* { dg-final { scan-tree-dump-times {\.COMPLEX_MUL} 1 "optimized" } } */
-- 
2.25.1

[PATCH 3/4] Mark IFN_UBSAN_CHECK_ADD/MUL as commutative

2021-11-10 Thread Richard Sandiford via Gcc-patches

Mark IFN_UBSAN_CHECK_ADD/MUL as commutative.

Tested on aarch64-linux-gnu and x86_64-linux-gnu.  OK to install?

Richard


gcc/
* internal-fn.c (commutative_binary_fn_p): Handle IFN_UBSAN_CHECK_ADD
and IFN_UBSAN_CHECK_MUL.

gcc/testsuite/
* gcc.dg/ubsan/commutative-1.c: New test.
---
 gcc/internal-fn.c  |  2 ++
 gcc/testsuite/gcc.dg/ubsan/commutative-1.c | 30 ++
 2 files changed, 32 insertions(+)
 create mode 100644 gcc/testsuite/gcc.dg/ubsan/commutative-1.c

diff --git a/gcc/internal-fn.c b/gcc/internal-fn.c
index ff7d43f1801..b64555ada36 100644
--- a/gcc/internal-fn.c
+++ b/gcc/internal-fn.c
@@ -3830,6 +3830,8 @@ commutative_binary_fn_p (internal_fn fn)
 case IFN_FMIN:
 case IFN_FMAX:
 case IFN_COMPLEX_MUL:
+case IFN_UBSAN_CHECK_ADD:
+case IFN_UBSAN_CHECK_MUL:
   return true;
 
 default:
diff --git a/gcc/testsuite/gcc.dg/ubsan/commutative-1.c b/gcc/testsuite/gcc.dg/ubsan/commutative-1.c
new file mode 100644
index 000..128f5b14697
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/ubsan/commutative-1.c
@@ -0,0 +1,30 @@
+/* { dg-do compile } */
+/* { dg-options "-fsanitize=undefined -fdump-tree-optimized" } */
+/* { dg-skip-if "" { *-*-* } { "-O0" "-flto" } { "" } } */
+
+int res[2];
+
+void
+f1 (int x, int y)
+{
+  res[0] = x + y;
+  res[1] = y + x;
+}
+
+void
+f2 (int x, int y)
+{
+  res[0] = x - y;
+  res[1] = y - x;
+}
+
+void
+f3 (int x, int y)
+{
+  res[0] = x * y;
+  res[1] = y * x;
+}
+
+/* { dg-final { scan-tree-dump-times {\.UBSAN_CHECK_ADD} 1 "optimized" } } */
+/* { dg-final { scan-tree-dump-times {\.UBSAN_CHECK_SUB} 2 "optimized" } } */
+/* { dg-final { scan-tree-dump-times {\.UBSAN_CHECK_MUL} 1 "optimized" } } */
-- 
2.25.1

[PATCH 4/4] Mark IFN_ADD/MUL_OVERFLOW as commutative

2021-11-10 Thread Richard Sandiford via Gcc-patches

Mark IFN_ADD/MUL_OVERFLOW as commutative.
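
For readers who want to see why only ADD and MUL qualify, here is a
minimal sketch using the same builtins as the new test.  The struct and
the helper names (ovf_result, add_ovf, sub_ovf, mul_ovf) are made up for
illustration and are not part of the patch.  Swapping the first two
operands leaves both the wrapped value and the overflow flag unchanged
for add and mul, but not for sub.

```c
#include <limits.h>   /* INT_MAX, for callers exercising overflow */
#include <stdbool.h>

/* Illustrative only: bundles the two outputs of an overflow builtin.  */
struct ovf_result
{
  int value;        /* wrapped result */
  bool overflowed;  /* true if the exact result did not fit in int */
};

static struct ovf_result
add_ovf (int x, int y)
{
  struct ovf_result r;
  r.overflowed = __builtin_add_overflow (x, y, &r.value);
  return r;
}

static struct ovf_result
sub_ovf (int x, int y)
{
  struct ovf_result r;
  r.overflowed = __builtin_sub_overflow (x, y, &r.value);
  return r;
}

static struct ovf_result
mul_ovf (int x, int y)
{
  struct ovf_result r;
  r.overflowed = __builtin_mul_overflow (x, y, &r.value);
  return r;
}
```

With these helpers, add_ovf (INT_MAX, 1) and add_ovf (1, INT_MAX) agree
in both fields, whereas sub_ovf (5, 3) and sub_ovf (3, 5) differ, which
is why the test expects two .SUB_OVERFLOW calls to survive.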

Tested on aarch64-linux-gnu and x86_64-linux-gnu.  OK to install?

Richard


gcc/
* internal-fn.c (first_commutative_argument): Handle IFN_ADD_OVERFLOW
and IFN_MUL_OVERFLOW.

gcc/testsuite/
* gcc.dg/add-mul-overflow-1.c: New test.
---
 gcc/internal-fn.c |  2 ++
 gcc/testsuite/gcc.dg/add-mul-overflow-1.c | 28 +++
 2 files changed, 30 insertions(+)
 create mode 100644 gcc/testsuite/gcc.dg/add-mul-overflow-1.c

diff --git a/gcc/internal-fn.c b/gcc/internal-fn.c
index b64555ada36..10f08182f7e 100644
--- a/gcc/internal-fn.c
+++ b/gcc/internal-fn.c
@@ -3867,6 +3867,8 @@ first_commutative_argument (internal_fn fn)
 case IFN_FMS:
 case IFN_FNMA:
 case IFN_FNMS:
+case IFN_ADD_OVERFLOW:
+case IFN_MUL_OVERFLOW:
   return 0;
 
 case IFN_COND_ADD:
diff --git a/gcc/testsuite/gcc.dg/add-mul-overflow-1.c b/gcc/testsuite/gcc.dg/add-mul-overflow-1.c
new file mode 100644
index 000..b23cdddbb63
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/add-mul-overflow-1.c
@@ -0,0 +1,28 @@
+/* { dg-options "-O -fdump-tree-optimized" } */
+
+int res[4];
+
+void
+f1 (int x, int y)
+{
+  res[2] = __builtin_add_overflow (x, y, res + 0);
+  res[3] = __builtin_add_overflow (y, x, res + 1);
+}
+
+void
+f2 (int x, int y)
+{
+  res[2] = __builtin_sub_overflow (x, y, res + 0);
+  res[3] = __builtin_sub_overflow (y, x, res + 1);
+}
+
+void
+f3 (int x, int y)
+{
+  res[2] = __builtin_mul_overflow (x, y, res + 0);
+  res[3] = __builtin_mul_overflow (y, x, res + 1);
+}
+
+/* { dg-final { scan-tree-dump-times {\.ADD_OVERFLOW} 1 "optimized" } } */
+/* { dg-final { scan-tree-dump-times {\.SUB_OVERFLOW} 2 "optimized" } } */
+/* { dg-final { scan-tree-dump-times {\.MUL_OVERFLOW} 1 "optimized" } } */
-- 
2.25.1

[PATCH]Arm Update missing entries of cost tables

2021-11-10 Thread Tamar Christina via Gcc-patches

Hi All,

My previous patch missed these tuning structures in arm.c as they
are not where the rest of the structures are located.

This applies the same default values to silence initialization
warnings.

Bootstrapped and regtested on arm-none-linux-gnueabihf with no issues.

Ok for master?

Thanks,
Tamar

gcc/ChangeLog:

* config/arm/arm.c (cortexa9_extra_costs, cortexa8_extra_costs,
cortexa5_extra_costs, cortexa7_extra_costs, cortexa12_extra_costs,
cortexa15_extra_costs, v7m_extra_costs): Add new entries.

--- inline copy of patch -- 
diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index c4ff06b087ebecc91c419cb4ecf009c3535955df..625b97f6a67f739caa92fa3385a4c90b03b43e09 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -1190,6 +1190,9 @@ const struct cpu_cost_table cortexa9_extra_costs =
   {
 COSTS_N_INSNS (1), /* alu.  */
 COSTS_N_INSNS (4)  /* mult.  */
+COSTS_N_INSNS (1),  /* movi.  */
+COSTS_N_INSNS (2),  /* dup.  */
+COSTS_N_INSNS (2)   /* extract.  */
   }
 };
 
@@ -1294,6 +1297,9 @@ const struct cpu_cost_table cortexa8_extra_costs =
   {
 COSTS_N_INSNS (1), /* alu.  */
 COSTS_N_INSNS (4)  /* mult.  */
+COSTS_N_INSNS (1),  /* movi.  */
+COSTS_N_INSNS (2),  /* dup.  */
+COSTS_N_INSNS (2)   /* extract.  */
   }
 };
 
@@ -1399,6 +1405,9 @@ const struct cpu_cost_table cortexa5_extra_costs =
   {
 COSTS_N_INSNS (1), /* alu.  */
 COSTS_N_INSNS (4)  /* mult.  */
+COSTS_N_INSNS (1),  /* movi.  */
+COSTS_N_INSNS (2),  /* dup.  */
+COSTS_N_INSNS (2)   /* extract.  */
   }
 };
 
@@ -1505,6 +1514,9 @@ const struct cpu_cost_table cortexa7_extra_costs =
   {
 COSTS_N_INSNS (1), /* alu.  */
 COSTS_N_INSNS (4)  /* mult.  */
+COSTS_N_INSNS (1),  /* movi.  */
+COSTS_N_INSNS (2),  /* dup.  */
+COSTS_N_INSNS (2)   /* extract.  */
   }
 };
 
@@ -1609,6 +1621,9 @@ const struct cpu_cost_table cortexa12_extra_costs =
   {
 COSTS_N_INSNS (1), /* alu.  */
 COSTS_N_INSNS (4)  /* mult.  */
+COSTS_N_INSNS (1),  /* movi.  */
+COSTS_N_INSNS (2),  /* dup.  */
+COSTS_N_INSNS (2)   /* extract.  */
   }
 };
 
@@ -1713,6 +1728,9 @@ const struct cpu_cost_table cortexa15_extra_costs =
   {
 COSTS_N_INSNS (1), /* alu.  */
 COSTS_N_INSNS (4)  /* mult.  */
+COSTS_N_INSNS (1),  /* movi.  */
+COSTS_N_INSNS (2),  /* dup.  */
+COSTS_N_INSNS (2)   /* extract.  */
   }
 };
 
@@ -1817,6 +1835,9 @@ const struct cpu_cost_table v7m_extra_costs =
   {
 COSTS_N_INSNS (1), /* alu.  */
 COSTS_N_INSNS (4)  /* mult.  */
+COSTS_N_INSNS (1),  /* movi.  */
+COSTS_N_INSNS (2),  /* dup.  */
+COSTS_N_INSNS (2)   /* extract.  */
   }
 };
 



Re: [PATCH 3/4] Mark IFN_UBSAN_CHECK_ADD/MUL as commutative

2021-11-10 Thread Jakub Jelinek via Gcc-patches

On Wed, Nov 10, 2021 at 12:49:36PM +, Richard Sandiford via Gcc-patches 
wrote:
>   * internal-fn.c (commutative_binary_fn_p): Handle IFN_UBSAN_CHECK_ADD
>   and IFN_UBSAN_CHECK_MUL.
> 
> gcc/testsuite/
>   * gcc.dg/ubsan/commutative-1.c: New test.

Ok.

Jakub

Re: [PATCH 4/4] Mark IFN_ADD/MUL_OVERFLOW as commutative

2021-11-10 Thread Jakub Jelinek via Gcc-patches

On Wed, Nov 10, 2021 at 12:50:09PM +, Richard Sandiford via Gcc-patches 
wrote:
> gcc/
>   * internal-fn.c (first_commutative_operand): Handle IFN_ADD_OVERFLOW
>   and IFN_MUL_OVERFLOW.
> 
> gcc/testsuite/
>   * gcc.dg/add-mul-overflow-1.c: New test.

Ok.

Jakub

[committed][AArch64] Fix bootstrap failure due to missing ATTRIBUTE_UNUSED

2021-11-10 Thread Andre Vieira (lists) via Gcc-patches


Hi,

Committed this as obvious. My earlier patch removed the need for the GSI 
to be used.


gcc/ChangeLog:

    * config/aarch64/aarch64-builtins.c
    (aarch64_general_gimple_fold_builtin): Mark argument as unused.
diff --git a/gcc/config/aarch64/aarch64-builtins.c b/gcc/config/aarch64/aarch64-builtins.c
index e06131a7c61d31c1be3278dcdccc49c3053c78cb..d5b16081264ca43416a53dafb8c6ee6efad88133 100644
--- a/gcc/config/aarch64/aarch64-builtins.c
+++ b/gcc/config/aarch64/aarch64-builtins.c
@@ -2458,7 +2458,7 @@ get_mem_type_for_load_store (unsigned int fcode)
failure.  */
 gimple *
 aarch64_general_gimple_fold_builtin (unsigned int fcode, gcall *stmt,
-gimple_stmt_iterator *gsi)
+gimple_stmt_iterator *gsi ATTRIBUTE_UNUSED)
 {
   gimple *new_stmt = NULL;
   unsigned nargs = gimple_call_num_args (stmt);

Re: [PATCH] Loop unswitching: support gswitch statements.

2021-11-10 Thread Martin Liška


On 11/10/21 09:59, Richard Biener wrote:

On Tue, Nov 9, 2021 at 5:44 PM Martin Liška  wrote:


On 11/9/21 14:37, Richard Biener wrote:

On Mon, Nov 8, 2021 at 8:45 PM Andrew MacLeod  wrote:


On 11/8/21 10:05 AM, Martin Liška wrote:

On 9/28/21 22:39, Andrew MacLeod wrote:

In theory, modifying the IL should be fine; it happens already in
places, but it's not extensively tested under those conditions yet.


Hello Andrew.

I've just tried using a global gimple_ranger and it crashes when loop
unswitching duplicates
some BBs.

Please try the attached patch for:


hey Martin,

try using this in your tree.  Since nothing else is using a growing BB
right now, I'll let you work with it and see if everything works as
expected before checking it in, just in case we need more tweaking.
With this,

make RUNTESTFLAGS=dg.exp=loop-unswitch*.c check-gcc

runs clean.


Basically, I tried to grow it by either 10% of the current BB count at
the time the growth is requested, or double the needed extra size, or
128... whichever value is largest.  That means it shouldn't be asking
for too much each time, but also not a minimal amount.

I'm certainly open to suggestions on how much to grow it each time.
Note the vector being grown is ONLY for the SSA_NAME being asked for, so
it's really an on-demand thing just for specific names; in your case,
mostly just the switch index.
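
The growth rule described above can be sketched as a small stand-alone C
function.  This is only an illustration of the sizing policy, not the
ranger code itself: grown_table_size and its parameter names are
invented here, with curr_bb_size standing for the current number of
basic blocks and tab_size for the current table length.

```c
#include <assert.h>

/* Hypothetical helper, not GCC code: returns the new length for a
   table indexed by basic-block number once the CFG has outgrown it.  */
static int
grown_table_size (int curr_bb_size, int tab_size)
{
  /* Growing only makes sense when the CFG is larger than the table.  */
  assert (curr_bb_size > tab_size);

  /* Headroom is the largest of: twice the shortfall, 128 entries,
     and 10% of the current block count.  */
  int inc = (curr_bb_size - tab_size) * 2;
  if (inc < 128)
    inc = 128;
  if (inc < curr_bb_size / 10)
    inc = curr_bb_size / 10;

  return curr_bb_size + inc;
}
```

For a small overshoot the 128-entry floor dominates, for a large jump
the doubled shortfall does, and for very large functions the 10% term
takes over.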

Let me know how this works for you, and if you have any other issues.


So I think in the end we shouldn't need the growing.  Ideally we'd do all
the analysis before the first transform, but for that we'd need ranger to
be able to "simplify" conditions based on a known true/false predicate
that's not yet in the IL.  Consider

   for (;;)
 {
  if (invariant < 3) // A
{
...
}
  if (invariant < 5) // B
{
...
}
 }

unswitch analysis will run into the condition 'A' and determine the loop
can be unswitched with the condition 'invariant < 3'.  To be able to
perform cost assessment and to avoid redundant unswitching we
want to determine that if we unswitch with 'invariant < 3' being
true then the condition at 'B' is true as well before actually inserting
the if (invariant < 3) outside of the loop.
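
To make the point concrete, here is a hand-unswitched version of that
loop as a plain C sketch.  The loop bodies (accumulating into acc) and
the function names are invented for illustration; only the two
conditions come from the example above.  In the copy where
'invariant < 3' is true, condition B is known true and disappears; in
the other copy B still has to be tested and remains an unswitching
candidate.

```c
/* Original loop: both conditions are tested on every iteration.  */
static long
sum_original (int invariant, int n)
{
  long acc = 0;
  for (int i = 0; i < n; i++)
    {
      if (invariant < 3)   /* A */
        acc += i;
      if (invariant < 5)   /* B */
        acc += 2 * i;
    }
  return acc;
}

/* Unswitched on A by hand.  */
static long
sum_unswitched (int invariant, int n)
{
  long acc = 0;
  if (invariant < 3)
    {
      /* A is true, hence B (invariant < 5) is true as well.  */
      for (int i = 0; i < n; i++)
        acc += i + 2 * i;
    }
  else
    {
      /* A is false; B is still unknown in this copy.  */
      for (int i = 0; i < n; i++)
        if (invariant < 5)
          acc += 2 * i;
    }
  return acc;
}
```

Knowing up front that B folds away in the 'true' copy is what would let
the cost model see the real size of each loop copy before transforming.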

So I'm thinking of assigning a gimple_uid to each condition we want to
unswitch on and have an array indexed by the uid with meta-data on
the unswitch opportunity, the "related" conditions could be marked with
the same uid (or some other), and the folding result recorded so that
at transform time we can just do the appropriate replacement without
invoking ranger again.


Calculating all this before transformation is quite ambitious based on the code
we have now.

Note one can have in a loop:

if (a > 100)
 ...

switch (a)
 case 1000:
   ...
 case 20:
   ...
 case 200:
   ...

which means the first predicate effectively makes some cases unreachable.
Moreover one can have

if (a > 100 && b < 300)
 ...

and more complex conditions.
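
As a rough sketch of the reasoning for the simple case, one can model
what is known about 'a' as a closed integer interval and test each case
label against it.  The type and helpers below (int_range,
range_when_greater, case_reachable) are hypothetical and much simpler
than ranger's irange; they only illustrate why 'case 20' is dead once
'a > 100' holds.

```c
#include <limits.h>
#include <stdbool.h>

/* Closed interval [lo, hi] of values a variable may take.  */
struct int_range
{
  int lo;
  int hi;
};

/* Range of A once 'a > bound' is known true (assumes bound < INT_MAX).  */
static struct int_range
range_when_greater (int bound)
{
  struct int_range r = { bound + 1, INT_MAX };
  return r;
}

/* A case label is reachable only if it lies inside the known range.  */
static bool
case_reachable (struct int_range known, int label)
{
  return label >= known.lo && label <= known.hi;
}
```

With range_when_greater (100), labels 1000 and 200 stay reachable while
20 does not.  A compound condition such as 'a > 100 && b < 300' would
need one such range per variable.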


True - I guess we should do two things.


All right.



  1) keep simplify_using_entry_checks like code for symbolic conditions
  2) add integer ranges for unswitch conditions producing them, that
  includes all unswitching of switch stmts - we might be able to use
  the ranger queries (with global ranges) to simplify stmts with the
  known ranges as noted by Andrew

I do think that pre-computing the simplifications is what we should do
to be able to make the cost modeling sane.  What we can avoid
trying is evaluating multiple unswitch possibilities to pick the "best".


So the first step would be taking all unswitching candidates (gconds basically)
and grouping them (all items in a group would fold to the true edge in the
unswitched loop).
Is that something we want to do by combining simplify_using_entry_checks and
ranger's fold_range capability?



I think changing the code to do the analysis first should be done
before wiring in gcond support, even adding the additional 'range'


s/gcond/switch, right?


capability will be useful without that since the current code
won't figure out a > 5 is true when we unswitch on a > 3.


Agree that gswitch support should be added later.

Martin





Now, but how do we arrange for the ranger analysis here?


That's likely something we need support from ranger, yes.



We might also somehow want to remember that on the
'invariant < 3' == false copy of the loop there's still the
unswitching opportunity on 'invariant < 5', but not on the
'invariant < 5' == true copy.

Currently unswitching uses a custom simplify_using_entry_checks
which tries to do simplification only after the fact (so the cost
estimate is far from the true cost, and ordering the opportunities
to do the best first is not implemented either).


I'm sending updated version of the patch where I changed:
- simplify_using_entry_checks is put back for the floating point expressions
- all scans utilize scan-tree-dump-times
- som

Re: [PATCH]Arm Update missing entries of cost tables

2021-11-10 Thread Christophe Lyon via Gcc-patches

On Wed, Nov 10, 2021 at 1:54 PM Tamar Christina via Gcc-patches <
gcc-patches@gcc.gnu.org> wrote:

> Hi All,
>
> My previous patch missed these tuning structures in arm.c as they
> are not where the rest of the structure are located.
>
> This applies the same default values to silence initialization
> warnings.
>

Thanks, FWIW it looks like the patch I sent on Monday:
 https://gcc.gnu.org/pipermail/gcc-patches/2021-November/583710.html

Christophe


> Bootstrapped Regtested on arm-none-linux-gnueabihf and no issues.
>
> Ok for master?
>
> Thanks,
> Tamar
>
> gcc/ChangeLog:
>
> * config/arm/arm.c (cortexa9_extra_costs, cortexa8_extra_costs,
> cortexa5_extra_costs, cortexa5_extra_cost, cortexa12_extra_costs,
> cortexa15_extra_costs, v7m_extra_costs): Add new entries.
>
> --- inline copy of patch --
> diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
> index
> c4ff06b087ebecc91c419cb4ecf009c3535955df..625b97f6a67f739caa92fa3385a4c90b03b43e09
> 100644
> --- a/gcc/config/arm/arm.c
> +++ b/gcc/config/arm/arm.c
> @@ -1190,6 +1190,9 @@ const struct cpu_cost_table cortexa9_extra_costs =
>{
>  COSTS_N_INSNS (1), /* alu.  */
>  COSTS_N_INSNS (4)  /* mult.  */
> +COSTS_N_INSNS (1),  /* movi.  */
> +COSTS_N_INSNS (2),  /* dup.  */
> +COSTS_N_INSNS (2)   /* extract.  */
>}
>  };
>
> @@ -1294,6 +1297,9 @@ const struct cpu_cost_table cortexa8_extra_costs =
>{
>  COSTS_N_INSNS (1), /* alu.  */
>  COSTS_N_INSNS (4)  /* mult.  */
> +COSTS_N_INSNS (1),  /* movi.  */
> +COSTS_N_INSNS (2),  /* dup.  */
> +COSTS_N_INSNS (2)   /* extract.  */
>}
>  };
>
> @@ -1399,6 +1405,9 @@ const struct cpu_cost_table cortexa5_extra_costs =
>{
>  COSTS_N_INSNS (1), /* alu.  */
>  COSTS_N_INSNS (4)  /* mult.  */
> +COSTS_N_INSNS (1),  /* movi.  */
> +COSTS_N_INSNS (2),  /* dup.  */
> +COSTS_N_INSNS (2)   /* extract.  */
>}
>  };
>
> @@ -1505,6 +1514,9 @@ const struct cpu_cost_table cortexa7_extra_costs =
>{
>  COSTS_N_INSNS (1), /* alu.  */
>  COSTS_N_INSNS (4)  /* mult.  */
> +COSTS_N_INSNS (1),  /* movi.  */
> +COSTS_N_INSNS (2),  /* dup.  */
> +COSTS_N_INSNS (2)   /* extract.  */
>}
>  };
>
> @@ -1609,6 +1621,9 @@ const struct cpu_cost_table cortexa12_extra_costs =
>{
>  COSTS_N_INSNS (1), /* alu.  */
>  COSTS_N_INSNS (4)  /* mult.  */
> +COSTS_N_INSNS (1),  /* movi.  */
> +COSTS_N_INSNS (2),  /* dup.  */
> +COSTS_N_INSNS (2)   /* extract.  */
>}
>  };
>
> @@ -1713,6 +1728,9 @@ const struct cpu_cost_table cortexa15_extra_costs =
>{
>  COSTS_N_INSNS (1), /* alu.  */
>  COSTS_N_INSNS (4)  /* mult.  */
> +COSTS_N_INSNS (1),  /* movi.  */
> +COSTS_N_INSNS (2),  /* dup.  */
> +COSTS_N_INSNS (2)   /* extract.  */
>}
>  };
>
> @@ -1817,6 +1835,9 @@ const struct cpu_cost_table v7m_extra_costs =
>{
>  COSTS_N_INSNS (1), /* alu.  */
>  COSTS_N_INSNS (4)  /* mult.  */
> +COSTS_N_INSNS (1),  /* movi.  */
> +COSTS_N_INSNS (2),  /* dup.  */
> +COSTS_N_INSNS (2)   /* extract.  */
>}
>  };
>
>
>
> --
>

Re: [Patch 1/8, Arm, AArch64, GCC] Refactor mbranch-protection option parsing and make it common to AArch32 and AArch64 backends. [Was RE: [Patch 2/7, Arm, GCC] Add option -mbranch-protection.]

2021-11-10 Thread Andrea Corallo via Gcc-patches

Tejas Belagod via Gcc-patches  writes:

[...]

> This change refactors all the mbranch-protection option parsing code and types
> to make it common to both AArch32 and AArch64 backends.  This change also 
> pulls
> in some supporting types from AArch64 to make it common
> (aarch_parse_opt_result).  The significant changes in this patch are the
> movement of all branch protection parsing routines from aarch64.c to
> aarch-common.c and supporting data types and static data structures.  This
> patch also pre-declares variables and types required in the aarch32 back for
> moved variables for function sign scope and key to prepare for the impending
> series of patches that support parsing the feature mbranch-protection in the
> aarch32 back end.
>
> 2021-10-25  Tejas Belagod  
>
> gcc/ChangeLog:
>
>   * common/config/aarch64/aarch64-common.c: Include aarch-common.h.
>   (all_architectures): Fix comment.
>   (aarch64_parse_extension): Rename return type, enum value names.
>   * config/aarch64/aarch64-c.c (aarch64_update_cpp_builtins): Rename
>   factored out aarch_ra_sign_scope and aarch_ra_sign_key variables.
>   Also rename corresponding enum values.
>   * config/aarch64/aarch64-opts.h (aarch64_function_type): Factor out
>   aarch64_function_type and move it to common code as aarch_function_type
>   in aarch-common.h.
>   * config/aarch64/aarch64-protos.h: Include common types header, move out
>   types aarch64_parse_opt_result and aarch64_key_type to aarch-common.h
>   * config/aarch64/aarch64.c: Move mbranch-protection parsing types and
>   functions out into aarch-common.h and aarch-common.c.  Fix up all the 
> name
>   changes resulting from the move.
>   * config/aarch64/aarch64.md: Fix up aarch64_ra_sign_key type name change
>   and enum value.
>   * config/aarch64/aarch64.opt: Include aarch-common.h to import type 
> move.
>   Fix up name changes from factoring out common code and data.
>   * config/arm/aarch-common-protos.h: Export factored out routines to both
>   backends.
>   * config/arm/aarch-common.c: Include newly factored out types.  Move all
>   mbranch-protection code and data structures from aarch64.c.
>   * config/arm/aarch-common.h: New header that declares types shared 
> between
>   aarch32 and aarch64 backends.
>   * config/arm/arm-protos.h: Declare types and variables that are made 
> common
>   to aarch64 and aarch32 backends - aarch_ra_sign_key, 
> aarch_ra_sign_scope and
>   aarch_enable_bti.
>
>
> Tested the following configurations. OK for trunk?
>
> -mthumb/-march=armv8.1-m.main+pacbti/-mfloat-abi=soft
> -marm/-march=armv7-a/-mfpu=vfpv3-d16/-mfloat-abi=softfp
> mcmodel=small and tiny
> aarch64-none-linux-gnu native test and bootstrap
>
> Thanks,
> Tejas.

Hi Tejas,

going through the code I've spotted a couple of indentation nits that I
guess are coming from the original source that was moved.

> diff --git a/gcc/config/arm/aarch-common.c b/gcc/config/arm/aarch-common.c

[...]

> +  /* Copy the last processed token into the argument to pass it back.
> +Used by option and attribute validation to print the offending token.  */
> +  if (last_str)
> +{
> +  if (str) strcpy (*last_str, str);
> +  else *last_str = NULL;

I think we should have new lines after both if and else here.
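
Something like the following layout, shown as a stand-alone sketch.  The
wrapper function pass_back_token and its signature are made up so the
snippet compiles on its own; only the if/else body mirrors the quoted
hunk.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical wrapper around the quoted snippet.  As in the original,
   *LAST_STR must already point to a buffer large enough for STR.  */
static void
pass_back_token (char **last_str, const char *str)
{
  if (last_str)
    {
      if (str)
        strcpy (*last_str, str);
      else
        *last_str = NULL;
    }
}
```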

> +}
> +  if (res == AARCH_PARSE_OK)
> +{
> +  /* If needed, alloc the accepted string then copy in const_str.
> + Used by override_option_after_change_1.  */
> +  if (!accepted_branch_protection_string)
> + accepted_branch_protection_string = (char *) xmalloc (
> +   BRANCH_PROTECT_STR_MAX
> + + 1);
^^
Indentation


> +  strncpy (accepted_branch_protection_string, const_str,
> + BRANCH_PROTECT_STR_MAX + 1);
^^
Same
> +  /* Forcibly null-terminate.  */
> +  accepted_branch_protection_string[BRANCH_PROTECT_STR_MAX] = '\0';
> +}
> +  return res;
> +}

Thanks

  Andrea

Re: [PATCH][GCC] arm: enable cortex-a710 CPU

2021-11-10 Thread Przemyslaw Wirkus via Gcc-patches

> > Hi,
> >
> > This patch is adding support for Cortex-A710 CPU [0].
> >
> >   [0] https://www.arm.com/products/silicon-ip-cpu/cortex-a/cortex-a710
> >
> > OK for master?

> Ok.
> Thanks,
> Kyrill

commit 9701f153f6dfcc365ac0d96cdcf7df69a2de81dc


> >
> > gcc/ChangeLog:
> >
> >  * config/arm/arm-cpus.in (cortex-a710): New CPU.
> >  * config/arm/arm-tables.opt: Regenerate.
> >  * config/arm/arm-tune.md: Regenerate.
> >  * doc/invoke.texi: Update docs.
> >
> > --
> > kind regards,
> > Przemyslaw Wirkus
> >
> > Staff Compiler Engineer | Arm
> > . . . . . . . . . . . . . . . . . . . . . . . . . .
> >
> > Arm.com

[PATCH v2] powerpc: Remove LINK_OS_EXTRA_SPEC{32, 64} from --with-advance-toolchain

2021-11-10 Thread Lucas A. M. Magalhaes via Gcc-patches

Historically this was added to fill gaps from ld.so.cache on early AT
releases.  It now just causes errors and rework.  Since AT5.0 the
AT's ld.so uses a correctly configured ld.so.cache and sets the
DT_INTERP to AT's ld.so.  These two factors are sufficient for an
AT-built program to get the correct libraries.

GCC configured with --with-advance-toolchain has issues building glibc
releases because it adds DT_RUNPATH to ld.so, and that's unsupported.

2021-11-10  Lucas A. M. Magalhães  

gcc/
* config.gcc (powerpc*-*-*): Remove -rpath from
--with-advance-toochain
---
 gcc/config.gcc | 10 --
 1 file changed, 10 deletions(-)

diff --git a/gcc/config.gcc b/gcc/config.gcc
index fb1f06f3da8..9eba3ece0a9 100644
--- a/gcc/config.gcc
+++ b/gcc/config.gcc
@@ -5088,16 +5088,6 @@ case "${target}" in
(at="/opt/$with_advance_toolchain"
 echo "/* Use Advance Toolchain $at */"
 echo
-echo "#undef  LINK_OS_EXTRA_SPEC32"
-echo "#define LINK_OS_EXTRA_SPEC32" \
- "\"%(link_os_new_dtags)" \
- "-rpath $prefix/lib -rpath $at/lib\""
-echo
-echo "#undef  LINK_OS_EXTRA_SPEC64"
-echo "#define LINK_OS_EXTRA_SPEC64" \
- "\"%(link_os_new_dtags)" \
- "-rpath $prefix/lib64 -rpath $at/lib64\""
-echo
 echo "#undef  LINK_OS_NEW_DTAGS_SPEC"
 echo "#define LINK_OS_NEW_DTAGS_SPEC" \
  "\"--enable-new-dtags\""
-- 
2.31.1

Re: [PATCH v2] powerpc: Remove LINK_OS_EXTRA_SPEC{32, 64} from --with-advance-toolchain

2021-11-10 Thread Segher Boessenkool

On Wed, Nov 10, 2021 at 11:21:26AM -0300, Lucas A. M. Magalhaes wrote:
> Historically this was added to fill gaps from ld.so.cache on early AT
> releases. This now are just causing errors and rework. Since AT5.0 the
> AT's ld.so is using a correctly configured ld.so.cache and sets the
> DT_INTERP to AT's ld.so. This two factors are sufficient for an AT
> builded program to get the correct libraries.
> 
> GCC congured with --with-advance-toolchain has issues building GlibC
> releases because it adds DT_RUNPATH to ld.so and that's unsupported.
> 
> 2021-11-10  Lucas A. M. Magalhães  
> 
> gcc/
>   * config.gcc (powerpc*-*-*): Remove -rpath from
>   --with-advance-toochain

I fixed the title and this last line, and pushed it to trunk.  Thanks!


Segher

Re: [PATCH] powerpc: Remove LINK_OS_EXTRA_SPEC{32, 64} from --with-advance-toolchain

2021-11-10 Thread Segher Boessenkool

On Tue, Nov 09, 2021 at 04:03:55PM -0300, Lucas A. M. Magalhaes wrote:
> Quoting Segher Boessenkool (2021-11-09 11:19:58)
> > On Tue, Nov 09, 2021 at 10:39:46AM -0300, Lucas A. M. Magalhaes wrote:
> > > Ping.
> > 
> > I did not get the original, and neither did the archives?
> > 
> Strange, it's on the archives.
> https://gcc.gnu.org/pipermail/gcc-patches/2021-October/582643.html
> Looking at my local mails, I did't receive it as well.

Ah.  Because your changelog suggested this patch is from November, I
didn't look at earlier archives :-)


Segher

Silence additional warning in gfortran.dg/do_subscript_3.f90

2021-11-10 Thread Jan Hubicka via Gcc-patches

Hi,
the testcase checks out-of-bounds access warnings, and with the ipa-modref
improvements it now triggers a new warning:

/aux/hubicka/trunk-git/gcc/testsuite/gfortran.dg/do_subscript_3.f90:11:9: Warning: (1)
/aux/hubicka/trunk-git/gcc/testsuite/gfortran.dg/do_subscript_3.f90:10:47: Warning: Array reference at (1) out of bounds (0 < 1) in loop beginning at (2)
/aux/hubicka/trunk-git/gcc/testsuite/gfortran.dg/do_subscript_3.f90:19:9: Warning: (1)
/aux/hubicka/trunk-git/gcc/testsuite/gfortran.dg/do_subscript_3.f90:18:45: Warning: Array reference at (1) out of bounds (6 > 5) in loop beginning at (2)
/aux/hubicka/trunk-git/gcc/testsuite/gfortran.dg/do_subscript_3.f90:19:50: Warning: iteration 5 invokes undefined behavior [-Waggressive-loop-optimizations]
/aux/hubicka/trunk-git/gcc/testsuite/gfortran.dg/do_subscript_3.f90:18:9: note: within this loop

I suppose we now are able to propagate array bounds better into the
nested function.

The last warning is new and correct, even though a little bit redundant.  I
think we may just silence it?  I wonder why we do not get the same fact on the
first loop (which hits the out-of-bounds access already at iteration 0).

Looks OK?
Honza

gcc/testsuite/ChangeLog:

2021-11-10  Jan Hubicka  

* gfortran.dg/do_subscript_3.f90: Add -Wno-aggressive-loop-optimizations.

diff --git a/gcc/testsuite/gfortran.dg/do_subscript_3.f90 
b/gcc/testsuite/gfortran.dg/do_subscript_3.f90
index 2f62f58142b..18ed9a2f0c9 100644
--- a/gcc/testsuite/gfortran.dg/do_subscript_3.f90
+++ b/gcc/testsuite/gfortran.dg/do_subscript_3.f90
@@ -1,4 +1,5 @@
 ! { dg-do compile }
+! { dg-additional-options "-Wno-aggressive-loop-optimizations" }
 ! PR fortran/91424
 ! Check that only one warning is issued inside blocks, and that
 ! warnings are also issued for contained subroutines.

RE: [PATCH] arm: Initialize vector costing fields

2021-11-10 Thread Kyrylo Tkachov via Gcc-patches

Hi Christophe

> -Original Message-
> From: Gcc-patches  bounces+kyrylo.tkachov=arm@gcc.gnu.org> On Behalf Of Christophe
> Lyon via Gcc-patches
> Sent: Monday, November 8, 2021 6:13 PM
> To: gcc-patches@gcc.gnu.org
> Subject: [PATCH] arm: Initialize vector costing fields
> 
> The movi, dup and extract costing fields were recently added to struct
> vector_cost_table, but their initialization is missing for the arm
> (aarch32) specific descriptions.
> 
> Although the arm port does not use these fields (only aarch64 does),
> this is causing warnings during the build, and even build failures
> when using gcc-4.8.5 as host compiler:
> 
> /gccsrc/gcc/config/arm/arm.c:1194:1: error: uninitialized const member
> 'vector_cost_table::movi'
>  };
>   ^
> /gccsrc/gcc/config/arm/arm.c:1194:1: warning: missing initializer for member
> 'vector_cost_table::movi' [-Wmissing-field-initializers]
> /gccsrc/gcc/config/arm/arm.c:1194:1: error: uninitialized const member
> 'vector_cost_table::dup'
> /gccsrc/gcc/config/arm/arm.c:1194:1: warning: missing initializer for member
> 'vector_cost_table::dup' [-Wmissing-field-initializers]
> /gccsrc/gcc/config/arm/arm.c:1194:1: error: uninitialized const member
> 'vector_cost_table::extract'
> /gccsrc/gcc/config/arm/arm.c:1194:1: warning: missing initializer for member
> 'vector_cost_table::extract' [-Wmissing-field-initializers]
> 
> This patch uses the same initialization values as in aarch64 for
> consistency:
> +COSTS_N_INSNS (1),  /* movi.  */
> +COSTS_N_INSNS (2),  /* dup.  */
> +COSTS_N_INSNS (2)   /* extract.  */
> 
> But given these fields are not used, maybe a dummy value should be
> used instead? (zero?)

They're dummy values for now, but there's no reason why the backend couldn't
be extended to use them in the future.
Anyway, this patch is okay as is.
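
For anyone following along, the warning side of this can be reproduced
with a small C sketch.  The struct below only imitates the shape of
vector_cost_table with plain int members, and COSTS_N_INSNS is a local
stand-in (the scaling factor of 4 is an assumption for this sketch), so
this is not the GCC definition.  In C, members left out of a brace
initializer are zero-initialized, and GCC's -Wmissing-field-initializers
(part of -Wextra) reports them; the hard errors quoted above come from
the real sources built with an old host compiler and are not reproduced
here.

```c
/* Local stand-in; the factor is an assumption for this sketch.  */
#define COSTS_N_INSNS(N) ((N) * 4)

/* Simplified imitation of the vector cost fields.  */
struct vector_cost_table
{
  int alu;
  int mult;
  int movi;
  int dup;
  int extract;
};

/* Only two of the five members are given: movi, dup and extract are
   implicitly zero, and -Wmissing-field-initializers warns here.  */
static const struct vector_cost_table incomplete =
{
  COSTS_N_INSNS (1),   /* alu.  */
  COSTS_N_INSNS (4)    /* mult.  */
};

/* All five members spelled out, as the patch does.  */
static const struct vector_cost_table complete =
{
  COSTS_N_INSNS (1),   /* alu.  */
  COSTS_N_INSNS (4),   /* mult.  */
  COSTS_N_INSNS (1),   /* movi.  */
  COSTS_N_INSNS (2),   /* dup.  */
  COSTS_N_INSNS (2)    /* extract.  */
};
```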

Thanks,
Kyrill

> 
> 2021-11-08  Christophe Lyon  
> 
>   gcc/
>   * config/arm/arm.c (cortexa9_extra_costs, cortexa8_extra_costs,
>   cortexa5_extra_costs, cortexa7_extra_costs,
>   cortexa12_extra_costs, cortexa15_extra_costs, v7m_extra_costs):
>   Initialize movi, dup and extract costing fields.
> ---
>  gcc/config/arm/arm.c | 35 ---
>  1 file changed, 28 insertions(+), 7 deletions(-)
> 
> diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
> index 6c6e77fab66..3f5e1162853 100644
> --- a/gcc/config/arm/arm.c
> +++ b/gcc/config/arm/arm.c
> @@ -1197,7 +1197,10 @@ const struct cpu_cost_table cortexa9_extra_costs
> =
>/* Vector */
>{
>  COSTS_N_INSNS (1),   /* alu.  */
> -COSTS_N_INSNS (4)/* mult.  */
> +COSTS_N_INSNS (4),   /* mult.  */
> +COSTS_N_INSNS (1),   /* movi.  */
> +COSTS_N_INSNS (2),   /* dup.  */
> +COSTS_N_INSNS (2)/* extract.  */
>}
>  };
> 
> @@ -1301,7 +1304,10 @@ const struct cpu_cost_table cortexa8_extra_costs
> =
>/* Vector */
>{
>  COSTS_N_INSNS (1),   /* alu.  */
> -COSTS_N_INSNS (4)/* mult.  */
> +COSTS_N_INSNS (4),   /* mult.  */
> +COSTS_N_INSNS (1),   /* movi.  */
> +COSTS_N_INSNS (2),   /* dup.  */
> +COSTS_N_INSNS (2)/* extract.  */
>}
>  };
> 
> @@ -1406,7 +1412,10 @@ const struct cpu_cost_table cortexa5_extra_costs
> =
>/* Vector */
>{
>  COSTS_N_INSNS (1),   /* alu.  */
> -COSTS_N_INSNS (4)/* mult.  */
> +COSTS_N_INSNS (4),   /* mult.  */
> +COSTS_N_INSNS (1),   /* movi.  */
> +COSTS_N_INSNS (2),   /* dup.  */
> +COSTS_N_INSNS (2)/* extract.  */
>}
>  };
> 
> @@ -1512,7 +1521,10 @@ const struct cpu_cost_table cortexa7_extra_costs
> =
>/* Vector */
>{
>  COSTS_N_INSNS (1),   /* alu.  */
> -COSTS_N_INSNS (4)/* mult.  */
> +COSTS_N_INSNS (4),   /* mult.  */
> +COSTS_N_INSNS (1),   /* movi.  */
> +COSTS_N_INSNS (2),   /* dup.  */
> +COSTS_N_INSNS (2)/* extract.  */
>}
>  };
> 
> @@ -1616,7 +1628,10 @@ const struct cpu_cost_table
> cortexa12_extra_costs =
>/* Vector */
>{
>  COSTS_N_INSNS (1),   /* alu.  */
> -COSTS_N_INSNS (4)/* mult.  */
> +COSTS_N_INSNS (4),   /* mult.  */
> +COSTS_N_INSNS (1),   /* movi.  */
> +COSTS_N_INSNS (2),   /* dup.  */
> +COSTS_N_INSNS (2)/* extract.  */
>}
>  };
> 
> @@ -1720,7 +1735,10 @@ const struct cpu_cost_table
> cortexa15_extra_costs =
>/* Vector */
>{
>  COSTS_N_INSNS (1),   /* alu.  */
> -COSTS_N_INSNS (4)/* mult.  */
> +COSTS_N_INSNS (4),   /* mult.  */
> +COSTS_N_INSNS (1),   /* movi.  */
> +COSTS_N_INSNS (2),   /* dup.  */
> +COSTS_N_INSNS (2)/* extract.  */
>}
>  };
> 
> @@ -1824,7 +1842,10 @@ const struct cpu_cost_table v7m_extra_costs =
>/* Vector */
>{
>  COSTS_N_INSNS (1),   /* alu.  */
> -COSTS_N_INSNS (4)/* mult.  */
> +COSTS_N_INSNS (4),

[COMMITTED] Grow sbr_vector in ranger's on-entry cache as needed.

2021-11-10 Thread Aldy Hernandez via Gcc-patches

From: Andrew MacLeod 

The on-entry cache does not expect the number of BBs to change.  This
could happen in various scenarios, recently in the suggestion to use
ranger with loop unswitching and also with a work in progress to use
the path solver in the loopch pass.  This patch fixes both.

This is a patch from Andrew, who tested it on x86-64 Linux.

Pushed.

gcc/ChangeLog:

* gimple-range-cache.cc (sbr_vector::grow): New.
(sbr_vector::set_bb_range): Call grow.
(sbr_vector::get_bb_range): Same.
(sbr_vector::bb_range_p): Remove assert.
---
 gcc/gimple-range-cache.cc | 35 +++
 1 file changed, 31 insertions(+), 4 deletions(-)

diff --git a/gcc/gimple-range-cache.cc b/gcc/gimple-range-cache.cc
index e5591bab0ef..a63e20e7e49 100644
--- a/gcc/gimple-range-cache.cc
+++ b/gcc/gimple-range-cache.cc
@@ -210,6 +210,7 @@ protected:
   int_range<2> m_undefined;
   tree m_type;
   irange_allocator *m_irange_allocator;
+  void grow ();
 };
 
 
@@ -229,13 +230,37 @@ sbr_vector::sbr_vector (tree t, irange_allocator 
*allocator)
   m_undefined.set_undefined ();
 }
 
+// Grow the vector when the CFG has increased in size.
+
+void
+sbr_vector::grow ()
+{
+  int curr_bb_size = last_basic_block_for_fn (cfun);
+  gcc_checking_assert (curr_bb_size > m_tab_size);
+
+  // Increase the max of a)128, b)needed increase * 2, c)10% of current_size.
+  int inc = MAX ((curr_bb_size - m_tab_size) * 2, 128);
+  inc = MAX (inc, curr_bb_size / 10);
+  int new_size = inc + curr_bb_size;
+
+  // Allocate new memory, copy the old vector and clear the new space.
+  irange **t = (irange **)m_irange_allocator->get_memory (new_size
+ * sizeof (irange *));
+  memcpy (t, m_tab, m_tab_size * sizeof (irange *));
+  memset (t + m_tab_size, 0, (new_size - m_tab_size) * sizeof (irange *));
+
+  m_tab = t;
+  m_tab_size = new_size;
+}
+
 // Set the range for block BB to be R.
 
 bool
 sbr_vector::set_bb_range (const_basic_block bb, const irange &r)
 {
   irange *m;
-  gcc_checking_assert (bb->index < m_tab_size);
+  if (bb->index >= m_tab_size)
+grow ();
   if (r.varying_p ())
 m = &m_varying;
   else if (r.undefined_p ())
@@ -252,7 +277,8 @@ sbr_vector::set_bb_range (const_basic_block bb, const 
irange &r)
 bool
 sbr_vector::get_bb_range (irange &r, const_basic_block bb)
 {
-  gcc_checking_assert (bb->index < m_tab_size);
+  if (bb->index >= m_tab_size)
+return false;
   irange *m = m_tab[bb->index];
   if (m)
 {
@@ -267,8 +293,9 @@ sbr_vector::get_bb_range (irange &r, const_basic_block bb)
 bool
 sbr_vector::bb_range_p (const_basic_block bb)
 {
-  gcc_checking_assert (bb->index < m_tab_size);
-  return m_tab[bb->index] != NULL;
+  if (bb->index < m_tab_size)
+return m_tab[bb->index] != NULL;
+  return false;
 }
 
 // This class implements the on entry cache via a sparse bitmap.
-- 
2.31.1
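
The sizing rule in sbr_vector::grow can be read in isolation.  What follows
is a minimal standalone sketch of just that arithmetic, assuming plain ints;
grown_size is a hypothetical name used for illustration and is not part of
the patch.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical helper (not in the patch): returns the table size that the
// grow () policy would pick, given the current number of basic blocks and
// the current table size.
int
grown_size (int curr_bb_size, int tab_size)
{
  // grow () is only reached once the CFG has outgrown the table.
  assert (curr_bb_size > tab_size);

  // Pick the largest of: 128, twice the shortfall, 10% of the block count.
  int inc = std::max ((curr_bb_size - tab_size) * 2, 128);
  inc = std::max (inc, curr_bb_size / 10);
  return inc + curr_bb_size;
}
```

With 100 blocks and a 90-entry table the shortfall is small, so the floor of
128 applies and the new size is 228; with 10000 blocks and a 9999-entry table
the 10% term wins and the new size is 11000.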

RE: [PATCH v4 1/1] [ARM] Add support for TLS register based stack protector canary access

2021-11-10 Thread Kyrylo Tkachov via Gcc-patches

Hi Ard,

Thanks for working on this, comments inline.

> -Original Message-
> From: Gcc-patches  bounces+kyrylo.tkachov=arm@gcc.gnu.org> On Behalf Of Ard
> Biesheuvel via Gcc-patches
> Sent: Thursday, October 28, 2021 12:27 PM
> To: linux-harden...@vger.kernel.org
> Cc: keesc...@chromium.org; Richard Sandiford
> ; thomas.preudho...@celest.fr; Keith
> Packard ; gcc-patches@gcc.gnu.org; Ard Biesheuvel
> 
> Subject: [PATCH v4 1/1] [ARM] Add support for TLS register based stack
> protector canary access
> 
> Add support for accessing the stack canary value via the TLS register,
> so that multiple threads running in the same address space can use
> distinct canary values. This is intended for the Linux kernel running in
> SMP mode, where processes entering the kernel are essentially threads
> running the same program concurrently: using a global variable for the
> canary in that context is problematic because it can never be rotated,
> and so the OS is forced to use the same value as long as it remains up.
> 
> Using the TLS register to index the stack canary helps with this, as it
> allows each CPU to context switch the TLS register along with the rest
> of the process, permitting each process to use its own value for the
> stack canary.
> 
> 2021-10-28 Ard Biesheuvel 
> 
>   * config/arm/arm-opts.h (enum stack_protector_guard): New
>   * config/arm/arm-protos.h (arm_stack_protect_tls_canary_mem):
>   New
>   * config/arm/arm.c (TARGET_STACK_PROTECT_GUARD): Define
>   (arm_option_override_internal): Handle and put in error checks
>   for stack protector guard options.
>   (arm_option_reconfigure_globals): Likewise
>   (arm_stack_protect_tls_canary_mem): New
>   (arm_stack_protect_guard): New
>   * config/arm/arm.md (stack_protect_set): New
>   (stack_protect_set_tls): Likewise
>   (stack_protect_test): Likewise
>   (stack_protect_test_tls): Likewise
>   (reload_tp_hard): Likewise
>   * config/arm/arm.opt (-mstack-protector-guard): New
>   (-mstack-protector-guard-offset): New.
>   * doc/invoke.texi: Document new options
> 

How has this been tested? The code looks mostly okay to me, but the rules for 
patches require a bootstrap and run of the testsuite:
https://gcc.gnu.org/contribute.html#testing
If you don't have access to an arm machine, the GCC compile farm may be of use: 
https://gcc.gnu.org/wiki/CompileFarm

In terms of tests, as Qing says, we'd like to see some additions to the 
testsuite.
These would go into the testsuite/gcc.target/arm directory.
You can grep for "mstack-protector-guard" in the testsuite/ directory to see 
how various targets test this functionality and copy/adapt some tests for arm.

> Signed-off-by: Ard Biesheuvel 
> ---
>  gcc/config/arm/arm-opts.h   |  6 ++
>  gcc/config/arm/arm-protos.h |  2 +
>  gcc/config/arm/arm.c| 55 +++
>  gcc/config/arm/arm.md   | 71 +++-
>  gcc/config/arm/arm.opt  | 22 ++
>  gcc/doc/invoke.texi |  9 +++
>  6 files changed, 163 insertions(+), 2 deletions(-)
> 
> diff --git a/gcc/config/arm/arm-opts.h b/gcc/config/arm/arm-opts.h
> index 5c4b62f404f7..581ba3c4fbbb 100644
> --- a/gcc/config/arm/arm-opts.h
> +++ b/gcc/config/arm/arm-opts.h
> @@ -69,4 +69,10 @@ enum arm_tls_type {
>TLS_GNU,
>TLS_GNU2
>  };
> +
> +/* Where to get the canary for the stack protector.  */
> +enum stack_protector_guard {
> +  SSP_TLSREG,  /* per-thread canary in TLS register */
> +  SSP_GLOBAL   /* global canary */
> +};
>  #endif
> diff --git a/gcc/config/arm/arm-protos.h b/gcc/config/arm/arm-protos.h
> index 9b1f61394ad7..d8d605920c97 100644
> --- a/gcc/config/arm/arm-protos.h
> +++ b/gcc/config/arm/arm-protos.h
> @@ -195,6 +195,8 @@ extern void arm_split_atomic_op (enum rtx_code,
> rtx, rtx, rtx, rtx, rtx, rtx);
>  extern rtx arm_load_tp (rtx);
>  extern bool arm_coproc_builtin_available (enum unspecv);
>  extern bool arm_coproc_ldc_stc_legitimate_address (rtx);
> +extern rtx arm_stack_protect_tls_canary_mem (bool);
> +
> 
>  #if defined TREE_CODE
>  extern void arm_init_cumulative_args (CUMULATIVE_ARGS *, tree, rtx, tree);
> diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
> index c4ff06b087eb..6a659d81a6fe 100644
> --- a/gcc/config/arm/arm.c
> +++ b/gcc/config/arm/arm.c
> @@ -829,6 +829,9 @@ static const struct attribute_spec
> arm_attribute_table[] =
> 
>  #undef TARGET_MD_ASM_ADJUST
>  #define TARGET_MD_ASM_ADJUST arm_md_asm_adjust
> +
> +#undef TARGET_STACK_PROTECT_GUARD
> +#define TARGET_STACK_PROTECT_GUARD arm_stack_protect_guard
> 
> 
> 
>  /* Obstack for minipool constant handling.  */
>  static struct obstack minipool_obstack;
> @@ -3155,6 +3158,26 @@ arm_option_override_internal (struct
> gcc_options *opts,
>if (TARGET_THUMB2_P (opts->x_target_flags))
>  opts->x_inline_asm_unified = true;
> 
> +  if (arm_stack_protector_guard == SSP_GLOBAL
> +  && opts->x_arm_stack_protector_g

Re: [PATCH] PR middle-end/103059: reload: Also accept ASHIFT with indexed addressing

2021-11-10 Thread Maciej W. Rozycki

On Mon, 8 Nov 2021, Jeff Law wrote:

> >   Well, the context of this code (around and including hunk #1) is:
> > 
> >else if (insn_extra_address_constraint
> >(lookup_constraint (constraints[i])))
> > {
> >   address_operand_reloaded[i]
> > = find_reloads_address (recog_data.operand_mode[i], (rtx*) 0,
> > recog_data.operand[i],
> > recog_data.operand_loc[i],
> > i, operand_type[i], ind_levels, insn);
> > 
> >   /* If we now have a simple operand where we used to have a
> >  PLUS or MULT, re-recognize and try again.  */
> >   if ((OBJECT_P (*recog_data.operand_loc[i])
> >|| GET_CODE (*recog_data.operand_loc[i]) == SUBREG)
> >   && (GET_CODE (recog_data.operand[i]) == MULT
> >   || GET_CODE (recog_data.operand[i]) == PLUS))
> > {
> >   INSN_CODE (insn) = -1;
> >   retval = find_reloads (insn, replace, ind_levels, live_known,
> >  reload_reg_p);
> >   return retval;
> > }
> > 
> > so the body of the conditional is specifically executed for an address and
> > not a MEM; in this particular case matched with the plain "p" constraint.
> > 
> >   MEMs are handled with the next conditional right below.
> Ah!  Thanks for the clarification.  We're digging deep into history here.  I
> always thought this code was re-recognizing inside a MEM, but as you note,
> it's actually handling stuff outside the MEM, such as a 'p' constraint, which
> is an address, but being outside a MEM means it's not subject to the
> mult-by-power-of-2 canonicalization.
> 
> So I think the first hunk is fine.  There's two others that twiddle
> find_reloads_address_1, which I think can only be reached from
> find_reloads_address.  The comment at the front would indicate it's only
> called where AD is inside a MEM.

 It's actually hunk #2 that fixes this specific ICE.  The other two are 
just a consequence: #3 is a commutative variant of the same case, and #1 
follows from observing that the rtx may now have changed if it was an 
ASHIFT too.
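
 The underlying point is that a scaled index can show up in two spellings: 
as a MULT by a power of two (the form canonical inside a MEM) or as an 
ASHIFT (the form seen here for a bare address).  A toy sketch of treating 
both spellings alike follows; the enum and the helper are made up for 
illustration and are not GCC's RTL API.

```cpp
#include <cassert>

// Toy operation codes; stand-ins only, not GCC's enum rtx_code.
enum class toy_code { REG, PLUS, MULT, ASHIFT };

// Hypothetical helper: return the index scale factor described by an
// operation code and its constant operand, or 0 if this is not a scaled
// index at all.
int
index_scale (toy_code code, int constant)
{
  switch (code)
    {
    case toy_code::MULT:
      // (mult reg 8) scales the index by 8.
      return constant;
    case toy_code::ASHIFT:
      // (ashift reg 3) also scales by 8; reject shift counts that would
      // not fit in an int.
      return (constant >= 0 && constant < 31) ? (1 << constant) : 0;
    default:
      return 0;
    }
}
```

 With this, (ashift reg 3) and (mult reg 8) both report a scale of 8, which 
is the equivalence the hunks rely on.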

> Are we getting into find_reloads_address_1 in any case where the RTL is not an
> address inside a MEM?

 I've had a GDB session left open with the problematic source, so it was 
merely a case of a rerun and grabbing some data.  So with a breakpoint set 
at reload.c:5565, conditionalised on (code0 == ASHIFT || code1 == ASHIFT), 
we get exactly this, as with my change description:

Breakpoint 52, find_reloads_address_1 (mode=E_DImode, as=0 '\000', 
x=0x7fffedbaf7b0, context=0, outer_code=MEM, index_code=SCRATCH, 
loc=0x761a82f0, opnum=1, type=RELOAD_FOR_INPUT, ind_levels=1, 
insn=0x7fffefc1c9c0) at .../gcc/reload.c:5565
5565if (code0 == MULT || code0 == SIGN_EXTEND || code0 == TRUNCATE
(gdb) print code0
$12958 = ASHIFT
(gdb) print code1
$12959 = PLUS
(gdb) print outer_code
$12960 = MEM
(gdb) pr insn
(insn 2051 2050 2052 180 (set (reg/f:SI 0 %r0 [555])
(plus:SI (ashift:SI (reg/v:SI 154 [ n_ctrs ])
(const_int 3 [0x3]))
(plus:SI (reg/v/f:SI 9 %r9 [orig:176 fn_buffer ] [176])
(const_int 24 [0x18] ".../libgcc/libgcov-driver.c":172:40 
614 {movaddrdi}
 (nil))
(gdb) pr x
(plus:SI (ashift:SI (reg/v:SI 154 [ n_ctrs ])
(const_int 3 [0x3]))
(plus:SI (reg/v/f:SI 9 %r9 [orig:176 fn_buffer ] [176])
(const_int 24 [0x18])))
(gdb) bt
#0  find_reloads_address_1 (mode=E_DImode, as=0 '\000', x=0x7fffedbaf7b0, 
context=0, outer_code=MEM, index_code=SCRATCH, loc=0x761a82f0, opnum=1, 
type=RELOAD_FOR_INPUT, ind_levels=1, insn=0x7fffefc1c9c0) at 
.../gcc/reload.c:5565
#1  0x111ecd18 in find_reloads_address (mode=E_DImode, memrefloc=0x0, 
ad=0x7fffedbaf7b0, loc=0x761a82f0, opnum=1, type=RELOAD_FOR_INPUT, 
ind_levels=1, insn=0x7fffefc1c9c0) at .../gcc/reload.c:5264
#2  0x111e2fbc in find_reloads (insn=0x7fffefc1c9c0, replace=1, 
ind_levels=1, live_known=1, reload_reg_p=0x12ec7770 ) at 
.../gcc/reload.c:2843
#3  0x112060f4 in reload_as_needed (live_known=1) at 
.../gcc/reload1.c:4522
#4  0x111f9008 in reload (first=0x75dd3c28, global=1) at 
.../gcc/reload1.c:1047
#5  0x10f1458c in do_reload () at .../gcc/ira.c:5944
#6  0x10f14d54 in (anonymous namespace)::pass_reload::execute 
(this=0x12f21d20) at .../gcc/ira.c:6118
#7  0x1112472c in execute_one_pass (pass=0x12f21d20) at 
.../gcc/passes.c:2567
#8  0x11124bc4 in execute_pass_list_1 (pass=0x12f21d20) at 
.../gcc/passes.c:2656
#9  0x11124c0c in execute_pass_list_1 (pass=0x12f20b80) at 
.../gcc/passes.c:2657
#10 0x11124cac in execute_pass_list (fn=0x75dc4b00, 
pass=0x12f1c900) at .../gcc/passes.c:2667
#11 0x109b64f4 in cgraph_node::expand (this=0x75d65a50) at 
.../gcc/cgraphunit.c:1828
#12 0x109b6eac in expand_all_functions () at .../

[COMMITTED] path solver: Adjustments for use outside of the backward threader.

2021-11-10 Thread Aldy Hernandez via Gcc-patches

Here are some enhancements to make it easier for other clients to use
the path solver.

First, I've made the imports to the solver optional since we can
calculate them ourselves.  However, I've left the ability to set them,
since the backward threader adds a few SSA names in addition to the
default ones.  As a follow-up I may move all the import set up code
from the threader to the solver, as the extra imports tend to improve
the behavior slightly.

Second, Richi suggested an entry point where you just feed the solver
an edge, which will be quite convenient for a subsequent patch adding
a client in the header copying pass.  The required some shuffling,
since we'll be adding the blocks on the fly.  There's now a vector
copy, but the impact will be minimal, since these are just 5-6 entries
at the most.
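
The ownership change can be pictured with a toy model.  This is only a
sketch using std::vector and int block indices; toy_edge and toy_path_query
are hypothetical names, not the classes touched by the patch.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Toy stand-in for an edge between two basic blocks; not a GCC type.
struct toy_edge { int src; int dest; };

class toy_path_query
{
public:
  // Copy the caller's path, so the query no longer depends on its lifetime.
  void set_path (const std::vector<int> &path)
  {
    assert (path.size () > 1);
    m_path = path;
  }

  // Convenience entry point: build a two-block path on the fly.  As in the
  // patch, the destination is pushed first and the source second.
  void set_path (const toy_edge &e)
  {
    std::vector<int> bbs;
    bbs.reserve (2);
    bbs.push_back (e.dest);
    bbs.push_back (e.src);
    set_path (bbs);
    // bbs is destroyed here; m_path stays valid because it is a copy.
  }

  bool contains (int bb) const
  {
    return std::find (m_path.begin (), m_path.end (), bb) != m_path.end ();
  }

  std::size_t length () const { return m_path.size (); }
  int at (std::size_t i) const { return m_path.at (i); }

private:
  std::vector<int> m_path;  // Owned copy rather than a borrowed pointer.
};
```

Had m_path remained a pointer to the caller's vector, the edge overload
would leave it dangling as soon as the temporary went out of scope, which
is why the copy is needed once blocks are added on the fly.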

Tested on ppc64le Linux.

gcc/ChangeLog:

* gimple-range-path.cc (path_range_query::path_range_query): Do
not init m_path.
(path_range_query::dump): Change m_path uses to non-pointer.
(path_range_query::defined_outside_path):  Same.
(path_range_query::set_path): Same.
(path_range_query::add_copies_to_imports): Same.
(path_range_query::range_of_stmt): Same.
(path_range_query::compute_outgoing_relations): Same.
(path_range_query::compute_ranges): Imports are now optional.
Implement overload that takes an edge.
* gimple-range-path.h (class path_range_query): Make imports
optional for compute_ranges.  Add compute_ranges(edge) overload.
Make m_path an auto_vec instead of a pointer and adjust
accordingly.
---
 gcc/gimple-range-path.cc | 41 +---
 gcc/gimple-range-path.h  | 17 +
 2 files changed, 39 insertions(+), 19 deletions(-)

diff --git a/gcc/gimple-range-path.cc b/gcc/gimple-range-path.cc
index 99ac947581b..6da01c7067f 100644
--- a/gcc/gimple-range-path.cc
+++ b/gcc/gimple-range-path.cc
@@ -41,7 +41,6 @@ path_range_query::path_range_query (gimple_ranger &ranger, 
bool resolve)
 {
   m_cache = new ssa_global_cache;
   m_has_cache_entry = BITMAP_ALLOC (NULL);
-  m_path = NULL;
   m_resolve = resolve;
   m_oracle = new path_oracle (ranger.oracle ());
 }
@@ -92,13 +91,13 @@ path_range_query::dump (FILE *dump_file)
 {
   push_dump_file save (dump_file, dump_flags & ~TDF_DETAILS);
 
-  if (m_path->is_empty ())
+  if (m_path.is_empty ())
 return;
 
   unsigned i;
   bitmap_iterator bi;
 
-  dump_ranger (dump_file, *m_path);
+  dump_ranger (dump_file, m_path);
 
   fprintf (dump_file, "Imports:\n");
   EXECUTE_IF_SET_IN_BITMAP (m_imports, 0, i, bi)
@@ -125,7 +124,7 @@ path_range_query::defined_outside_path (tree name)
   gimple *def = SSA_NAME_DEF_STMT (name);
   basic_block bb = gimple_bb (def);
 
-  return !bb || !m_path->contains (bb);
+  return !bb || !m_path.contains (bb);
 }
 
 // Return the range of NAME on entry to the path.
@@ -230,8 +229,8 @@ void
 path_range_query::set_path (const vec &path)
 {
   gcc_checking_assert (path.length () > 1);
-  m_path = &path;
-  m_pos = m_path->length () - 1;
+  m_path = path.copy ();
+  m_pos = m_path.length () - 1;
   bitmap_clear (m_has_cache_entry);
 }
 
@@ -486,7 +485,7 @@ path_range_query::add_copies_to_imports ()
  tree arg = gimple_phi_arg (phi, i)->def;
 
  if (TREE_CODE (arg) == SSA_NAME
- && m_path->contains (e->src)
+ && m_path.contains (e->src)
  && bitmap_set_bit (m_imports, SSA_NAME_VERSION (arg)))
worklist.safe_push (arg);
}
@@ -497,7 +496,8 @@ path_range_query::add_copies_to_imports ()
 // Compute the ranges for IMPORTS along PATH.
 //
 // IMPORTS are the set of SSA names, any of which could potentially
-// change the value of the final conditional in PATH.
+// change the value of the final conditional in PATH.  Default to the
+// imports of the last block in the path if none is given.
 
 void
 path_range_query::compute_ranges (const vec &path,
@@ -507,9 +507,16 @@ path_range_query::compute_ranges (const vec 
&path,
 fprintf (dump_file, "\n==\n");
 
   set_path (path);
-  bitmap_copy (m_imports, imports);
   m_undefined_path = false;
 
+  if (imports)
+bitmap_copy (m_imports, imports);
+  else
+{
+  bitmap imports = m_ranger.gori ().imports (exit_bb ());
+  bitmap_copy (m_imports, imports);
+}
+
   if (m_resolve)
 {
   add_copies_to_imports ();
@@ -561,6 +568,18 @@ path_range_query::compute_ranges (const vec 
&path,
 }
 }
 
+// Convenience function to compute ranges along a path consisting of
+// E->SRC and E->DEST.
+
+void
+path_range_query::compute_ranges (edge e)
+{
+  auto_vec bbs (2);
+  bbs.quick_push (e->dest);
+  bbs.quick_push (e->src);
+  compute_ranges (bbs);
+}
+
 // A folding aid used to register and query relations along a path.
 // When queried, it returns relations as they would appear on exit to
 // th

Re: [PATCH] vect: Remove vec_outside/inside_cost fields

2021-11-10 Thread Martin Liška


On 11/8/21 11:43, Richard Sandiford via Gcc-patches wrote:

> Tested on aarch64-linux-gnu and x86_64-linux-gnu. OK to install?


I think the patch causes the following on x86_64-linux-gnu:
FAIL: gfortran.dg/inline_matmul_17.f90   -O   scan-tree-dump-times optimized 
"matmul_r4" 2

Martin

[PATCH] c++: template-id ADL and partial instantiation [PR99911]

2021-11-10 Thread Patrick Palka via Gcc-patches

Here when partially instantiating the call get(T{}) with T=N::A
(for which earlier unqualified name lookup for 'get' found nothing)
the arguments after substitution are no longer dependent but the callee
still is, so perform_koenig_lookup postpones ADL.  But then we go on to
diagnose the unresolved template name anyway, as if ADL was already
performed and failed.

This patch fixes this by avoiding the error path in question when the
template arguments of an unresolved template-id are dependent, which
mirrors the dependence check in perform_koenig_lookup.  In passing, this
patch also disables the -fpermissive fallback that performs a second
unqualified lookup in the template-id ADL case; this fallback seems to be
intended for legacy code and shouldn't be used for C++20 template-id ADL.

Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK for
trunk and perhaps 11?

PR c++/99911

gcc/cp/ChangeLog:

* pt.c (tsubst_copy_and_build) : Don't diagnose
name lookup failure if the arguments to an unresolved template
name are still dependent.  Disable the -fpermissive fallback for
template-id ADL.

gcc/testsuite/ChangeLog:

* g++.dg/cpp2a/fn-template24.C: New test.
---
 gcc/cp/pt.c|  6 --
 gcc/testsuite/g++.dg/cpp2a/fn-template24.C | 16 
 2 files changed, 20 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/cpp2a/fn-template24.C

diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c
index 991a20a85d4..4beddf9caf8 100644
--- a/gcc/cp/pt.c
+++ b/gcc/cp/pt.c
@@ -20427,12 +20427,14 @@ tsubst_copy_and_build (tree t,
if (function != NULL_TREE
&& (identifier_p (function)
|| (TREE_CODE (function) == TEMPLATE_ID_EXPR
-   && identifier_p (TREE_OPERAND (function, 0
+   && identifier_p (TREE_OPERAND (function, 0))
+   && !any_dependent_template_arguments_p (TREE_OPERAND
+   (function, 1
&& !any_type_dependent_arguments_p (call_args))
  {
if (TREE_CODE (function) == TEMPLATE_ID_EXPR)
  function = TREE_OPERAND (function, 0);
-   if (koenig_p && (complain & tf_warning_or_error))
+   else if (koenig_p && (complain & tf_warning_or_error))
  {
/* For backwards compatibility and good diagnostics, try
   the unqualified lookup again if we aren't in SFINAE
diff --git a/gcc/testsuite/g++.dg/cpp2a/fn-template24.C 
b/gcc/testsuite/g++.dg/cpp2a/fn-template24.C
new file mode 100644
index 000..b444ac6a273
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp2a/fn-template24.C
@@ -0,0 +1,16 @@
+// PR c++/99911
+// { dg-do compile { target c++20 } }
+
+namespace N {
+  struct A { };
+  template void get(A);
+};
+
+template
+auto f() {
+  return [](U) { get(T{}); };
+}
+
+int main() {
+  f()(0);
+}
-- 
2.34.0.rc1.14.g88d915a634

Re: [PATCH v4 1/1] [ARM] Add support for TLS register based stack protector canary access

2021-11-10 Thread Qing Zhao via Gcc-patches



> On Nov 9, 2021, at 4:02 PM, Ard Biesheuvel  wrote:
> 
> On Tue, 9 Nov 2021 at 21:45, Qing Zhao  wrote:
>> 
>> Hi, Ard,
>> 
>> Sorry for the late reply (since I don’t have the right to approve a patch, I
>> have been waiting for an arm port maintainer to review this patch).
>> The following is the arm port maintainer information I got from the MAINTAINERS
>> file (you might want to explicitly cc one of them for a review):
>> 
>> arm portNick Clifton
>> arm portRichard Earnshaw
>> arm portRamana Radhakrishnan
>> 
>> arm portKyrylo Tkachov  
>> 
>> I see that Ramana implemented a similar patch for aarch64 (commit
>> cd0b2d361df82c848dc7e1c3078651bb0624c3c6), so I am CCing him on this
>> email. Hopefully he will review this patch.
>> 
> 
> Thank you Qing. But I know Ramana well, and I know he no longer works
> on GCC. I collaborated with him on the AArch64 implementation at the
> time (but he wrote all the code)

Good to know this. Then we might need to update the MAINTAINERS file to reflect
this fact (i.e., delete Ramana from the arm port list). However, this change does
not relate to your current patch.
> 
>> Anyway, I briefly read your patch (version 4), and have the following 
>> questions and comments:
>> 
>> 1.  When the option -mstack-protector-guard=tls is present, should the option
>> -mstack-protector-guard-offset=.. also be required?
>> If it is required, you might want to add that requirement to
>> the documentation, and also issue an error when it’s not present.
>> It’s not clear right now from the current implementation, so you might
>> need to update both "arm_option_override_internal" in arm.c
>> and doc/invoke.texi to make this clear.
>> 
> 
> An  offset of 0x0 is a reasonable default, so I don't think it is
> necessary to require the offset param to be passed in that case.

Then it might be good to make this clear in the documentation (the invoke.texi
file).

> 
>> 2. For arm, is there only one system register that can be used for this purpose?
>> 
> 
> There are other registers that might be used in the same way, but the
> TLS register is the obvious choice. On AArch64, we decided to use
> 'sysreg' and permit the user to specify the register because the Linux
> kernel uses the user space stack pointer (SP_EL0), which is kind of
> odd so we did not want to hard code that.

Okay.

> 
>> 3. For the functionality you added, I didn’t see any test cases added; I
>> think test cases are needed.
>> 
> 
> Yes, I am aware of that. I'm just not sure I know how to proceed here:
> any pointers?
Looks like Kyrylo has provided you with info on this part in the other mail.

> 
>> More comments are embedded below:
>> 
>>> On Oct 28, 2021, at 6:27 AM, Ard Biesheuvel  wrote:
>>> 
>>> Add support for accessing the stack canary value via the TLS register,
>>> so that multiple threads running in the same address space can use
>>> distinct canary values. This is intended for the Linux kernel running in
>>> SMP mode, where processes entering the kernel are essentially threads
>>> running the same program concurrently: using a global variable for the
>>> canary in that context is problematic because it can never be rotated,
>>> and so the OS is forced to use the same value as long as it remains up.
>>> 
>>> Using the TLS register to index the stack canary helps with this, as it
>>> allows each CPU to context switch the TLS register along with the rest
>>> of the process, permitting each process to use its own value for the
>>> stack canary.
>>> 
>>> 2021-10-28 Ard Biesheuvel 
>>> 
>>>  * config/arm/arm-opts.h (enum stack_protector_guard): New
>>>  * config/arm/arm-protos.h (arm_stack_protect_tls_canary_mem):
>>>  New
>>>  * config/arm/arm.c (TARGET_STACK_PROTECT_GUARD): Define
>>>  (arm_option_override_internal): Handle and put in error checks
>>>  for stack protector guard options.
>>>  (arm_option_reconfigure_globals): Likewise
>>>  (arm_stack_protect_tls_canary_mem): New
>>>  (arm_stack_protect_guard): New
>>>  * config/arm/arm.md (stack_protect_set): New
>>>  (stack_protect_set_tls): Likewise
>>>  (stack_protect_test): Likewise
>>>  (stack_protect_test_tls): Likewise
>>>  (reload_tp_hard): Likewise
>>>  * config/arm/arm.opt (-mstack-protector-guard): New
>>>  (-mstack-protector-guard-offset): New.
>>>  * doc/invoke.texi: Document new options
>>> 
>>> Signed-off-by: Ard Biesheuvel 
>>> ---
>>> gcc/config/arm/arm-opts.h   |  6 ++
>>> gcc/config/arm/arm-protos.h |  2 +
>>> gcc/config/arm/arm.c| 55 +++
>>> gcc/config/arm/arm.md   | 71 +++-
>>> gcc/config/arm/arm.opt  | 22 ++
>>> gcc/doc/invoke.texi |  9 +++
>>> 6 files changed, 163 insertions(+), 2 deletions(-)
>>> 
>>> diff --git a/gcc/config/arm/arm-opts.h b/gcc/config/arm/arm-opts.h
>>> index 5c4b62f404f7..

Re: [PATCH] arm: Initialize vector costing fields

2021-11-10 Thread Christophe Lyon via Gcc-patches

On Wed, Nov 10, 2021 at 4:34 PM Kyrylo Tkachov via Gcc-patches <
gcc-patches@gcc.gnu.org> wrote:

> Hi Christophe
>
> > -Original Message-
> > From: Gcc-patches  > bounces+kyrylo.tkachov=arm@gcc.gnu.org> On Behalf Of Christophe
> > Lyon via Gcc-patches
> > Sent: Monday, November 8, 2021 6:13 PM
> > To: gcc-patches@gcc.gnu.org
> > Subject: [PATCH] arm: Initialize vector costing fields
> >
> > The movi, dup and extract costing fields were recently added to struct
> > vector_cost_table, but their initialization is missing for the arm
> > (aarch32) specific descriptions.
> >
> > Although the arm port does not use these fields (only aarch64 does),
> > this is causing warnings during the build, and even build failures
> > when using gcc-4.8.5 as host compiler:
> >
> > /gccsrc/gcc/config/arm/arm.c:1194:1: error: uninitialized const member
> > 'vector_cost_table::movi'
> >  };
> >   ^
> > /gccsrc/gcc/config/arm/arm.c:1194:1: warning: missing initializer for
> member
> > 'vector_cost_table::movi' [-Wmissing-field-initializers]
> > /gccsrc/gcc/config/arm/arm.c:1194:1: error: uninitialized const member
> > 'vector_cost_table::dup'
> > /gccsrc/gcc/config/arm/arm.c:1194:1: warning: missing initializer for
> member
> > 'vector_cost_table::dup' [-Wmissing-field-initializers]
> > /gccsrc/gcc/config/arm/arm.c:1194:1: error: uninitialized const member
> > 'vector_cost_table::extract'
> > /gccsrc/gcc/config/arm/arm.c:1194:1: warning: missing initializer for
> member
> > 'vector_cost_table::extract' [-Wmissing-field-initializers]
> >
> > This patch uses the same initialization values as in aarch64 for
> > consistency:
> > +COSTS_N_INSNS (1),  /* movi.  */
> > +COSTS_N_INSNS (2),  /* dup.  */
> > +COSTS_N_INSNS (2)   /* extract.  */
> >
> > But given these fields are not used, maybe a dummy value should be
> > used instead? (zero?)
>
> They're dummy values for now, but there's no reason why the backend
> couldn't be extended to use them in the future.
> Anyway, this patch is okay as is.
>
>
Thanks, pushed as  r12-5132-g1200e211a823816e47a9312efab61a60e12e33e5

Christophe

Thanks,
> Kyrill
>
> >
> > 2021-11-08  Christophe Lyon  
> >
> >   gcc/
> >   * config/arm/arm.c (cortexa9_extra_costs, cortexa8_extra_costs,
> >   cortexa5_extra_costs, cortexa7_extra_costs,
> >   cortexa12_extra_costs, cortexa15_extra_costs, v7m_extra_costs):
> >   Initialize movi, dup and extract costing fields.
> > ---
> >  gcc/config/arm/arm.c | 35 ---
> >  1 file changed, 28 insertions(+), 7 deletions(-)
> >
> > diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
> > index 6c6e77fab66..3f5e1162853 100644
> > --- a/gcc/config/arm/arm.c
> > +++ b/gcc/config/arm/arm.c
> > @@ -1197,7 +1197,10 @@ const struct cpu_cost_table cortexa9_extra_costs
> > =
> >/* Vector */
> >{
> >  COSTS_N_INSNS (1),   /* alu.  */
> > -COSTS_N_INSNS (4)/* mult.  */
> > +COSTS_N_INSNS (4),   /* mult.  */
> > +COSTS_N_INSNS (1),   /* movi.  */
> > +COSTS_N_INSNS (2),   /* dup.  */
> > +COSTS_N_INSNS (2)/* extract.  */
> >}
> >  };
> >
> > @@ -1301,7 +1304,10 @@ const struct cpu_cost_table cortexa8_extra_costs
> > =
> >/* Vector */
> >{
> >  COSTS_N_INSNS (1),   /* alu.  */
> > -COSTS_N_INSNS (4)/* mult.  */
> > +COSTS_N_INSNS (4),   /* mult.  */
> > +COSTS_N_INSNS (1),   /* movi.  */
> > +COSTS_N_INSNS (2),   /* dup.  */
> > +COSTS_N_INSNS (2)/* extract.  */
> >}
> >  };
> >
> > @@ -1406,7 +1412,10 @@ const struct cpu_cost_table cortexa5_extra_costs
> > =
> >/* Vector */
> >{
> >  COSTS_N_INSNS (1),   /* alu.  */
> > -COSTS_N_INSNS (4)/* mult.  */
> > +COSTS_N_INSNS (4),   /* mult.  */
> > +COSTS_N_INSNS (1),   /* movi.  */
> > +COSTS_N_INSNS (2),   /* dup.  */
> > +COSTS_N_INSNS (2)/* extract.  */
> >}
> >  };
> >
> > @@ -1512,7 +1521,10 @@ const struct cpu_cost_table cortexa7_extra_costs
> > =
> >/* Vector */
> >{
> >  COSTS_N_INSNS (1),   /* alu.  */
> > -COSTS_N_INSNS (4)/* mult.  */
> > +COSTS_N_INSNS (4),   /* mult.  */
> > +COSTS_N_INSNS (1),   /* movi.  */
> > +COSTS_N_INSNS (2),   /* dup.  */
> > +COSTS_N_INSNS (2)/* extract.  */
> >}
> >  };
> >
> > @@ -1616,7 +1628,10 @@ const struct cpu_cost_table
> > cortexa12_extra_costs =
> >/* Vector */
> >{
> >  COSTS_N_INSNS (1),   /* alu.  */
> > -COSTS_N_INSNS (4)/* mult.  */
> > +COSTS_N_INSNS (4),   /* mult.  */
> > +COSTS_N_INSNS (1),   /* movi.  */
> > +COSTS_N_INSNS (2),   /* dup.  */
> > +COSTS_N_INSNS (2)/* extract.  */
> >}
> >  };
> >
> > @@ -1720,7 +1735,10 @@ const struct cpu_cost_table
> > cortexa15_extra_costs =
> >/* Vector */
> >{
> >  COSTS_N_INSNS (1),   /* alu.  */
> > -COSTS_N_I

Re: [PATCH] c++: use auto_vec in cp_parser_template_argument_list

2021-11-10 Thread Patrick Palka via Gcc-patches

On Wed, Nov 10, 2021 at 12:16 AM Jason Merrill  wrote:
>
> On 11/9/21 13:42, Patrick Palka wrote:
> > On Tue, 9 Nov 2021, Jason Merrill wrote:
> >
> >> On 11/9/21 11:02, Patrick Palka wrote:
> >>> Bootstrapped and regtested on x86_64-pc-linux-gnu, OK for trunk?
> >>
> >> OK, though I wonder about using releasing_vec instead of auto_vec; reusing
> >> a previously allocated vec vs. building one on the stack.
> >
> > Thanks a lot.  And hmm, I think reusing a previously allocated vec
> > here would be tricky since cp_parser_template_argument_list can be
> > called recursively, and so only the outermost call would be able to
> > benefit from reuse unless we perhaps maintain a freelist of such vecs
> > IIUC.
>
> We do maintain such a freelist, in make_tree_vector/release_tree_vector,
> which releasing_vec is a wrapper class for.

Oops sorry, I had totally forgotten about that aspect of
make/release_tree_vector.  Hmm, so given that a template argument list
is very likely to be very small, whereas a vector obtained from the
freelist can have an arbitrarily large capacity (depending on what it
stored for its previous users), it seems more resource-efficient overall to
use a stack-allocated auto_vec here IMHO.
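
For readers less familiar with the pattern, here is a rough standalone
analogue of the loop after the patch: collect items into a growable scratch
container, then copy them into an exactly sized result.  It is only a
sketch; collect_args is a hypothetical name, strings stand in for trees, and
std::vector (which, unlike auto_vec, has no inline storage) stands in for
the scratch vector.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical analogue of the argument-list loop: split INPUT on commas
// and return the pieces in an exactly sized container.
std::vector<std::string>
collect_args (const std::string &input)
{
  // Scratch container; plays the role of the auto_vec in the patch.
  std::vector<std::string> args;

  std::istringstream in (input);
  std::string item;
  while (std::getline (in, item, ','))
    args.push_back (item);

  // Copy into the final result in order, as the patch does with
  // TREE_VEC_ELT.
  std::vector<std::string> result (args.size ());
  for (std::size_t i = 0; i < args.size (); i++)
    result[i] = args[i];
  return result;
}
```

The scratch container is released automatically on return, so there is no
counterpart to the explicit free of the old arg_ary.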

>
> > Seems like that would complicate the code enough to not be worth
> > it.
> >
> >>> gcc/cp/ChangeLog:
> >>>
> >>> * parser.c (cp_parser_template_argument_list): Use auto_vec
> >>> instead of manual memory management.
> >>> ---
> >>>gcc/cp/parser.c | 35 ---
> >>>1 file changed, 8 insertions(+), 27 deletions(-)
> >>>
> >>> diff --git a/gcc/cp/parser.c b/gcc/cp/parser.c
> >>> index 32de97b08bd..8823399529e 100644
> >>> --- a/gcc/cp/parser.c
> >>> +++ b/gcc/cp/parser.c
> >>> @@ -18558,11 +18558,6 @@ cp_parser_template_name (cp_parser* parser,
> >>>static tree
> >>>cp_parser_template_argument_list (cp_parser* parser)
> >>>{
> >>> -  tree fixed_args[10];
> >>> -  unsigned n_args = 0;
> >>> -  unsigned alloced = 10;
> >>> -  tree *arg_ary = fixed_args;
> >>> -  tree vec;
> >>>  bool saved_in_template_argument_list_p;
> >>>  bool saved_ice_p;
> >>>  bool saved_non_ice_p;
> >>> @@ -18581,16 +18576,15 @@ cp_parser_template_argument_list (cp_parser*
> >>> parser)
> >>>  parser->non_integral_constant_expression_p = false;
> >>>/* Parse the arguments.  */
> >>> +  auto_vec args;
> >>>  do
> >>>{
> >>> -  tree argument;
> >>> -
> >>> -  if (n_args)
> >>> +  if (!args.is_empty ())
> >>> /* Consume the comma.  */
> >>> cp_lexer_consume_token (parser->lexer);
> >>>/* Parse the template-argument.  */
> >>> -  argument = cp_parser_template_argument (parser);
> >>> +  tree argument = cp_parser_template_argument (parser);
> >>>/* If the next token is an ellipsis, we're expanding a template
> >>> argument pack. */
> >>> @@ -18610,29 +18604,16 @@ cp_parser_template_argument_list (cp_parser*
> >>> parser)
> >>>  argument = make_pack_expansion (argument);
> >>>}
> >>>-  if (n_args == alloced)
> >>> -   {
> >>> - alloced *= 2;
> >>> -
> >>> - if (arg_ary == fixed_args)
> >>> -   {
> >>> - arg_ary = XNEWVEC (tree, alloced);
> >>> - memcpy (arg_ary, fixed_args, sizeof (tree) * n_args);
> >>> -   }
> >>> - else
> >>> -   arg_ary = XRESIZEVEC (tree, arg_ary, alloced);
> >>> -   }
> >>> -  arg_ary[n_args++] = argument;
> >>> +  args.safe_push (argument);
> >>>}
> >>>  while (cp_lexer_next_token_is (parser->lexer, CPP_COMMA));
> >>>-  vec = make_tree_vec (n_args);
> >>> +  int n_args = args.length ();
> >>> +  tree vec = make_tree_vec (n_args);
> >>>-  while (n_args--)
> >>> -TREE_VEC_ELT (vec, n_args) = arg_ary[n_args];
> >>> +  for (int i = 0; i < n_args; i++)
> >>> +TREE_VEC_ELT (vec, i) = args[i];
> >>>-  if (arg_ary != fixed_args)
> >>> -free (arg_ary);
> >>>  parser->non_integral_constant_expression_p = saved_non_ice_p;
> >>>  parser->integral_constant_expression_p = saved_ice_p;
> >>>  parser->in_template_argument_list_p = 
> >>> saved_in_template_argument_list_p;
> >>>
> >>
> >>
> >
>

Re: [PATCH] vect: Remove vec_outside/inside_cost fields

2021-11-10 Thread Richard Sandiford via Gcc-patches

Martin Liška  writes:
> On 11/8/21 11:43, Richard Sandiford via Gcc-patches wrote:
> >> Tested on aarch64-linux-gnu and x86_64-linux-gnu. OK to install?
>
> I think the patch causes the following on x86_64-linux-gnu:
> FAIL: gfortran.dg/inline_matmul_17.f90   -O   scan-tree-dump-times optimized "matmul_r4" 2

I get that failure even with d70ef65692f (from before the patches
I committed today).

Thanks,
Richard

[PATCH] gimple-fold: Transform stp*cpy_chk to strcpy

2021-11-10 Thread Siddhesh Poyarekar

Use the ignore flag to transform BUILT_IN_STPCPY_CHK to BUILT_IN_STRCPY
when set.  This transformation will happen in a subsequent fold anyway
but do it right away and save the additional effort.

gcc/ChangeLog:

* gimple-fold.c (gimple_fold_builtin_stxcpy_chk,
gimple_fold_builtin_stxncpy_chk): Use BUILT_IN_STRNCPY if return
value is not used.

Signed-off-by: Siddhesh Poyarekar 
---
 gcc/gimple-fold.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/gcc/gimple-fold.c b/gcc/gimple-fold.c
index 74c9ce4bdc8..cadccfe3010 100644
--- a/gcc/gimple-fold.c
+++ b/gcc/gimple-fold.c
@@ -3176,7 +3176,7 @@ gimple_fold_builtin_stxcpy_chk (gimple_stmt_iterator *gsi,
 }
 
   /* If __builtin_st{r,p}cpy_chk is used, assume st{r,p}cpy is available.  */
-  fn = builtin_decl_explicit (fcode == BUILT_IN_STPCPY_CHK
+  fn = builtin_decl_explicit (fcode == BUILT_IN_STPCPY_CHK && !ignore
  ? BUILT_IN_STPCPY : BUILT_IN_STRCPY);
   if (!fn)
 return false;
@@ -3220,7 +3220,7 @@ gimple_fold_builtin_stxncpy_chk (gimple_stmt_iterator *gsi,
 return false;
 
   /* If __builtin_st{r,p}ncpy_chk is used, assume st{r,p}ncpy is available.  */
-  fn = builtin_decl_explicit (fcode == BUILT_IN_STPNCPY_CHK
+  fn = builtin_decl_explicit (fcode == BUILT_IN_STPNCPY_CHK && !ignore
  ? BUILT_IN_STPNCPY : BUILT_IN_STRNCPY);
   if (!fn)
 return false;
-- 
2.31.1
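
For readers less familiar with these builtins: the transformation in the
patch above rests on stpcpy and strcpy writing the same bytes and
differing only in what they return.  A minimal standalone sketch of that
property follows, using the plain libc functions rather than the GCC
builtins; same_effect is a helper invented for this illustration, and
the source string is assumed to fit in the 32-byte buffers.

```cpp
#include <cassert>
#include <string.h>

// Invented helper: copy SRC once with stpcpy (discarding the result)
// and once with strcpy, then report whether the destination buffers
// are byte-for-byte identical.  SRC must be shorter than 32 bytes.
static bool
same_effect (const char *src)
{
  char a[32], b[32];
  memset (a, 0x55, sizeof a);
  memset (b, 0x55, sizeof b);

  (void) stpcpy (a, src);   // return value ignored
  (void) strcpy (b, src);

  return memcmp (a, b, sizeof a) == 0;
}

// Invented helper: distance from DST to the pointer stpcpy returns,
// which is the position of the terminating NUL.
static long
stpcpy_end_offset (char *dst, const char *src)
{
  return stpcpy (dst, src) - dst;
}
```

When the return value is unused, only the first property matters, which
is why the call can be replaced with strcpy; the second helper shows the
one place where the two functions differ.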

Re: Values of WIDE_INT_MAX_ELTS in gcc11 and gcc12 are different

2021-11-10 Thread Qing Zhao via Gcc-patches

Pushed the patch as:

https://gcc.gnu.org/pipermail/gcc-cvs/2021-November/356543.html

Qing


> On Nov 10, 2021, at 2:37 AM, Richard Biener wrote:
> 
> On Tue, Nov 9, 2021 at 6:48 PM Qing Zhao  wrote:
>> 
>> So, based on the discussion so far,  is the following patch good to go?
> 
> OK.
> 
> Thanks,
> Richard.
> 
>> Let me know if you have more comments on the following patch:
>> 
>> (At the same time, I am testing this patch on both x86 and aarch64)
>> 
>> thanks.
>> 
>> Qing
>> 
>> diff --git a/gcc/internal-fn.c b/gcc/internal-fn.c
>> index 0cba95411a6..e8fd16b9c21 100644
>> --- a/gcc/internal-fn.c
>> +++ b/gcc/internal-fn.c
>> @@ -3059,10 +3059,10 @@ expand_DEFERRED_INIT (internal_fn, gcall *stmt)
>>   mark_addressable (lhs);
>>   tree var_addr = build_fold_addr_expr (lhs);
>> 
>> -  tree value = (init_type == AUTO_INIT_PATTERN) ?
>> -   build_int_cst (integer_type_node,
>> -  INIT_PATTERN_VALUE) :
>> -   integer_zero_node;
>> +  tree value = (init_type == AUTO_INIT_PATTERN)
>> +   ? build_int_cst (integer_type_node,
>> +INIT_PATTERN_VALUE)
>> +   : integer_zero_node;
>>   tree m_call = build_call_expr (builtin_decl_implicit (BUILT_IN_MEMSET),
>> 3, var_addr, value, var_size);
>>   /* Expand this memset call.  */
>> @@ -3073,15 +3073,17 @@ expand_DEFERRED_INIT (internal_fn, gcall *stmt)
>>   /* If this variable is in a register use expand_assignment.
>> For boolean scalars force zero-init.  */
>>   tree init;
>> +  scalar_int_mode var_mode;
>>   if (TREE_CODE (TREE_TYPE (lhs)) != BOOLEAN_TYPE
>>  && tree_fits_uhwi_p (var_size)
>>  && (init_type == AUTO_INIT_PATTERN
>>  || !is_gimple_reg_type (var_type))
>>  && int_mode_for_size (tree_to_uhwi (var_size) * BITS_PER_UNIT,
>> -   0).exists ())
>> +   0).exists (&var_mode)
>> + && have_insn_for (SET, var_mode))
>>{
>>  unsigned HOST_WIDE_INT total_bytes = tree_to_uhwi (var_size);
>> - unsigned char *buf = (unsigned char *) xmalloc (total_bytes);
>> + unsigned char *buf = XALLOCAVEC (unsigned char, total_bytes);
>>  memset (buf, (init_type == AUTO_INIT_PATTERN
>>? INIT_PATTERN_VALUE : 0), total_bytes);
>>  tree itype = build_nonstandard_integer_type
>> diff --git a/gcc/testsuite/gcc.target/i386/auto-init-6.c b/gcc/testsuite/gcc.target/i386/auto-init-6.c
>> index 339f8bc2966..e53385f0eb7 100644
>> --- a/gcc/testsuite/gcc.target/i386/auto-init-6.c
>> +++ b/gcc/testsuite/gcc.target/i386/auto-init-6.c
>> @@ -1,4 +1,6 @@
>> /* Verify pattern initialization for complex type automatic variables.  */
>> +/* Note, _Complex long double is initialized to zeroes due to the current
>> +   implementation limitation.  */
>> /* { dg-do compile } */
>> /* { dg-options "-ftrivial-auto-var-init=pattern -march=x86-64 -mtune=generic -msse" } */
>> 
>> @@ -15,6 +17,6 @@ _Complex long double foo()
>>   return result;
>> }
>> 
>> -/* { dg-final { scan-assembler-times "long\t-16843010" 10  { target { ! ia32 } } } } */
>> -/* { dg-final { scan-assembler-times "long\t-16843010" 6  { target { ia32 } } } } */
>> +/* { dg-final { scan-assembler-times "long\t0" 8  { target { ! ia32 } } } } */
>> +/* { dg-final { scan-assembler-times "long\t-16843010" 6  } } */
>> 
>> 
>> 
>> 
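
A note on the constant -16843010 that the test above scans for: it is
what a 32-bit signed integer reads as when every byte is 0xFE.  The
sketch below reproduces that arithmetic outside the compiler.  It
assumes INIT_PATTERN_VALUE is 0xFE (consistent with the constant in the
test) and defines its own local copy, pattern_byte, rather than using
any GCC header; pattern_as_int32 is a name made up for this example.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Local stand-in for GCC's INIT_PATTERN_VALUE; assumed to be 0xFE.
static const unsigned char pattern_byte = 0xFE;

// Fill a 4-byte buffer with the pattern byte and reinterpret it as a
// signed 32-bit integer.  All bytes are equal, so the result does not
// depend on endianness.
static int32_t
pattern_as_int32 ()
{
  unsigned char buf[sizeof (int32_t)];
  std::memset (buf, pattern_byte, sizeof buf);

  int32_t v;
  std::memcpy (&v, buf, sizeof v);
  return v;
}
```

0xFEFEFEFE is 4278124286 unsigned; subtracting 2^32 gives -16843010,
which is the value the ".long" directives carry in the assembler output.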
>>> On Nov 9, 2021, at 4:44 AM, Richard Biener wrote:
>>> 
>>> On Tue, Nov 9, 2021 at 10:10 AM Jakub Jelinek  wrote:
 
>>>> On Tue, Nov 09, 2021 at 08:13:57AM +0100, Richard Biener wrote:
>> Hi, I tried both the following patches:
>> 
>> Patch1:
>> 
>> [opc@qinzhao-ol8u3-x86 gcc]$ git diff
>> diff --git a/gcc/internal-fn.c b/gcc/internal-fn.c
>> index 0cba95411a6..ca49d2b4514 100644
>> --- a/gcc/internal-fn.c
>> +++ b/gcc/internal-fn.c
>> @@ -3073,12 +3073,14 @@ expand_DEFERRED_INIT (internal_fn, gcall *stmt)
>>  /* If this variable is in a register use expand_assignment.
>>For boolean scalars force zero-init.  */
>>  tree init;
>> +  scalar_int_mode var_mode;
>>  if (TREE_CODE (TREE_TYPE (lhs)) != BOOLEAN_TYPE
>> && tree_fits_uhwi_p (var_size)
>> && (init_type == AUTO_INIT_PATTERN
>> || !is_gimple_reg_type (var_type))
>> && int_mode_for_size (tree_to_uhwi (var_size) * BITS_PER_UNIT,
>> -   0).exists ())
>> +   0).exists (&var_mode)
>> + && targetm.scalar_mode_supported_p (var_mode))
>>   {
>> unsigned HOST_WIDE_INT total_bytes = tree_to_uhwi (var_size);
>> unsigned char *buf = (unsigned char *) xmalloc (total_bytes);
>> 
>> AND
>> 
>> Patch2:
>

[PATCH] Allow loop header copying when first iteration condition is known.

2021-11-10 Thread Aldy Hernandez via Gcc-patches

As discussed in the PR, the loop header copying pass avoids doing so
when optimizing for size.  However, sometimes we can determine the
loop entry conditional statically for the first iteration of the loop.

This patch uses the path solver to determine the outgoing edge
out of preheader->header->xx.  If that edge can be determined
statically, it allows header copying.  Doing this in the loop
optimizer saves us from doing gymnastics in the threader, which
doesn't have the context to determine if a loop transformation is
profitable.

I am only returning true in entry_loop_condition_is_static for
a true conditional.  Technically a false conditional is also
provably static, but allowing any boolean value causes a regression
in gfortran.dg/vector_subscript_1.f90.

I would have preferred not passing around the query object, but the
layout of pass_ch and should_duplicate_loop_header_p makes it a bit
awkward to get it right without an outright refactor of the
pass.
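
As a source-level illustration of why a statically known entry condition
makes header copying cheap, here is a rough sketch modelled on the new
test.  It is hand-written C++ and not compiler output; foo is given a
trivial made-up body so the example is self-contained, and the function
names sum_guarded and sum_rotated are invented.  In sum_guarded the
guard "n > 0" already implies the loop's first "x < n" test, so the
copied header test folds away and the loop can take the do-while shape
of sum_rotated without an extra comparison.

```cpp
#include <cassert>

// Made-up body for the pure function declared in the test.
static unsigned int
foo (const int *p)
{
  return (unsigned int) *p;
}

// Original shape: the guard implies the loop condition holds on entry.
static unsigned int
sum_guarded (const int array[], int n)
{
  unsigned int sum = 0;
  if (n > 0)
    for (int x = 0; x < n; x++)
      sum += foo (&array[x]);
  return sum;
}

// Shape after header copying: the entry test is known true under the
// guard, so only the bottom-of-loop test remains.
static unsigned int
sum_rotated (const int array[], int n)
{
  unsigned int sum = 0;
  if (n > 0)
    {
      int x = 0;
      do
        {
          sum += foo (&array[x]);
          x++;
        }
      while (x < n);
    }
  return sum;
}
```

Both functions compute the same result for every n, including n <= 0,
where neither loop body runs.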

Tested on x86-64 Linux.

OK?

gcc/ChangeLog:

PR tree-optimization/102906
* tree-ssa-loop-ch.c (entry_loop_condition_is_static): New.
(should_duplicate_loop_header_p): Call entry_loop_condition_is_static.
(class ch_base): Add m_ranger and m_query.
(ch_base::copy_headers): Pass m_query to
entry_loop_condition_is_static.
(pass_ch::execute): Allocate and deallocate m_ranger and
m_query.
(pass_ch_vect::execute): Same.

gcc/testsuite/ChangeLog:

* gcc.dg/tree-ssa/pr102906.c: New test.
---
 gcc/testsuite/gcc.dg/tree-ssa/pr102906.c | 17 
 gcc/tree-ssa-loop-ch.c   | 51 
 2 files changed, 60 insertions(+), 8 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/pr102906.c

diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr102906.c b/gcc/testsuite/gcc.dg/tree-ssa/pr102906.c
new file mode 100644
index 000..1846f0b6dba
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr102906.c
@@ -0,0 +1,17 @@
+// { dg-do compile }
+// { dg-options "-Os -fdump-tree-ch-details" }
+
+extern unsigned int foo (int*) __attribute__((pure));
+
+unsigned int
+tr2 (int array[], int n)
+{
+  unsigned int sum = 0;
+  int x;
+  if (n > 0)
+for (x = 0; x < n; x++)
+  sum += foo (&array[x]);
+  return sum;
+}
+
+// { dg-final { scan-tree-dump-not "Not duplicating.*optimizing for size" "ch2" } }
diff --git a/gcc/tree-ssa-loop-ch.c b/gcc/tree-ssa-loop-ch.c
index ffb0aa85118..c7d86d751d4 100644
--- a/gcc/tree-ssa-loop-ch.c
+++ b/gcc/tree-ssa-loop-ch.c
@@ -35,30 +35,52 @@ along with GCC; see the file COPYING3.  If not see
 #include "tree-ssa-sccvn.h"
 #include "tree-phinodes.h"
 #include "ssa-iterators.h"
+#include "value-range.h"
+#include "gimple-range.h"
+#include "gimple-range-path.h"
 
 /* Duplicates headers of loops if they are small enough, so that the statements
in the loop body are always executed when the loop is entered.  This
increases effectiveness of code motion optimizations, and reduces the need
for loop preconditioning.  */
 
+/* Return true if the condition on the first iteration of the loop can
+   be statically determined.  */
+
+static bool
+entry_loop_condition_is_static (class loop *l, path_range_query *query)
+{
+  edge e = loop_preheader_edge (l);
+  gcond *last = safe_dyn_cast <gcond *> (last_stmt (e->dest));
+
+  if (!last
+  || !irange::supports_type_p (TREE_TYPE (gimple_cond_lhs (last))))
+return false;
+
+  int_range<2> r;
+  query->compute_ranges (e);
+  query->range_of_stmt (r, last);
+  return r == int_range<2> (boolean_true_node, boolean_true_node);
+}
+
 /* Check whether we should duplicate HEADER of LOOP.  At most *LIMIT
instructions should be duplicated, limit is decreased by the actual
amount.  */
 
 static bool
 should_duplicate_loop_header_p (basic_block header, class loop *loop,
-   int *limit)
+   int *limit, path_range_query *query)
 {
   gimple_stmt_iterator bsi;
 
   gcc_assert (!header->aux);
 
-  /* Loop header copying usually increases size of the code.  This used not to
- be true, since quite often it is possible to verify that the condition is
- satisfied in the first iteration and therefore to eliminate it.  Jump
- threading handles these cases now.  */
+  /* Avoid loop header copying when optimizing for size unless we can
+ determine that the loop condition is static in the first
+ iteration.  */
   if (optimize_loop_for_size_p (loop)
-  && !loop->force_vectorize)
+  && !loop->force_vectorize
+  && !entry_loop_condition_is_static (loop, query))
 {
   if (dump_file && (dump_flags & TDF_DETAILS))
fprintf (dump_file,
@@ -267,6 +289,9 @@ class ch_base : public gimple_opt_pass
 
   /* Return true to copy headers of LOOP or false to skip.  */
   virtual bool process_loop_p (class loop *loop) = 0;
+
+  gimple_ranger *m_ranger = NULL;
+  path_range_query *m_query = NULL;
 };
 
 const pass_data pass_data_ch =
@@ -389,7 +414

Re: [PATCH][committed]middle-end: Fix signbit tests when ran on ISA with support for masks.

2021-11-10 Thread Sandra Loosemore


On 11/10/21 4:54 AM, Tamar Christina via Gcc-patches wrote:

Hi All,

These tests don't work on vector ISAs where the truth
type doesn't match the vector mode of the operation.

However I still want the tests to run on these
architectures but just turn off the ISA modes that
enable masks.

This thus turns off SVE if it's on and turns off
AVX512 if it's on.

Regtested on aarch64-none-linux-gnu with SVE on,
and x86_64-pc-linux-gnu with AVX512 on and no
issues.

Committed under the obvious rule.

Thanks,
Tamar

gcc/testsuite/ChangeLog:

* gcc.dg/signbit-2.c: Turn off masks.
* gcc.dg/signbit-5.c: Likewise.


I'm seeing this failure on nios2-elf:

FAIL: gcc.dg/signbit-2.c scan-tree-dump-times optimized "\\s+>\\s+{ 0, 0, 0, 0 }" 1


I don't understand what it is expecting to happen here.  Should it be 
skipped on this target, or restricted to ARM and x86?  Adding some 
comments to the testcase to explain the significance of that pattern 
might be useful too.


-Sandra

Re: [PATCH][committed]middle-end: Fix signbit tests when ran on ISA with support for masks.

2021-11-10 Thread Tamar Christina via Gcc-patches

FAIL: gcc.dg/signbit-2.c scan-tree-dump-times optimized "\\s+>\\s+{ 0, 0, 0, 0 }" 1

That's the old test which this patch has changed. Does it still fail with the 
new patch?

From: Sandra Loosemore 
Sent: Wednesday, November 10, 2021 6:37 PM
To: Tamar Christina ; gcc-patches@gcc.gnu.org 

Cc: nd ; rguent...@suse.de 
Subject: Re: [PATCH][committed]middle-end: Fix signbit tests when ran on ISA 
with support for masks.

On 11/10/21 4:54 AM, Tamar Christina via Gcc-patches wrote:
> Hi All,
>
> These tests don't work on vector ISAs where the truth
> type doesn't match the vector mode of the operation.
>
> However I still want the tests to run on these
> architectures but just turn off the ISA modes that
> enable masks.
>
> This thus turns off SVE if it's on and turns off
> AVX512 if it's on.
>
> Regtested on aarch64-none-linux-gnu with SVE on,
> and x86_64-pc-linux-gnu with AVX512 on and no
> issues.
>
> Committed under the obvious rule.
>
> Thanks,
> Tamar
>
> gcc/testsuite/ChangeLog:
>
>* gcc.dg/signbit-2.c: Turn off masks.
>* gcc.dg/signbit-5.c: Likewise.

I'm seeing this failure on nios2-elf:

FAIL: gcc.dg/signbit-2.c scan-tree-dump-times optimized "\\s+>\\s+{ 0, 0, 0, 0 }" 1

I don't understand what it is expecting to happen here.  Should it be
skipped on this target, or restricted to ARM and x86?  Adding some
comments to the testcase to explain the significance of that pattern
might be useful too.

-Sandra
