date:20211130

On Mon, 29 Nov 2021, Martin Sebor wrote:

> On 11/26/21 5:18 AM, Richard Biener via Gcc-patches wrote:
> > This implements a subset of -Wunreachable-code, unreachable code
> > after a return stmt.  Contrary to the previous attempt at CFG
> > construction time this implements the bits during GIMPLE lowering
> > where there are still all GIMPLE return stmts in the IL.
> > 
> > The lowering phase keeps track of whether stmts can fallthru
> > which is used to determine if the following stmt is reachable.
> > The implementation only considers labels here.
> > 
> > The fallthru flag is transparently extended to allow tracking
> > a reason for non-fallthruness which is used to mark returns.
> > 
> > This patch runs into the same stray return/gcc_unreachable as the
> > previous one and thus requires cleanup across the GCC code base
> > which seems controversial.  So I'm putting this on hold unless
> > I receive some OK for cleanup in any way, meaning this isn't
> > going to make stage3.
> 
> This isn't meant as an objection to the patch per se, just as
> data points suggesting there's room for improvement.  I do think
> at least some of those should be considered for GCC 12 if the patch
> goes in.  I see just one trivial test which seems a bit light.
> I would recommend beefing it up to exercise some of the cases below.

definitely

> I tested the patch with Glibc (no warnings) and Binutils/GDB.
> The latter shows over 900 warnings (600 unique ones) in 46
> files, so it might be a useful test bed.  Lots of those, maybe
> most, are for a break after a return, suggesting that it might
> be worthwhile to treat such innocuous cases specially (e.g., only
> warn for a break after a return at level 2).

I fixed one bug only after submission: a failure to skip debug stmts,
which caused quite a few false positives with -g -On.
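
For illustration, the "break after a return" pattern mentioned above looks
roughly like the following.  This is a minimal sketch with a made-up
function (weight is hypothetical, not taken from Binutils/GDB); each break
is unreachable but harmless, which is why treating it specially was
suggested.

```cpp
#include <cassert>

// Hypothetical example, not taken from Binutils/GDB.
int weight (int kind)
{
  switch (kind)
    {
    case 0:
      return 10;
      break;   // unreachable: follows a return
    case 1:
      return 20;
      break;   // unreachable: follows a return
    default:
      return 0;
    }
}

// The dead breaks do not change behaviour.
void check_weight ()
{
  assert (weight (0) == 10);
  assert (weight (1) == 20);
  assert (weight (7) == 0);
}
```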

> Some other instances suggest other possible improvements.
> For example:
> 
> /src/binutils-gdb/libiberty/lrealpath.c: In function ‘lrealpath’:
> /src/binutils-gdb/libiberty/lrealpath.c:113:3: warning: statement after return
> is not reachable [-Wunreachable-code-return]
>   113 |   {
>   |   ^
> /src/binutils-gdb/libiberty/lrealpath.c:115:5: warning: statement after return
> is not reachable [-Wunreachable-code-return]
>   115 | long path_max = pathconf ("/", _PC_PATH_MAX);
>   | ^~~~
> /src/binutils-gdb/libiberty/lrealpath.c:115:21: warning: statement after
> return is not reachable [-Wunreachable-code-return]
>   115 | long path_max = pathconf ("/", _PC_PATH_MAX);
>   | ^~~~
> 
> I think one of them is a true positive but the others all look
> like noise.  None of the locations is very useful.  It might be
> something to look into.  I would suggest to point the warning
> to the first unreachable statement (other instances point to
> it already) and add a note pointing to the statement that makes
> the former unreachable.
> 
> Another example below shows that the warning triggers more than
> once for the same statement, suggesting it's missing some
> suppression (e.g., a call to suppress_warning()).
> 
> /src/binutils-gdb/bfd/bfdio.c:167:3: warning: statement after return is not
> reachable [-Wunreachable-code-return]
>   167 |   return close_on_exec (fopen (filename, modes));
>   |   ^~
> /src/binutils-gdb/bfd/bfdio.c:167:10: warning: statement after return is not
> reachable [-Wunreachable-code-return]
>   167 |   return close_on_exec (fopen (filename, modes));
>   |  ^~~
> 
> There are some other "unusual" cases worth a look, such as missing
> context of any kind except for line and column:
> 
> elfnn-riscv.c:3346:7: warning: statement after return is not reachable
> [-Wunreachable-code-return]
> elfnn-riscv.c:3349:7: warning: statement after return is not reachable
> [-Wunreachable-code-return]
> elfnn-riscv.c:3352:7: warning: statement after return is not reachable
> [-Wunreachable-code-return]
> elfnn-riscv.c:3355:7: warning: statement after return is not reachable
> [-Wunreachable-code-return]

Yeah, the patch was meant as explorative prototype (and a way to audit
GCC itself), I didn't pay much attention to things like above, I've
also not yet attempted to record the actual stmt causing the warning.

> I also tried a few test cases of my own that might be worth
> handling at some point (not necessarily in the first iteration).
> 
> struct __jmp_buf_tag { };
> typedef struct __jmp_buf_tag jmp_buf[1];
> 
> void f ();
> 
> void test_return ()
> {
>   return;
>   f ();   // warning here (good)
> }
> 
> extern __attribute__ ((noreturn)) void fnoret ();
> 
> void test_noreturn ()
> {
>   fnoret ();
>   f ();   // missing warning
> }
> 
> void test_throw ()
> {
>   throw "";
>   f ();   // missing warning
> }
> 
> jmp_buf jmpbuf;
> 
> void test_longjmp ()
> {
>   __builtin_longjmp (jmpbuf, 1);
>   f ();   // missing warning
> }

All of the last missing cases above would be -Wu

[PATCH] middle-end/103485 - fix conversion kind for vectors

This makes sure to use a VIEW_CONVERT_EXPR for converting
vector signedness in the -((int)x >> (prec - 1)) to (unsigned)x >> (prec - 1)
simplification.

Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed.
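
As a scalar illustration of the simplification named above, the sketch
below checks that both forms extract the sign bit.  The function names are
made up for this example, and it assumes >> on a negative int is an
arithmetic shift (guaranteed since C++20, and GCC's behaviour before that).
It says nothing about the vector case, where the patch switches to
VIEW_CONVERT_EXPR.

```cpp
#include <cassert>
#include <climits>

// Left-hand side of the rule: -((int)x >> (prec - 1)).
// x >> (prec - 1) is 0 or -1, so the negation cannot overflow.
unsigned sign_bit_via_negate (int x)
{
  const int prec = sizeof (int) * CHAR_BIT;
  return static_cast<unsigned> (-(x >> (prec - 1)));
}

// Right-hand side of the rule: (unsigned)x >> (prec - 1).
unsigned sign_bit_via_unsigned_shift (int x)
{
  const int prec = sizeof (int) * CHAR_BIT;
  return static_cast<unsigned> (x) >> (prec - 1);
}

// Both forms agree on a few boundary values.
void check_sign_bit ()
{
  const int samples[] = { INT_MIN, -1, 0, 1, INT_MAX };
  for (int x : samples)
    assert (sign_bit_via_negate (x) == sign_bit_via_unsigned_shift (x));
}
```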

2021-11-30  Richard Biener  

PR middle-end/103485
* match.pd (-((int)x >> (prec - 1)) to (unsigned)x >> (prec - 1)):
Use VIEW_CONVERT_EXPR for vectors.

* gcc.dg/pr103485.c: New testcase.
---
 gcc/match.pd|  4 +++-
 gcc/testsuite/gcc.dg/pr103485.c | 10 ++
 2 files changed, 13 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.dg/pr103485.c

diff --git a/gcc/match.pd b/gcc/match.pd
index e14f97ee1cd..d467a1c4e45 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -1611,7 +1611,9 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
   (with { tree stype = TREE_TYPE (@0);
  tree ntype = TYPE_UNSIGNED (stype) ? signed_type_for (stype)
 : unsigned_type_for (stype); }
-   (convert (rshift:ntype (convert:ntype @0) @1)
+   (if (VECTOR_TYPE_P (type))
+(view_convert (rshift (view_convert:ntype @0) @1))
+(convert (rshift (convert:ntype @0) @1))
 
 /* Try to fold (type) X op CST -> (type) (X op ((type-x) CST))
when profitable.
diff --git a/gcc/testsuite/gcc.dg/pr103485.c b/gcc/testsuite/gcc.dg/pr103485.c
new file mode 100644
index 000..1afa9286924
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr103485.c
@@ -0,0 +1,10 @@
+/* { dg-do compile } */
+
+int foo_v256u128_0;
+unsigned __attribute__((__vector_size__ (sizeof(unsigned) * 8))) foo_v256u8_0;
+
+void
+foo (void)
+{
+  foo_v256u8_0 -= (foo_v256u8_0 >> sizeof (foo_v256u8_0) - 1) + foo_v256u128_0;
+}
-- 
2.31.1

Re: [PATCH] middle-end: Skip initialization of opaque type register variables [PR103127]

On Mon, Nov 29, 2021 at 11:56 PM Qing Zhao  wrote:
>
> Peter,
>
> Thanks a lot for the patch.
>
> Richard, what do you think of the patch?
>
> (The major concern for me is:
>
> With the current patch proposed by Peter, we will generate the call 
> to .DEFERRED_INIT for a variable with OPAQUE_TYPE during gimplification phase,
>  However, if this variable is in register, then the call to 
> .DEFERRED_INIT will NOT be expanded during RTL expansion phase.  This 
> unexpanded call to .DEFERRED_INIT might cause some potential IR issue later?

I think that's inconsistent indeed.  Peter, what are "opaque" registers?
rs6000-modes.def suggests that there are __vector_pair and __vector_quad;
what are the GIMPLE types for those?  It seems they are either SSA names or
expanded to pseudo registers, but there are no constants for them.

>
>  If the above is a real issue, should we skip initialization for all 
> OPAQUE_TYPE variables even when they are in memory and can be initialized 
> with memset?
> then we should update “is_var_need_auto_init” in gimplify.c 
> instead.   However, the issue with this approach is, we might miss the 
> opportunity to initialize an OPAQUE_TYPE variable if it will be in memory?
> ).

I think we need to bite the bullet at some point to do register initialization
not via expand_assignment but directly based on what the LHS expands to.

Can they be initialized?  I see they can be copied at least.

If such "things" cannot be initialized they should indeed be exempt from
auto-init.  The documentation suggests that they act as bit-buckets, but
even bit-buckets should be initializable, so why exactly does CONST0_RTX
not exist for them?

Richard.


>
> Thanks.
>
> Qing
>
>
> > On Nov 29, 2021, at 3:56 PM, Peter Bergner  wrote:
> >
> > Sorry for dropping the ball on testing the patch from the bugzilla!
> >
> > The following patch fixes the ICE reported in the bugzilla on the 
> > pre-existing
> > gcc testsuite test case, bootstraps and shows no testsuite regressions
> > on powerpc64le-linux.  Ok for trunk?
> >
> > Peter
> >
> >
> > For -ftrivial-auto-var-init=*, skip initializing the register variable if it
> > is an opaque type, because CONST0_RTX(mode) is not defined for opaque modes.
> >
> > gcc/
> >   PR middle-end/103127
> >   * internal-fn.c (expand_DEFERRED_INIT): Skip if VAR_TYPE is opaque.
> >
> > diff --git a/gcc/internal-fn.c b/gcc/internal-fn.c
> > index 0cba95411a6..7cc0e9d5293 100644
> > --- a/gcc/internal-fn.c
> > +++ b/gcc/internal-fn.c
> > @@ -3070,6 +3070,10 @@ expand_DEFERRED_INIT (internal_fn, gcall *stmt)
> > }
> >   else
> > {
> > +  /* Skip variables of opaque types that are in a register.  */
> > +  if (OPAQUE_TYPE_P (var_type))
> > + return;
> > +
> >   /* If this variable is in a register use expand_assignment.
> >For boolean scalars force zero-init.  */
> >   tree init;
>

Re: [PATCH] libcpp: Implement -Wbidi-chars for CVE-2021-42574 [PR103026]

2021-11-30 Thread Stephan Bergmann via Gcc-patches


On 15/11/2021 18:28, Marek Polacek via Gcc-patches wrote:

On Mon, Nov 08, 2021 at 04:33:43PM -0500, Marek Polacek wrote:

Ping, can we conclude on the name?   IMHO, -Wbidirectional is just fine,
but changing the name is a trivial operation.


Here's a patch with a better name (suggested by Jonathan W.).  Otherwise no
changes.

Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?

-- >8 --
 From a link below:
"An issue was discovered in the Bidirectional Algorithm in the Unicode
Specification through 14.0. It permits the visual reordering of
characters via control sequences, which can be used to craft source code
that renders different logic than the logical ordering of tokens
ingested by compilers and interpreters. Adversaries can leverage this to
encode source code for compilers accepting Unicode such that targeted
vulnerabilities are introduced invisibly to human reviewers."

More info:
https://nvd.nist.gov/vuln/detail/CVE-2021-42574
https://trojansource.codes/

This is not a compiler bug.  However, to mitigate the problem, this patch
implements -Wbidi-chars=[none|unpaired|any] to warn about possibly
misleading Unicode bidirectional characters the preprocessor may encounter.

The default is =unpaired, which warns about improperly terminated
bidirectional characters; e.g. a LRE without its appertaining PDF.  The
level =any warns about any use of bidirectional characters.
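
To make the "unpaired" notion concrete, here is a simplified sketch of such
a check.  It is not libcpp's implementation: the names (bidi, classify,
has_unpaired_bidi) are invented for this example, it assumes well-formed
UTF-8, ignores UCNs, and treats the whole string as a single context.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

enum class bidi { none, embed, isolate, pdf, pdi };

bidi classify (char32_t c)
{
  switch (c)
    {
    case 0x202A: case 0x202B:               // LRE, RLE
    case 0x202D: case 0x202E:               // LRO, RLO
      return bidi::embed;                   // closed by PDF
    case 0x2066: case 0x2067: case 0x2068:  // LRI, RLI, FSI
      return bidi::isolate;                 // closed by PDI
    case 0x202C: return bidi::pdf;
    case 0x2069: return bidi::pdi;
    default:     return bidi::none;
    }
}

// Return true if an embedding, override or isolate is still open at the end.
bool has_unpaired_bidi (const std::string &utf8)
{
  std::vector<bidi> open;
  for (std::size_t i = 0; i < utf8.size (); )
    {
      unsigned char b0 = utf8[i];
      // All controls above are three-byte sequences; skip anything else.
      if ((b0 & 0xF0) != 0xE0 || i + 2 >= utf8.size ())
        {
          ++i;
          continue;
        }
      unsigned char b1 = utf8[i + 1], b2 = utf8[i + 2];
      char32_t cp = ((b0 & 0x0Fu) << 12) | ((b1 & 0x3Fu) << 6) | (b2 & 0x3Fu);
      i += 3;
      bidi k = classify (cp);
      switch (k)
        {
        case bidi::embed:
        case bidi::isolate:
          open.push_back (k);
          break;
        case bidi::pdf:
          if (!open.empty () && open.back () == bidi::embed)
            open.pop_back ();
          break;
        case bidi::pdi:
          // PDI closes the innermost open isolate and anything nested in it.
          for (std::size_t j = open.size (); j-- > 0; )
            if (open[j] == bidi::isolate)
              {
                open.resize (j);
                break;
              }
          break;
        case bidi::none:
          break;
        }
    }
  return !open.empty ();
}

void check_bidi ()
{
  const std::string lro = "\xE2\x80\xAD", pdf = "\xE2\x80\xAC";
  assert (has_unpaired_bidi (lro + "x"));
  assert (!has_unpaired_bidi (lro + "x" + pdf));
}
```

The stack is what handles nesting: a PDF only closes the innermost
embedding or override, while a PDI closes its isolate together with
anything opened inside it.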

This patch handles both UCNs and UTF-8 characters.  UCNs designating
bidi characters in identifiers are accepted since r204886.  Then r217144
enabled -fextended-identifiers by default.  Extended characters in C/C++
identifiers have been accepted since r275979.  However, this patch still
warns about mixing UTF-8 and UCN bidi characters; there seems to be no
good reason to allow mixing them.


I wonder what the rationale is to warn about UCNs, like in


aText = u"\u202D" + aText;


(as found in the LibreOffice source code).


We warn in different contexts: comments (both C and C++-style), string
literals, character constants, and identifiers.  Expectedly, UCNs are ignored
in comments and raw string literals.  The bidirectional characters can nest
so this patch handles that as well.

I have not included nor tested this at all with Fortran (which also has
string literals and line comments).

Dave M. posted patches improving diagnostic involving Unicode characters.
This patch does not make use of this new infrastructure yet.

PR preprocessor/103026

[...]

[PATCH] Modify combine pattern by anding a pseudo with its nonzero bits

2021-11-30 Thread HAO CHEN GUI via Gcc-patches

Hi,

    This patch adds a helper, change_pseudo_and_mask, which modifies the
combine pattern when recog fails.  The helper converts a bare pseudo into
that pseudo ANDed with a mask (its nonzero bits) if the outer operator is
IOR/XOR/PLUS and the inner operator is ASHIFT/LSHIFTRT/AND.  The conversion
helps match shift + ior patterns.

    Bootstrapped and tested on powerpc64-linux BE and LE with no regressions. 
Is this okay for trunk? Any recommendations? Thanks a lot.
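
The rewrite described above is value-preserving because ANDing a value with
a mask that covers all of its possibly-nonzero bits changes nothing.  A
minimal sketch of that arithmetic on plain integers follows; the function
names are made up, and the nonzero parameter merely stands in for what
nonzero_bits would report.

```cpp
#include <cassert>
#include <cstdint>

// Original shape: (x << n) | y, with y a bare value.
std::uint32_t insert_plain (std::uint32_t x, unsigned n, std::uint32_t y)
{
  return (x << n) | y;
}

// Rewritten shape: (x << n) | (y & nonzero).  This equals the original
// whenever every bit of y outside NONZERO is known to be zero.
std::uint32_t insert_masked (std::uint32_t x, unsigned n, std::uint32_t y,
                             std::uint32_t nonzero)
{
  return (x << n) | (y & nonzero);
}

void check_insert ()
{
  // y only has bits within the low 12 bits, so the mask is 0xFFF.
  const std::uint32_t x = 0x12345u, y = 0xABCu, nonzero = 0xFFFu;
  assert (insert_plain (x, 12, y) == insert_masked (x, 12, y, nonzero));
}
```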

ChangeLog

2021-11-30 Haochen Gui 

gcc/
    * combine.c (change_pseudo_and_mask): New.
    (recog_for_combine): If recog fails, try again with the pattern
    modified by change_pseudo_and_mask.

gcc/testsuite/
    * gcc.target/powerpc/20050603-3.c: Modify the dump check conditions.
    * gcc.target/powerpc/rlwimi-2.c: Likewise.

patch.diff

diff --git a/gcc/combine.c b/gcc/combine.c
index 03e9a780919..c83c0aceb57 100644
--- a/gcc/combine.c
+++ b/gcc/combine.c
@@ -11539,6 +11539,42 @@ change_zero_ext (rtx pat)
   return changed;
 }

+/* When the outer code of set_src is IOR/XOR/PLUS and the inner code is
+   ASHIFT/LSHIFTRT/AND, convert a pseudo to pseudo AND with a mask if its
+   nonzero_bits is less than its mode mask.  */
+static bool
+change_pseudo_and_mask (rtx pat)
+{
+  bool changed = false;
+
+  rtx src = SET_SRC (pat);
+  if ((GET_CODE (src) == IOR
+   || GET_CODE (src) == XOR
+   || GET_CODE (src) == PLUS)
+  && (((GET_CODE (XEXP (src, 0)) == ASHIFT
+   || GET_CODE (XEXP (src, 0)) == LSHIFTRT
+   || GET_CODE (XEXP (src, 0)) == AND)
+  && REG_P (XEXP (src, 1)))
+ || ((GET_CODE (XEXP (src, 1)) == ASHIFT
+  || GET_CODE (XEXP (src, 1)) == LSHIFTRT
+  || GET_CODE (XEXP (src, 1)) == AND)
+ && REG_P (XEXP (src, 0)))))
+    {
+  rtx *reg = REG_P (XEXP (src, 0))
+    ? &XEXP (SET_SRC (pat), 0)
+    : &XEXP (SET_SRC (pat), 1);
+  machine_mode mode = GET_MODE (*reg);
+  unsigned HOST_WIDE_INT nonzero = nonzero_bits (*reg, mode);
+  if (nonzero < GET_MODE_MASK (mode))
+   {
+ rtx x = gen_rtx_AND (mode, *reg, GEN_INT (nonzero));
+ SUBST (*reg, x);
+ changed = true;
+   }
+ }
+  return changed;
+}
+
 /* Like recog, but we receive the address of a pointer to a new pattern.
    We try to match the rtx that the pointer points to.
    If that fails, we may try to modify or replace the pattern,
@@ -11586,7 +11622,14 @@ recog_for_combine (rtx *pnewpat, rtx_insn *insn, rtx *pnotes)
    }
    }
   else
-   changed = change_zero_ext (pat);
+   {
+ if (change_pseudo_and_mask (pat))
+   {
+ maybe_swap_commutative_operands (SET_SRC (pat));
+ changed = true;
+   }
+ changed |= change_zero_ext (pat);
+   }
 }
   else if (GET_CODE (pat) == PARALLEL)
 {
diff --git a/gcc/testsuite/gcc.target/powerpc/20050603-3.c b/gcc/testsuite/gcc.target/powerpc/20050603-3.c
index 4017d34f429..e628be11532 100644
--- a/gcc/testsuite/gcc.target/powerpc/20050603-3.c
+++ b/gcc/testsuite/gcc.target/powerpc/20050603-3.c
@@ -12,7 +12,7 @@ void rotins (unsigned int x)
   b.y = (x<<12) | (x>>20);
 }

-/* { dg-final { scan-assembler-not {\mrlwinm} } } */
+/* { dg-final { scan-assembler-not {\mrlwinm} { target ilp32 } } } */
 /* { dg-final { scan-assembler-not {\mrldic} } } */
 /* { dg-final { scan-assembler-not {\mrot[lr]} } } */
 /* { dg-final { scan-assembler-not {\ms[lr][wd]} } } */
diff --git a/gcc/testsuite/gcc.target/powerpc/rlwimi-2.c b/gcc/testsuite/gcc.target/powerpc/rlwimi-2.c
index bafa371db73..ffb5f9e450f 100644
--- a/gcc/testsuite/gcc.target/powerpc/rlwimi-2.c
+++ b/gcc/testsuite/gcc.target/powerpc/rlwimi-2.c
@@ -2,14 +2,14 @@
 /* { dg-options "-O2" } */

 /* { dg-final { scan-assembler-times {(?n)^\s+[a-z]} 14121 { target ilp32 } } } */
-/* { dg-final { scan-assembler-times {(?n)^\s+[a-z]} 20217 { target lp64 } } } */
+/* { dg-final { scan-assembler-times {(?n)^\s+[a-z]} 21279 { target lp64 } } } */
 /* { dg-final { scan-assembler-times {(?n)^\s+blr} 6750 } } */
 /* { dg-final { scan-assembler-times {(?n)^\s+mr} 643 { target ilp32 } } } */
 /* { dg-final { scan-assembler-times {(?n)^\s+mr} 11 { target lp64 } } } */
 /* { dg-final { scan-assembler-times {(?n)^\s+rldicl} 7790 { target lp64 } } } */

 /* { dg-final { scan-assembler-times {(?n)^\s+rlwimi} 1692 { target ilp32 } } } */
-/* { dg-final { scan-assembler-times {(?n)^\s+rlwimi} 1666 { target lp64 } } } */
+/* { dg-final { scan-assembler-times {(?n)^\s+rlwimi} 1692 { target lp64 } } } */

 /* { dg-final { scan-assembler-times {(?n)^\s+mulli} 5036 } } */

Re: [PATCH v2] combine: Tweak the condition of last_set invalidation

2021-11-30 Thread Kewen.Lin via Gcc-patches

Hi Segher,

Thanks for the review!

on 2021/11/30 上午6:28, Segher Boessenkool wrote:
> Hi!
> 
> On Fri, Jun 11, 2021 at 09:16:21PM +0800, Kewen.Lin wrote:
>>>> +/* Should pick up the lowest luid if the references
>>>> +   are in the same block.  */
>>>> +if (label_tick == rsp->last_set_table_tick
>>>> +&& rsp->last_set_table_luid > insn_luid)
>>>> +  rsp->last_set_table_luid = insn_luid;
>>>
>>> Why?  Is it conservative for the check you will do later?  Please spell
>>> this out, it is crucial!
>>
>> Since later the combinations involving this insn probably make the
>> register be used in one insn sitting ahead (which has smaller luid than
>> the one which was recorded before).  Yes, it's very conservative, this
>> ensure that we always use the luid of the insn which is the first insn
>> using this register in the block.
> 
> Why would that be correct?!
> 

The later check has:

   if (!insn
-  || (value && rsp->last_set_table_tick >= label_tick_ebb_start))
+  || (value && rsp->last_set_table_tick >= label_tick_ebb_start
+  && !(label_tick == rsp->last_set_table_tick
+   && DF_INSN_LUID (insn) < rsp->last_set_table_luid)))
 rsp->last_set_invalid = 1;

For "label_tick != rsp->last_set_table_tick", it's the same as before.

For "label_tick == rsp->last_set_table_tick", we have the below:

+  if (label_tick == rsp->last_set_table_tick
+  && rsp->last_set_table_luid > insn_luid)
+rsp->last_set_table_luid = insn_luid;

It keeps updating last_set_table_luid to the smallest LUID among the insns
whose expressions involving register n are placed in last_set_value.

The update ensures that we always use the LUID of the first insn which
uses register n (that is, the first insn for which an expression involving
register n is placed in last_set_value).

The first time we set last_set_table_tick for register n, we also set
last_set_table_luid.  In the case below, we record x as the LUID.  Assume
we combine 1,2,x into 2,x and regX is now used in insn 2.  Then the first
insn using regX has become insn 2.

  ... reg1 // insn 1
  ...
  ... reg2 // insn 2
  ...
  ... regX // insn x
  ...
  regX // insn y
  ...

Later whether combining moves regX setting upward or not, the LUID which it
compares with is always the updated smallest one (insn 2 here), not the one
which is set at the beginning.  So I think it's conservative.
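
To make the argument easier to follow, here is a toy model of the update
and the later check.  It is only a sketch: reg_stat_model, note_use and
set_is_invalid are invented names, the "first use in a new block" branch is
an assumption of this example, and the !insn test is left out (insn is
assumed to be non-null).

```cpp
#include <cassert>

// Toy model of the two fields discussed above; not GCC's reg_stat_type.
struct reg_stat_model
{
  int last_set_table_tick = -1;
  int last_set_table_luid = 0;
};

// Note a use of the register by an insn with luid INSN_LUID in the block
// identified by LABEL_TICK.  Within one block, keep the smallest luid.
void note_use (reg_stat_model &rsp, int label_tick, int insn_luid)
{
  if (label_tick == rsp.last_set_table_tick)
    {
      if (rsp.last_set_table_luid > insn_luid)
        rsp.last_set_table_luid = insn_luid;
    }
  else
    {
      // Assumed behaviour for the first use seen in a block.
      rsp.last_set_table_tick = label_tick;
      rsp.last_set_table_luid = insn_luid;
    }
}

// The relaxed invalidation condition, for a set by an insn with luid
// SET_LUID in the block identified by LABEL_TICK.
bool set_is_invalid (const reg_stat_model &rsp, bool has_value,
                     int label_tick, int label_tick_ebb_start, int set_luid)
{
  return has_value
         && rsp.last_set_table_tick >= label_tick_ebb_start
         && !(label_tick == rsp.last_set_table_tick
              && set_luid < rsp.last_set_table_luid);
}

void check_luid_model ()
{
  reg_stat_model rsp;
  note_use (rsp, 5, 30);   // insn x uses regX
  note_use (rsp, 5, 20);   // after combining, insn 2 uses regX as well
  assert (rsp.last_set_table_luid == 20);
  assert (set_is_invalid (rsp, true, 5, 5, 25));   // set after the first use
  assert (!set_is_invalid (rsp, true, 5, 5, 10));  // set before every use
}
```

In this model a set at luid 25 is still invalidated because the recorded
luid dropped to 20; with the originally recorded luid 30 it would have been
accepted, which is the case the smallest-luid update is meant to catch.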

>> The last_set invalidation is going
>> to catch the case like:
>>
>>... regX  // avoid the set used here ...
>>regX = ...
>>...
>>
>> Once we have the smallest luid one of all insns which use register X,
>> any unsafe regX sets should be caught.
> 
> Yes, you invalidate more, but because you put lies in the table :-(
> 

This patch tries to relax some restrictions, it seems there are no lies.  :)
Could you help to explain this comment more?

>>  * combine.c (struct reg_stat_type): New member
>>  last_set_table_luid.
> 
> This fits on one line.
> 
>>  (update_table_tick): Add one argument for insn luid and
>>  set last_set_table_luid with it, remove its declaration.
>>  (record_value_for_reg): Adjust the condition to set
>>  last_set_invalid nonzero.
> 
> These lines break earlier than they should as well.
> 
>> +  /* Record the luid of the insn which uses register n, the insn should
>> + be the first one using register n in that block of the insn which
>> + last_set_table_tick was set for.  */
>> +
>> +  int   last_set_table_luid;
> 
> I'm not sure what this variable is for.  The comment says something
> else than the variable name does, and now I don't know what to
> believe :-)
> 
> The name says it is for a SET, the explanation says it is for a USE.
> 

Good point.  :)  For the existing last_set_table_tick,
  /* Record the value of label_tick when an expression involving register n
 is placed in last_set_value.  */

  int   last_set_table_tick;

it seems it has the set in the name too, but "an expression involving
register n " is actually a reference (use) to register n?

How about the below one referring to "last_set_table_tick"?

/* Record the smallest luid of the insns whose expressions involving
   register n are placed in last_set_value; only insns located in the
   same block as the insn for which last_set_table_tick was set count.  */

BR,
Kewen

Re: [PATCH] Remove can_throw_non_call_exceptions special case from operator_div::wi_fold.

2021-11-30 Thread Aldy Hernandez via Gcc-patches

On Tue, Nov 30, 2021 at 8:37 AM Richard Biener
 wrote:
>
> On Mon, Nov 29, 2021 at 4:24 PM Aldy Hernandez  wrote:
> >
> > On Mon, Nov 29, 2021 at 3:48 PM Richard Biener
> >  wrote:
> > >
> > > On Mon, Nov 29, 2021 at 3:39 PM Jeff Law  wrote:
> > > >
> > > >
> > > >
> > > > On 11/29/2021 7:00 AM, Aldy Hernandez via Gcc-patches wrote:
> > > > > As discussed in the PR.  The code makes no difference, so whatever 
> > > > > test
> > > > > we added this special case for has been fixed or is being papered 
> > > > > over.
> > > > > I think we should fix any fall out upstream.
> > > > >
> > > > > [Unless Andrew can remember why we added this and it still applies.]
> > > > >
> > > > > Tested on x86-64 Linux.
> > > > >
> > > > > OK for trunk?
> > > > >
> > > > >   PR 103451
> > > > >
> > > > > gcc/ChangeLog:
> > > > >
> > > > >   * range-op.cc (operator_div::wi_fold): Remove
> > > > >   can_throw_non_call_exceptions special case.
> > > > >
> > > > > gcc/testsuite/ChangeLog:
> > > > >
> > > > >   * gcc.dg/pr103451.c: New test.
> > > > I'll defer to Andrew, but it seems wrong to me.  The whole point is to
> > > > set the result to varying so that we don't know the result and never
> > > > remove the division which is critical for -fnon-call-exceptions.
> > >
> > > But that has nothing to do with computing the value range for
> > > the result which is only accessible when the stmt does _not_ throw ...
> > >
> > > That is, if we compute non-VARYING here and because of that
> > > remove the stmt then _that's_ the place to fix (IMO)
> >
> > Ughh, I think you're both right.
> >
> > We should fix this upstream AND we should test for the presence of the
> > division by 0 in the optimized dump.
> >
> > Of course doing both opens a can of worms.  The division by zero can
> > be cleaned up by (at least) DCE, DSE, and the code sinking passes.
> > I've fixed all 3 in the attached (untested) patch.  Dunno what y'all
> > want to do at this point.
>
> I think you need to add -fno-delete-dead-exceptions to the testcase.
> The sinking
> bug looks real, but just
>
>  && (cfun->can_delete_dead_exceptions
> || !stmt_could_throw_p (cfun, stmt))
>
> is needed there.  That change is OK.

Did you mean the entire patch (as attached) is OK, or just the sink part?

Thanks.
Aldy
From e0abd7b05709e41b2e2fda5bde7f5802c6d953ef Mon Sep 17 00:00:00 2001
From: Aldy Hernandez 
Date: Mon, 29 Nov 2021 12:52:45 +0100
Subject: [PATCH] Remove can_throw_non_call_exceptions special case from
 operator_div::wi_fold.

	PR 103451

gcc/ChangeLog:

	* range-op.cc (operator_div::wi_fold): Remove
	can_throw_non_call_exceptions special case.
	* tree-ssa-dce.c (mark_stmt_if_obviously_necessary): Check for
	can_throw_non_call_exceptions.
	* tree-ssa-dse.c (pass_dse::execute): Same.
	* tree-ssa-sink.c (sink_code_in_bb): Same.

gcc/testsuite/ChangeLog:

	* gcc.dg/pr103451.c: New test.
---
 gcc/range-op.cc |  7 ---
 gcc/testsuite/gcc.dg/pr103451.c | 19 +++
 gcc/tree-ssa-dce.c  |  3 ++-
 gcc/tree-ssa-dse.c  |  3 ++-
 gcc/tree-ssa-sink.c |  4 +++-
 5 files changed, 26 insertions(+), 10 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/pr103451.c

diff --git a/gcc/range-op.cc b/gcc/range-op.cc
index bbf2924f815..6fe5f1cb4e0 100644
--- a/gcc/range-op.cc
+++ b/gcc/range-op.cc
@@ -1832,13 +1832,6 @@ operator_div::wi_fold (irange &r, tree type,
   return;
 }
 
-  // If flag_non_call_exceptions, we must not eliminate a division by zero.
-  if (cfun->can_throw_non_call_exceptions)
-{
-  r.set_varying (type);
-  return;
-}
-
   // If we're definitely dividing by zero, there's nothing to do.
   if (wi_zero_p (type, divisor_min, divisor_max))
 {
diff --git a/gcc/testsuite/gcc.dg/pr103451.c b/gcc/testsuite/gcc.dg/pr103451.c
new file mode 100644
index 000..c701934603e
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr103451.c
@@ -0,0 +1,19 @@
+// { dg-do compile }
+// { dg-options "-O2 -w -fnon-call-exceptions -fno-delete-dead-exceptions -fdump-tree-optimized" }
+
+int func_10_ptr_12;
+
+void func_10(long li_8) 
+{
+  long *ptr_9 = &li_8;
+  li_8 &= *ptr_9 / 0 ?: li_8;
+  for (;;)
+func_10_ptr_12 &= 4 ? *ptr_9 : 4;
+}
+
+void func_9_s_8() 
+{ 
+  func_10(func_9_s_8); 
+}
+
+// { dg-final { scan-tree-dump " / 0" "optimized" } }
diff --git a/gcc/tree-ssa-dce.c b/gcc/tree-ssa-dce.c
index 1f817b95fab..1c1a5cc0811 100644
--- a/gcc/tree-ssa-dce.c
+++ b/gcc/tree-ssa-dce.c
@@ -304,7 +304,8 @@ mark_stmt_if_obviously_necessary (gimple *stmt, bool aggressive)
   /* If a statement could throw, it can be deemed necessary unless we
  are allowed to remove dead EH.  Test this after checking for
  new/delete operators since we always elide their EH.  */
-  if (!cfun->can_delete_dead_exceptions
+  if ((!cfun->can_delete_dead_exceptions
+   || cfun->can_throw_non_call_exceptions)
   && stmt_could_throw_p (cfun, stmt))
 {
   mark_stmt

[PATCH] libstdc++: Add [[nodiscard]] to std::byteswap

Hi!

This patch adds [[nodiscard]] to std::byteswap, because the function
template doesn't do anything useful if the result isn't used.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
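
As an illustration of what the attribute buys, here is a small stand-in
rather than the libstdc++ code: byteswap32 is a hypothetical 32-bit-only
helper.  It needs C++17 for [[nodiscard]]; calling it as a bare statement
and ignoring the result would then be diagnosed.

```cpp
#include <cstdint>

// Hypothetical stand-in for std::byteswap, restricted to 32-bit values.
[[nodiscard]] constexpr std::uint32_t byteswap32 (std::uint32_t v) noexcept
{
  return (v >> 24)
         | ((v >> 8) & 0x0000FF00u)
         | ((v << 8) & 0x00FF0000u)
         | (v << 24);
}

// The result is used here, so no diagnostic is expected.
static_assert (byteswap32 (0x11223344u) == 0x44332211u, "bytes reversed");
```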

2021-11-30  Jakub Jelinek  

* include/std/bit (byteswap): Add [[nodiscard]].

--- libstdc++-v3/include/std/bit.jj 2021-11-28 16:32:15.204524854 +0100
+++ libstdc++-v3/include/std/bit 2021-11-29 17:40:00.781074520 +0100
@@ -83,6 +83,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 
   /// Reverse order of bytes in the object representation of `value`.
   template<typename _Tp>
+[[nodiscard]]
 constexpr enable_if_t<is_integral<_Tp>::value, _Tp>
 byteswap(_Tp __value) noexcept
 {

Jakub

Re: [PATCH] Remove can_throw_non_call_exceptions special case from operator_div::wi_fold.

On Tue, Nov 30, 2021 at 9:51 AM Aldy Hernandez  wrote:
>
> On Tue, Nov 30, 2021 at 8:37 AM Richard Biener
>  wrote:
> >
> > On Mon, Nov 29, 2021 at 4:24 PM Aldy Hernandez  wrote:
> > >
> > > On Mon, Nov 29, 2021 at 3:48 PM Richard Biener
> > >  wrote:
> > > >
> > > > On Mon, Nov 29, 2021 at 3:39 PM Jeff Law  wrote:
> > > > >
> > > > >
> > > > >
> > > > > On 11/29/2021 7:00 AM, Aldy Hernandez via Gcc-patches wrote:
> > > > > > As discussed in the PR.  The code makes no difference, so whatever 
> > > > > > test
> > > > > > we added this special case for has been fixed or is being papered 
> > > > > > over.
> > > > > > I think we should fix any fall out upstream.
> > > > > >
> > > > > > [Unless Andrew can remember why we added this and it still applies.]
> > > > > >
> > > > > > Tested on x86-64 Linux.
> > > > > >
> > > > > > OK for trunk?
> > > > > >
> > > > > >   PR 103451
> > > > > >
> > > > > > gcc/ChangeLog:
> > > > > >
> > > > > >   * range-op.cc (operator_div::wi_fold): Remove
> > > > > >   can_throw_non_call_exceptions special case.
> > > > > >
> > > > > > gcc/testsuite/ChangeLog:
> > > > > >
> > > > > >   * gcc.dg/pr103451.c: New test.
> > > > > I'll defer to Andrew, but it seems wrong to me.  The whole point is to
> > > > > set the result to varying so that we don't know the result and never
> > > > > remove the division which is critical for -fnon-call-exceptions.
> > > >
> > > > But that has nothing to do with computing the value range for
> > > > the result which is only accessible when the stmt does _not_ throw ...
> > > >
> > > > That is, if we compute non-VARYING here and because of that
> > > > remove the stmt then _that's_ the place to fix (IMO)
> > >
> > > Ughh, I think you're both right.
> > >
> > > We should fix this upstream AND we should test for the presence of the
> > > division by 0 in the optimized dump.
> > >
> > > Of course doing both opens a can of worms.  The division by zero can
> > > be cleaned up by (at least) DCE, DSE, and the code sinking passes.
> > > I've fixed all 3 in the attached (untested) patch.  Dunno what y'all
> > > want to do at this point.
> >
> > I think you need to add -fno-delete-dead-exceptions to the testcase.
> > The sinking
> > bug looks real, but just
> >
> >  && (cfun->can_delete_dead_exceptions
> > || !stmt_could_throw_p (cfun, stmt))
> >
> > is needed there.  That change is OK.
>
> Did you mean the entire patch (as attached) is OK, or just the sink part?

The DCE and DSE parts are wrong and not needed.  The remaining pieces
are OK.

Thanks,
Richard.

> Thanks.
> Aldy
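
The guard quoted above can be read as a small truth table.  The sketch
below replaces the two GCC predicates with plain booleans; eh_guard_allows
is a made-up name, and how the surrounding condition in the sinking pass
uses the result is not shown here.

```cpp
#include <cassert>

// Model of: cfun->can_delete_dead_exceptions || !stmt_could_throw_p (...)
bool eh_guard_allows (bool can_delete_dead_exceptions, bool stmt_could_throw)
{
  return can_delete_dead_exceptions || !stmt_could_throw;
}

void check_eh_guard ()
{
  // A possibly throwing stmt is only let through if dead EH may be deleted.
  assert (!eh_guard_allows (false, true));
  assert (eh_guard_allows (true, true));
  assert (eh_guard_allows (false, false));
  assert (eh_guard_allows (true, false));
}
```

With -fnon-call-exceptions -fno-delete-dead-exceptions, as in the testcase,
a division that could trap corresponds to the first row.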

Re: [PATCH] Remove can_throw_non_call_exceptions special case from operator_div::wi_fold.

2021-11-30 Thread Aldy Hernandez via Gcc-patches

Will adjust, re-test and commit.

Thanks.
Aldy

Fix PR target/103274

2021-11-30 Thread Eric Botcazou via Gcc-patches

This fixes a thinko in my fix for the -freorder-blocks-and-partition glitch 
with SEH on 64-bit Windows:
  https://gcc.gnu.org/pipermail/gcc-patches/2021-February/565208.html

Even if no exceptions are active, e.g. in C, we always need to consider calls.

Tested on x86-64/Windows, applied on mainline, 11 and 10 branches as obvious.
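
The changed condition in i386_pe_seh_unwind_emit can be modelled outside the compiler. This is only a sketch: `struct insn_model` and `needs_nop_before_seh_endproc` are hypothetical names, and the two booleans stand in for GCC's CALL_P (prev) and !insn_nothrow_p (prev) tests.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the previous active instruction.  */
struct insn_model {
  bool is_call;    /* models CALL_P (prev) */
  bool may_throw;  /* models !insn_nothrow_p (prev) */
};

/* Return true if a nop should be emitted before .seh_endproc when
   switching to the cold section.  Before the fix only may_throw was
   tested, so a call in code without active exceptions (e.g. C) was
   missed.  */
bool
needs_nop_before_seh_endproc (const struct insn_model *prev)
{
  return prev != NULL && (prev->is_call || prev->may_throw);
}
```

With this model, a plain call that cannot throw still requests the nop, which is the case the fix adds.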


2021-11-30  Eric Botcazou 

PR target/103274
* config/i386/i386.c (ix86_output_call_insn): Beef up comment about
nops emitted with SEH.
* config/i386/winnt.c (i386_pe_seh_unwind_emit): When switching to
the cold section, emit a nop before the directive if the previous
active instruction is a call.

-- 
Eric Botcazou
diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 2657e7817ae..0e6bf3e0fef 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -16438,8 +16438,10 @@ ix86_output_call_insn (rtx_insn *insn, rtx call_op)
 	break;
 
 	  /* If we get to the epilogue note, prevent a catch region from
-	 being adjacent to the standard epilogue sequence.  If non-
-	 call-exceptions, we'll have done this during epilogue emission. */
+	 being adjacent to the standard epilogue sequence.  Note that,
+	 if non-call exceptions are enabled, we already did it during
+	 epilogue expansion, or else, if the insn can throw internally,
+	 we already did it during the reorg pass.  */
 	  if (NOTE_P (i) && NOTE_KIND (i) == NOTE_INSN_EPILOGUE_BEG
 	  && !flag_non_call_exceptions
 	  && !can_throw_internal (insn))
diff --git a/gcc/config/i386/winnt.c b/gcc/config/i386/winnt.c
index 7c0ea4f731c..0aaf46f050a 100644
--- a/gcc/config/i386/winnt.c
+++ b/gcc/config/i386/winnt.c
@@ -1243,9 +1243,9 @@ i386_pe_seh_unwind_emit (FILE *out_file, rtx_insn *insn)
   seh = cfun->machine->seh;
   if (NOTE_P (insn) && NOTE_KIND (insn) == NOTE_INSN_SWITCH_TEXT_SECTIONS)
 {
-  /* See ix86_seh_fixup_eh_fallthru for the rationale.  */
+  /* See ix86_output_call_insn/seh_fixup_eh_fallthru for the rationale.  */
   rtx_insn *prev = prev_active_insn (insn);
-  if (prev && !insn_nothrow_p (prev))
+  if (prev && (CALL_P (prev) || !insn_nothrow_p (prev)))
 	fputs ("\tnop\n", out_file);
   fputs ("\t.seh_endproc\n", out_file);
   seh->in_cold_section = true;

[PATCH] simplify-rtx: Punt on simplify_associative_operation with large operands [PR102356]

Hi!

Seems simplify_associative_operation is quadratic, which isn't a big deal
for use during combine and other similar RTL passes, because those never
try to combine expressions from more than a few instructions and, because
those instructions need to be recognized, the machine description also bounds
how many expressions can appear in there.
var-tracking has depth limits only for some cases and unlimited depth
for the vt_expand_loc though:
/* This is the value used during expansion of locations.  We want it
   to be unbounded, so that variables expanded deep in a recursion
   nest are fully evaluated, so that their values are cached
   correctly.  We avoid recursion cycles through other means, and we
   don't unshare RTL, so excess complexity is not a problem.  */
#define EXPR_DEPTH (INT_MAX)
/* We use this to keep too-complex expressions from being emitted as
   location notes, and then to debug information.  Users can trade
   compile time for ridiculously complex expressions, although they're
   seldom useful, and they may often have to be discarded as not
   representable anyway.  */
#define EXPR_USE_DEPTH (param_max_vartrack_expr_depth)

IMO for very large expressions it isn't worth trying to reassociate though,
in fact e.g. for the new testcase below keeping it as is has bigger chance
of generating smaller debug info which the dwarf2out.c part of the change
tries to achieve - if a binary operation has the same operands, we can
use DW_OP_dup and not bother computing the possibly large operand again.

This patch punts if the operands being reassociated together contain more
than 64 operations with the same code, which can happen only during
var-tracking.
During bootstrap/regtest on x86_64-linux and i686-linux, this triggers
only on the new testcase and on gcc.dg/torture/pr88597.c.
I think given the 16 element static buffer in subrtx_iterator::array_type
it shouldn't slow down the common case of small expressions, but have
been wondering whether we shouldn't have some in_vartrack global flag
or guard it with
(current_pass && strcmp (current_pass->name, "vartrack") == 0).
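
The bounded size check described above can be illustrated outside GCC. A minimal sketch in plain C, assuming a hypothetical binary `struct expr` tree instead of rtx and the subrtx iterator; `count_same_code` and `too_large_to_reassociate` are illustrative names, and only the limit of 64 comes from the patch.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for an RTL expression: a binary tree whose
   nodes carry an operation code.  */
enum code { LEAF, PLUS, MULT };

struct expr {
  enum code code;
  const struct expr *op0, *op1;
};

/* Add to *COUNT the number of nodes in X with code CODE, descending
   only through nodes of that code (like skip_subrtxes in the patch).
   Return false as soon as more than LIMIT such nodes were seen.  */
static bool
count_same_code (const struct expr *x, enum code code,
                 unsigned *count, unsigned limit)
{
  if (x == NULL || x->code != code)
    return true;
  if ((*count)++ >= limit)
    return false;
  return count_same_code (x->op0, code, count, limit)
         && count_same_code (x->op1, code, count, limit);
}

/* Return true if OP0 and OP1 together hold more than 64 CODE nodes.  */
bool
too_large_to_reassociate (const struct expr *op0, const struct expr *op1,
                          enum code code)
{
  unsigned count = 0;
  return !(count_same_code (op0, code, &count, 64)
           && count_same_code (op1, code, &count, 64));
}
```

Because the walk stops at the limit, the check itself costs at most a constant amount of work per call, even when shared operands (as in repeated squaring) make the expanded expression exponentially large.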

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

Another possibility to deal with the power expressions in debug info
would be to introduce some new RTL operation for the pow{,i} (x, n)
case, allow that solely in debug insns and expand those into DWARF
using a loop.  But that seems like quite a lot of work for something rarely
used (especially when powi for larger n is only useful for 0 and 1 inputs
because anything else overflows).

2021-11-30  Jakub Jelinek  

PR rtl-optimization/102356
* simplify-rtx.c: Include rtl-iter.h.
(simplify_associative_operation): Don't reassociate very large
expressions with 64 or more CODE subrtxes in the operands.
* dwarf2out.c (mem_loc_descriptor): Optimize binary operation
with both operands the same using DW_OP_dup.

* gcc.dg/pr102356.c: New test.

--- gcc/simplify-rtx.c.jj   2021-11-05 00:43:22.576624649 +0100
+++ gcc/simplify-rtx.c  2021-11-29 19:46:29.674750656 +0100
@@ -37,6 +37,7 @@ along with GCC; see the file COPYING3.
 #include "selftest-rtl.h"
 #include "rtx-vector-builder.h"
 #include "rtlanal.h"
+#include "rtl-iter.h"
 
 /* Simplification and canonicalization of RTL.  */
 
@@ -2263,9 +2264,40 @@ simplify_context::simplify_associative_o
 {
   rtx tem;
 
+  if (GET_CODE (op0) == code || GET_CODE (op1) == code)
+    {
+      /* During vartrack, the expressions can grow arbitrarily large.
+         Reassociation isn't really useful for larger expressions
+         and can be very compile time expensive.  */
+      unsigned count = 0;
+      subrtx_iterator::array_type array;
+      FOR_EACH_SUBRTX (iter, array, op0, NONCONST)
+        {
+          const_rtx x = *iter;
+          if (GET_CODE (x) == code)
+            {
+              if (count++ >= 64)
+                return NULL_RTX;
+            }
+          else
+            iter.skip_subrtxes ();
+        }
+      FOR_EACH_SUBRTX (iter, array, op1, NONCONST)
+        {
+          const_rtx x = *iter;
+          if (GET_CODE (x) == code)
+            {
+              if (count++ >= 64)
+                return NULL_RTX;
+            }
+          else
+            iter.skip_subrtxes ();
+        }
+    }
+
   /* Linearize the operator to the left.  */
   if (GET_CODE (op1) == code)
 {
   /* "(a op b) op (c op d)" becomes "((a op b) op c) op d)".  */
   if (GET_CODE (op0) == code)
{
--- gcc/dwarf2out.c.jj  2021-11-29 14:24:14.053634713 +0100
+++ gcc/dwarf2out.c 2021-11-29 19:44:54.192113401 +0100
@@ -16363,6 +16363,15 @@ mem_loc_descriptor (rtx rtl, machine_mod
 do_binop:
   op0 = mem_loc_descriptor (XEXP (rtl, 0), mode, mem_mode,
VAR_INIT_STATUS_INITIALIZED);
+  if (XEXP (rtl, 0) == XEXP (rtl, 1))
+   {
+ if (op0 == 0)
+   break;
+ mem_loc_result = op0;
+ add_loc_descr (&mem_loc_result, new_loc_descr (DW_OP

Re: [PATCH] Fix --help -Q output

On Mon, Nov 29, 2021 at 4:16 PM Martin Liška  wrote:
>
> There are cases where a default option value is -1 and
> auto-detection happens in e.g. target.
>
> Do not print these options. Leads to the following diff:
>
> -  -fdelete-null-pointer-checks [enabled]
> +  -fdelete-null-pointer-checks
> @@ -332 +332 @@
> -  -fleading-underscore [enabled]
> +  -fleading-underscore
> @@ -393 +393 @@
> -  -fprefetch-loop-arrays   [enabled]
> +  -fprefetch-loop-arrays
> @@ -502 +502 @@
> -  -fstrict-volatile-bitfields  [enabled]
> +  -fstrict-volatile-bitfields
> @@ -533 +533 @@
> -  -ftree-loop-if-convert   [enabled]
> +  -ftree-loop-if-convert
>
> Patch can bootstrap on x86_64-linux-gnu and survives regression tests.
>
> Ready to be installed?
> Thanks,
> Martin
>
> PR middle-end/103438
>
> gcc/ChangeLog:
>
> * opts-common.c (option_enabled): Return flag_var for BOOLEAN
> types (they can contain an unknown value -1).
> ---
>   gcc/opts-common.c | 7 ++++---
>   1 file changed, 4 insertions(+), 3 deletions(-)
>
> diff --git a/gcc/opts-common.c b/gcc/opts-common.c
> index 9d1914ff2ff..c4a19b9a0b6 100644
> --- a/gcc/opts-common.c
> +++ b/gcc/opts-common.c
> @@ -1586,7 +1586,8 @@ option_flag_var (int opt_index, struct gcc_options 
> *opts)
>   }
>
>   /* Return 1 if option OPT_IDX is enabled in OPTS, 0 if it is disabled,
> -   or -1 if it isn't a simple on-off switch.  */
> +   or -1 if it isn't a simple on-off switch (or if the value is unknown,
> +   typically set later in target).  */
>
>   int
>   option_enabled (int opt_idx, unsigned lang_mask, void *opts)
> @@ -1608,9 +1609,9 @@ option_enabled (int opt_idx, unsigned lang_mask, void 
> *opts)
> {
> case CLVC_BOOLEAN:
> if (option->cl_host_wide_int)
> - return *(HOST_WIDE_INT *) flag_var != 0;
> + return *(HOST_WIDE_INT *) flag_var;
> else
> - return *(int *) flag_var != 0;
> + return *(int *) flag_var;

So can we assert that only 1, 0 and -1 are returned here?  For example I see
(randomly picked)

 /* [64] = */ {
"--param=align-threshold=",
"Select fraction of the maximal frequency of executions of basic
block in function given basic block get alignment.",
NULL,
NULL,
NULL, NULL, N_OPTS, N_OPTS, 23, /* .neg_idx = */ -1,
CL_COMMON | CL_JOINED | CL_OPTIMIZATION | CL_PARAMS,
0, 0, 0, 0, 0, 0, 0, 0, 1 /* UInteger */, 0, 0, 0,
offsetof (struct gcc_options, x_param_align_threshold), 0,
CLVC_BOOLEAN, 0, 1, 65536 },

or

 /* [877] = */ {
"-faligned-new=",
"-faligned-new=  Use C++17 over-aligned type allocation for
alignments greater than N.",
NULL,
NULL,
NULL, NULL, N_OPTS, N_OPTS, 13, /* .neg_idx = */ -1,
CL_CXX | CL_ObjCXX | CL_JOINED,
0, 0, 0, 0, 0, 0, 1 /* RejectNegative */, 0, 1 /* UInteger */, 0, 0, 0,
offsetof (struct gcc_options, x_aligned_new_threshold), 0,
CLVC_BOOLEAN, 0, -1, -1 },

and the "docs" say

  /* The switch is enabled when FLAG_VAR is nonzero.  */
  CLVC_BOOLEAN,

so a -1 init contradicts this.  I also wonder what determines the
CLVC_* kind; it seems it's simply the "default" kind chosen when nothing
else matches in var_set().
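
To make the concern concrete: returning the raw variable would hand back 65536 for something like the align-threshold entry above. The sketch below is not what the patch does; it shows one way the "only 1, 0 and -1" contract could be kept, with `tristate_from_flag` and `checked_option_state` as hypothetical helper names.

```c
#include <assert.h>

/* Hypothetical helper: map a raw flag variable to the three values
   documented for option_enabled.  */
int
tristate_from_flag (int flag_var)
{
  if (flag_var == -1)
    return -1;           /* unknown, e.g. decided later by the target */
  return flag_var != 0;  /* 1 if enabled, 0 if disabled */
}

/* Wrapper that asserts the contract asked about above.  */
int
checked_option_state (int flag_var)
{
  int state = tristate_from_flag (flag_var);
  assert (state == 1 || state == 0 || state == -1);
  return state;
}
```

Whether -1 should be treated as "unknown" for every CLVC_BOOLEAN variable is exactly the open question here, so the special case is an assumption.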

>
> case CLVC_EQUAL:
> if (option->cl_host_wide_int)
> --
> 2.34.0
>

Re: [PATCH] simplify-rtx: Punt on simplify_associative_operation with large operands [PR102356]

On Tue, 30 Nov 2021, Jakub Jelinek wrote:

> Hi!
> 
> Seems simplify_associative_operation is quadratic, which isn't a big deal
> for use during combine and other similar RTL passes, because those never
> try to combine expressions from more than a few instructions and, because
> those instructions need to be recognized, the machine description also bounds
> how many expressions can appear in there.

Well ...

> var-tracking has depth limits only for some cases and unlimited depth
> for the vt_expand_loc though:
> /* This is the value used during expansion of locations.  We want it
>    to be unbounded, so that variables expanded deep in a recursion
>    nest are fully evaluated, so that their values are cached
>    correctly.  We avoid recursion cycles through other means, and we
>    don't unshare RTL, so excess complexity is not a problem.  */
> #define EXPR_DEPTH (INT_MAX)
> /* We use this to keep too-complex expressions from being emitted as
>    location notes, and then to debug information.  Users can trade
>    compile time for ridiculously complex expressions, although they're
>    seldom useful, and they may often have to be discarded as not
>    representable anyway.  */
> #define EXPR_USE_DEPTH (param_max_vartrack_expr_depth)
> 
> IMO for very large expressions it isn't worth trying to reassociate though,
> in fact e.g. for the new testcase below keeping it as is has bigger chance
> of generating smaller debug info which the dwarf2out.c part of the change
> tries to achieve - if a binary operation has the same operands, we can
> use DW_OP_dup and not bother computing the possibly large operand again.
> 
> This patch punts if the associate operands contain together more than
> 64 same operations, which can happen only during var-tracking.
> During bootstrap/regtest on x86_64-linux and i686-linux, this triggers
> only on the new testcase and on gcc.dg/torture/pr88597.c.
> I think given the 16 element static buffer in subrtx_iterator::array_type
> it shouldn't slow down the common case of small expressions, but have
> been wondering whether we shouldn't have some in_vartrack global flag
> or guard it with
> (current_pass && strcmp (current_pass->name, "vartrack") == 0).
> 
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
> 
> Another possibility to deal with the power expressions in debug info
> would be to introduce some new RTL operation for the pow{,i} (x, n)
> case, allow that solely in debug insns and expand those into DWARF
> using a loop.  But that seems like quite a lot of work for something rarely
> used (especially when powi for larger n is only useful for 0 and 1 inputs
> because anything else overflows).

I wonder given we now have 'simplify_context' whether we can
track a re-association budget we can eat from.  At least your
code to determine whether the expression is too large is
quadratic as well (but bound to 64, so just a very large constant
overhead for an outermost expression of size 63).  We already
have a mem_depth there, so just have reassoc_times and punt
if that reaches --param max-simplify-reassoc-times, incrementing
it each time simplify_associative_operation is entered?
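
A rough sketch of such a budget, written as plain C for illustration only. `struct simplify_ctx` is a hypothetical stand-in (GCC's simplify_context is a C++ class with other members), and `reassoc_times` / `max_reassoc_times` simply follow the names floated above.

```c
#include <stdbool.h>

/* Hypothetical context object carrying the reassociation budget.  */
struct simplify_ctx {
  unsigned reassoc_times;      /* attempts charged so far */
  unsigned max_reassoc_times;  /* would come from a --param */
};

/* Charge one reassociation attempt against the budget.  Return false
   once the budget is used up, in which case the caller would punt.  */
bool
reassoc_budget_take (struct simplify_ctx *ctx)
{
  if (ctx->reassoc_times >= ctx->max_reassoc_times)
    return false;
  ctx->reassoc_times++;
  return true;
}
```

The idea would be to call this on entry to simplify_associative_operation, so the cost of deciding to punt is a single comparison instead of a walk over the operands.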

Thanks,
Richard.

> 2021-11-30  Jakub Jelinek  
> 
>   PR rtl-optimization/102356
>   * simplify-rtx.c: Include rtl-iter.h.
>   (simplify_associative_operation): Don't reassociate very large
>   expressions with 64 or more CODE subrtxes in the operands.
>   * dwarf2out.c (mem_loc_descriptor): Optimize binary operation
>   with both operands the same using DW_OP_dup.
> 
>   * gcc.dg/pr102356.c: New test.
> 
> --- gcc/simplify-rtx.c.jj 2021-11-05 00:43:22.576624649 +0100
> +++ gcc/simplify-rtx.c  2021-11-29 19:46:29.674750656 +0100
> @@ -37,6 +37,7 @@ along with GCC; see the file COPYING3.
>  #include "selftest-rtl.h"
>  #include "rtx-vector-builder.h"
>  #include "rtlanal.h"
> +#include "rtl-iter.h"
>  
>  /* Simplification and canonicalization of RTL.  */
>  
> @@ -2263,9 +2264,40 @@ simplify_context::simplify_associative_o
>  {
>rtx tem;
>  
> +  if (GET_CODE (op0) == code || GET_CODE (op1) == code)
> +{
> +  /* During vartrack, the expressions can grow arbitrarily large.
> +  Reassociation isn't really useful for larger expressions
> +  and can be very compile time expensive.  */
> +  unsigned count = 0;
> +  subrtx_iterator::array_type array;
> +  FOR_EACH_SUBRTX (iter, array, op0, NONCONST)
> + {
> +   const_rtx x = *iter;
> +   if (GET_CODE (x) == code)
> + {
> +   if (count++ >= 64)
> + return NULL_RTX;
> + }
> +   else
> + iter.skip_subrtxes ();
> + }
> +  FOR_EACH_SUBRTX (iter, array, op1, NONCONST)
> + {
> +   const_rtx x = *iter;
> +   if (GET_CODE (x) == code)
> + {
> +   if (count++ >= 64)
> + return NULL_RTX;
> + }
> +   else
> + iter.skip_subrtxes ();
> + }
> +}
> +
>

[PATCH] [i386] Fix ICE in ix86_attr_length_immediate_default.

2021-11-30 Thread liuhongt via Gcc-patches

ix86_attr_length_immediate_default assumes that insns of TYPE ishift have
only one constant operand, but
*x86_64_shld_1/*x86_shld_1/*x86_64_shrd_1/*x86_shrd_1 have two, subject to
the condition INTVAL (operands[3]) == 32 - INTVAL (operands[2]) or
INTVAL (operands[3]) == 64 - INTVAL (operands[2]), and so they hit the
gcc_assert.
Explicitly set the length_immediate attribute for these patterns.
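
For readers less familiar with the double-shift patterns, here is a small C model of 32-bit SHLD under my reading of the condition above: the two constants are the left and right shift counts, and the second is fully determined by the first, so only one immediate needs to be encoded. `shld32` is a hypothetical helper, not GCC code.

```c
#include <stdint.h>

/* Model of 32-bit SHLD for counts 1..31: shift HI left by COUNT and
   fill the vacated low bits from the top of LO.  */
uint32_t
shld32 (uint32_t hi, uint32_t lo, unsigned count)
{
  unsigned left = count;        /* plays the role of operands[2] */
  unsigned right = 32 - count;  /* plays the role of operands[3] */
  return (hi << left) | (lo >> right);
}
```

The count must stay within 1..31 in this model, since shifting a 32-bit value by 32 is undefined in C.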

Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}.
Ok for trunk?


gcc/ChangeLog:

PR target/103463
PR target/103484
* config/i386/i386.md (*x86_64_shld_1): Set_attr
length_immediate to 1.
(*x86_shld_1): Ditto.
(*x86_64_shrd_1): Ditto.
(*x86_shrd_1): Ditto.

gcc/testsuite/ChangeLog:

* gcc.target/i386/pr103463.c: New test.
* gcc.target/i386/pr103463-2.c: New test.
---
 gcc/config/i386/i386.md                    |  4 ++++
 gcc/testsuite/gcc.target/i386/pr103463-2.c | 14 ++++++++++++++
 gcc/testsuite/gcc.target/i386/pr103463.c   | 13 +++++++++++++
 3 files changed, 31 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/i386/pr103463-2.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr103463.c

diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
index c88374c9d2b..4e9fae80479 100644
--- a/gcc/config/i386/i386.md
+++ b/gcc/config/i386/i386.md
@@ -11512,6 +11512,7 @@ (define_insn "*x86_64_shld_1"
   [(set_attr "type" "ishift")
(set_attr "prefix_0f" "1")
(set_attr "mode" "DI")
+   (set_attr "length_immediate" "1")
(set_attr "athlon_decode" "vector")
(set_attr "amdfam10_decode" "vector")
(set_attr "bdver1_decode" "vector")])
@@ -11573,6 +11574,7 @@ (define_insn "*x86_shld_1"
   "shld{l}\t{%2, %1, %0|%0, %1, %2}"
   [(set_attr "type" "ishift")
(set_attr "prefix_0f" "1")
+   (set_attr "length_immediate" "1")
(set_attr "mode" "SI")
(set_attr "pent_pair" "np")
(set_attr "athlon_decode" "vector")
@@ -12384,6 +12386,7 @@ (define_insn "*x86_64_shrd_1"
   "shrd{q}\t{%2, %1, %0|%0, %1, %2}"
   [(set_attr "type" "ishift")
(set_attr "prefix_0f" "1")
+   (set_attr "length_immediate" "1")
(set_attr "mode" "DI")
(set_attr "athlon_decode" "vector")
(set_attr "amdfam10_decode" "vector")
@@ -12446,6 +12449,7 @@ (define_insn "*x86_shrd_1"
   "shrd{l}\t{%2, %1, %0|%0, %1, %2}"
   [(set_attr "type" "ishift")
(set_attr "prefix_0f" "1")
+   (set_attr "length_immediate" "1")
(set_attr "mode" "SI")
(set_attr "pent_pair" "np")
(set_attr "athlon_decode" "vector")
diff --git a/gcc/testsuite/gcc.target/i386/pr103463-2.c 
b/gcc/testsuite/gcc.target/i386/pr103463-2.c
new file mode 100644
index 000..9c29b70bbd8
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr103463-2.c
@@ -0,0 +1,14 @@
+/* { dg-do compile { target { ! ia32 } } } */
+/* { dg-options "-O2 -fno-tree-bit-ccp" } */
+
+int foo_u64_1;
+unsigned __int128 foo_u128_1;
+
+void
+foo (void)
+{
+  foo_u128_1 <<= 127;
+  foo_u64_1 += __builtin_sub_overflow_p (0, (long) foo_u128_1, 0);
+  foo_u128_1 =
+foo_u128_1 >> (foo_u128_1 & 127) | foo_u128_1 << (-foo_u128_1 & 127);
+}
diff --git a/gcc/testsuite/gcc.target/i386/pr103463.c 
b/gcc/testsuite/gcc.target/i386/pr103463.c
new file mode 100644
index 000..faae9a858e5
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr103463.c
@@ -0,0 +1,13 @@
+/* { dg-do compile { target { ! ia32 } } } */
+/* { dg-options "-Os -fno-tree-dominator-opts -fno-tree-vrp" } */
+
+int bar0_u8_0, bar0_u16_0, bar0_u32_0, bar0_u16_1, bar0_u32_1;
+unsigned __int128 bar0_u128_0;
+
+int
+bar0() {
+  bar0_u16_1 *=
+  __builtin_add_overflow_p(bar0_u16_0, bar0_u32_1, (long)bar0_u8_0);
+  bar0_u128_0 = bar0_u128_0 >> bar0_u16_1 | bar0_u128_0 << (-bar0_u16_1 & 127);
+  bar0_u128_0 += __builtin_mul_overflow_p(bar0_u32_0, 20, 0);
+}
-- 
2.18.1

Re: [PATCH] [i386] Fix ICE in ix86_attr_length_immediate_default.

2021-11-30 Thread Hongtao Liu via Gcc-patches

On Tue, Nov 30, 2021 at 5:44 PM liuhongt via Gcc-patches
 wrote:
>
> ix86_attr_length_immediate_default assume TYPE ishift only have 1
> constant operand,
> but *x86_64_shld_1/*x86_shld_1/*x86_64_shrd_1/*x86_shrd_1 has 2, with
> condition: INTVAL (operands[3]) == 32 - INTVAL (operands[2]) or
> INTVAL (operands[3]) == 64 - INTVAL (operands[2]), and hit
> gcc_assert.
> Explicitly set_attr length_immediate for these patterns.
>
> Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}.
> Ok for trunk?
>
>
> gcc/ChangeLog:
>
> PR target/103463
> PR target/103484
> * config/i386/i386.md (*x86_64_shld_1): Set_attr
> length_immediate to 1.
> (*x86_shld_1): Ditto.
> (*x86_64_shrd_1): Ditto.
> (*x86_shrd_1): Ditto.
>
> gcc/testsuite/ChangeLog:
>
> * gcc.target/i386/pr103463.c: New test.
> * gcc.target/i386/pr103463-2.c: New test.
> ---
>  gcc/config/i386/i386.md|  4 
>  gcc/testsuite/gcc.target/i386/pr103463-2.c | 14 ++
>  gcc/testsuite/gcc.target/i386/pr103463.c   | 13 +
>  3 files changed, 31 insertions(+)
>  create mode 100644 gcc/testsuite/gcc.target/i386/pr103463-2.c
>  create mode 100644 gcc/testsuite/gcc.target/i386/pr103463.c
>
> diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
> index c88374c9d2b..4e9fae80479 100644
> --- a/gcc/config/i386/i386.md
> +++ b/gcc/config/i386/i386.md
> @@ -11512,6 +11512,7 @@ (define_insn "*x86_64_shld_1"
>[(set_attr "type" "ishift")
> (set_attr "prefix_0f" "1")
> (set_attr "mode" "DI")
> +   (set_attr "length_immediate" "1")
> (set_attr "athlon_decode" "vector")
> (set_attr "amdfam10_decode" "vector")
> (set_attr "bdver1_decode" "vector")])
> @@ -11573,6 +11574,7 @@ (define_insn "*x86_shld_1"
>"shld{l}\t{%2, %1, %0|%0, %1, %2}"
>[(set_attr "type" "ishift")
> (set_attr "prefix_0f" "1")
> +   (set_attr "length_immediate" "1")
> (set_attr "mode" "SI")
> (set_attr "pent_pair" "np")
> (set_attr "athlon_decode" "vector")
> @@ -12384,6 +12386,7 @@ (define_insn "*x86_64_shrd_1"
>"shrd{q}\t{%2, %1, %0|%0, %1, %2}"
>[(set_attr "type" "ishift")
> (set_attr "prefix_0f" "1")
> +   (set_attr "length_immediate" "1")
> (set_attr "mode" "DI")
> (set_attr "athlon_decode" "vector")
> (set_attr "amdfam10_decode" "vector")
> @@ -12446,6 +12449,7 @@ (define_insn "*x86_shrd_1"
>"shrd{l}\t{%2, %1, %0|%0, %1, %2}"
>[(set_attr "type" "ishift")
> (set_attr "prefix_0f" "1")
> +   (set_attr "length_immediate" "1")
> (set_attr "mode" "SI")
> (set_attr "pent_pair" "np")
> (set_attr "athlon_decode" "vector")
> diff --git a/gcc/testsuite/gcc.target/i386/pr103463-2.c 
> b/gcc/testsuite/gcc.target/i386/pr103463-2.c
> new file mode 100644
> index 000..9c29b70bbd8
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/i386/pr103463-2.c
> @@ -0,0 +1,14 @@
> +/* { dg-do compile { target { ! ia32 } } } */
> +/* { dg-options "-O2 -fno-tree-bit-ccp" } */
> +
> +int foo_u64_1;
> +unsigned __int128 foo_u128_1;
> +
> +void
> +foo (void)
> +{
> +  foo_u128_1 <<= 127;
> +  foo_u64_1 += __builtin_sub_overflow_p (0, (long) foo_u128_1, 0);
> +  foo_u128_1 =
> +foo_u128_1 >> (foo_u128_1 & 127) | foo_u128_1 << (-foo_u128_1 & 127);
> +}
> diff --git a/gcc/testsuite/gcc.target/i386/pr103463.c 
> b/gcc/testsuite/gcc.target/i386/pr103463.c
> new file mode 100644
> index 000..faae9a858e5
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/i386/pr103463.c
> @@ -0,0 +1,13 @@
> +/* { dg-do compile { target { ! ia32 } } } */
> +/* { dg-options "-Os -fno-tree-dominator-opts -fno-tree-vrp" } */
> +
> +int bar0_u8_0, bar0_u16_0, bar0_u32_0, bar0_u16_1, bar0_u32_1;
> +unsigned __int128 bar0_u128_0;
> +
> +int
> +bar0() {
> +  bar0_u16_1 *=
> +  __builtin_add_overflow_p(bar0_u16_0, bar0_u32_1, (long)bar0_u8_0);
> +  bar0_u128_0 = bar0_u128_0 >> bar0_u16_1 | bar0_u128_0 << (-bar0_u16_1 & 
> 127);
> +  bar0_u128_0 += __builtin_mul_overflow_p(bar0_u32_0, 20, 0);
> +}
> --
> 2.18.1
>


-- 
BR,
Hongtao

Re: [PATCH] Final value replacement improvements for until-wrap loops.

On Mon, Nov 29, 2021 at 10:07 AM Roger Sayle  wrote:
>
>
> This middle-end patch is inspired by Richard Biener's until-wrap
> loop example in PR tree-optimization/101145.
>
> unsigned foo(unsigned val, unsigned start)
> {
>   unsigned cnt = 0;
>   for (unsigned i = start; i > val; ++i)
> cnt++;
>   return cnt;
> }
>
> For this loop, the tree optimizers currently generate:
>
> unsigned int foo (unsigned int val, unsigned int start)
> {
>   unsigned int cnt;
>   unsigned int _1;
>   unsigned int _5;
>
>[local count: 118111600]:
>   if (start_3(D) > val_4(D))
> goto ; [89.00%]
>   else
> goto ; [11.00%]
>
>[local count: 105119324]:
>   _1 = start_3(D) + 1;
>   _5 = -start_3(D);
>   cnt_2 = _1 > val_4(D) ? _5 : 1;
>
>[local count: 118111600]:
>   # cnt_11 = PHI 
>   return cnt_11;
> }
>
> or perhaps slightly easier to read:
>
>   if (start > val) {
> cnt = (start+1) > val ? -start : 1;
>   } else cnt = 0;
>
> In this snippet, if we know start > val, then (start+1) > val
> unless start+1 overflows, i.e. (start+1) == 0 and start == ~0.
> We can use this (loop header) context to simplify the ternary
> expression to "(start != -1) ? -start : 1", which with a little
> help from match.pd can be folded to -start.  Hence the optimal
> final value replacement should be:
>
>   cnt = (start > val) ? -start : 0;
>
> Or as now generated by this patch:
>
> unsigned int foo (unsigned int val, unsigned int start)
> {
>   unsigned int cnt;
>
>[local count: 118111600]:
>   if (start_3(D) > val_4(D))
> goto ; [89.00%]
>   else
> goto ; [11.00%]
>
>[local count: 105119324]:
>   cnt_2 = -start_3(D);
>
>[local count: 118111600]:
>   # cnt_11 = PHI 
>   return cnt_11;
> }
>
>
> We can also improve until-wrap loops that don't have a (suitable) loop
> header, as determined by simplify_using_initial_conditions.
>
> unsigned bar(unsigned val, unsigned start)
> {
>   unsigned cnt = 0;
>   unsigned i = start;
>   do {
> cnt++;
> i++;
>   } while (i > val);
>   return cnt;
> }
>
> which is currently optimized to:
>
> unsigned int foo (unsigned int val, unsigned int start)
> {
>   unsigned int cnt;
>   unsigned int _9;
>   unsigned int _10;
>
>[local count: 118111600]:
>   _9 = start_4(D) + 1;
>   _10 = -start_4(D);
>   cnt_3 = val_7(D) < _9 ? _10 : 1;
>   return cnt_3;
> }
>
> Here we have "val < (start+1) ? -start : 1", which again with the
> help of match.pd can be slightly simplified to "val <= start ? -start : 1"
> when dealing with unsigned types, because at the complicating value where
> start == ~0, we fortunately have -start == 1, hence it doesn't matter
> whether the second or third operand of the ternary operator is returned.
>
> To summarize, this patch (in addition to tweaking may_be_zero in
> number_of_iterations_until_wrap) adds three new constant folding
> transforms to match.pd.
>
> X != C1 ? -X : C2 simplifies to -X when -C1 == C2.
> which is the generalized form of the simplification above.
>
> X != C1 ? ~X : C2 simplifies to ~X when ~C1 == C2.
> which is the BIT_NOT_EXPR analog of the NEGATE_EXPR case.
>
> and the "until-wrap final value replacement without context":
>
> (X + 1) > Y ? -X : 1 simplifies to X >= Y ? -X : 1 when
> X is unsigned, as when X + 1 overflows, X is -1, so -X == 1.
>
>
> This patch has been tested on x86_64-pc-linux-gnu with make bootstrap
> and make -k check with no new failures.  Ok for mainline?

+/* X != C1 ? -X : C2 simplifies to -X when -C1 == C2.  */
+(simplify
+ (cond (ne @0 INTEGER_CST@1) (negate@3 @0) INTEGER_CST@2)
+ (if ((!wi::only_sign_bit_p (wi::to_wide (@1))
+   || TYPE_UNSIGNED (type)
+   || TYPE_OVERFLOW_WRAPS (type)
+   || (!TYPE_OVERFLOW_SANITIZED (type)
+  && !TYPE_OVERFLOW_TRAPS (type)
+  && !TYPE_SATURATING (type)))

I'm wondering about TYPE_UNSIGNED && TYPE_SATURATING.
Also I think unsigned cannot trap or be sanitized and with
TYPE_OVERFLOW_WRAPS sanitizing or trapping cannot
be true either.  Also unsigned types always wrap on overflow.  So maybe

   (if (!TYPE_SATURATING (type)
&& (TYPE_OVERFLOW_WRAPS (type)
   || !wi::only_sign_bit_p (..)))

?

+  && wi::eq_p (wi::neg (wi::to_wide (@1)), wi::to_wide (@2)))
+  @3))

+/* (X + 1) > Y ? -X : 1 simplifies to X >= Y ? -X : 1 when
+   X is unsigned, as when X + 1 overflows, X is -1, so -X == 1.  */
+(simplify
+ (cond (gt (plus @0 integer_onep) @1) (negate @0) integer_onep@2)
+ (if (TYPE_UNSIGNED (type)
+  && TYPE_UNSIGNED (TREE_TYPE (@0)))

I think the second test is redundant since @0 participates in both
the comparison and the condition value.

+  (cond (ge @0 @1) (negate @0) @2)))

Otherwise looks OK to me.
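
For the wrapping (unsigned) case, the three folds under review can be sanity-checked exhaustively on 8-bit values. This is only a sketch with made-up function names; it says nothing about signed types, where negating the sign-bit-only constant would overflow, which is what the only_sign_bit_p guard is about.

```c
#include <stdint.h>

/* X != C1 ? -X : C2 versus -X, for all 8-bit X and C1 with C2 == -C1.  */
int
negate_fold_holds (void)
{
  for (unsigned c1 = 0; c1 < 256; c1++)
    for (unsigned x = 0; x < 256; x++)
      {
        uint8_t c2 = (uint8_t) (0u - c1);
        uint8_t neg = (uint8_t) (0u - x);
        uint8_t lhs = x != c1 ? neg : c2;
        if (lhs != neg)
          return 0;
      }
  return 1;
}

/* X != C1 ? ~X : C2 versus ~X, with C2 == ~C1.  */
int
bit_not_fold_holds (void)
{
  for (unsigned c1 = 0; c1 < 256; c1++)
    for (unsigned x = 0; x < 256; x++)
      {
        uint8_t c2 = (uint8_t) ~c1;
        uint8_t inv = (uint8_t) ~x;
        uint8_t lhs = x != c1 ? inv : c2;
        if (lhs != inv)
          return 0;
      }
  return 1;
}

/* (X + 1) > Y ? -X : 1 versus X >= Y ? -X : 1.  */
int
until_wrap_fold_holds (void)
{
  for (unsigned x = 0; x < 256; x++)
    for (unsigned y = 0; y < 256; y++)
      {
        uint8_t x1 = (uint8_t) (x + 1);  /* wraps to 0 when x == 255 */
        uint8_t neg = (uint8_t) (0u - x);
        uint8_t lhs = x1 > y ? neg : 1;
        uint8_t rhs = x >= y ? neg : 1;
        if (lhs != rhs)
          return 0;
      }
  return 1;
}
```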

Thanks,
Richard.

>
> 2021-11-29  Roger Sayle  
>
> gcc/ChangeLog
> * tree-ssa-loop-niter.c (number_of_iterations_until_wrap):
> Check if simplify_using_initial_conditions allows us to
> simplify the expression for may_be_zero.
> * match.pd (X != C ? -X : -C -> -X): New transfor

Re: [PATCH 3/4] libgcc: vxcrtstuff.c: make ctor/dtor functions static

2021-11-30 Thread Olivier Hainque via Gcc-patches

Hi Rasmus,

We had something close but slightly different for
the support of shared libraries (for which I'm preparing
the patches). I think your version should work as well
but we have quite a few configurations and the devil is
in the details so I'm testing the effects in a few cases
before approving.

Olivier

> On 1 Nov 2021, at 10:34, Rasmus Villemoes  wrote:
> 
> When the translation unit itself creates pointers to the ctors/dtors
> in a specific section handled by the linker (whether .init_array or
> .ctors.*), there's no reason for the functions to have external
> linkage. That ends up polluting the symbol table in the running
> kernel.
> 
> This makes vxcrtstuff.c on par with the generic crtstuff.c which also
> defines e.g. frame_dummy and __do_global_dtors_aux static.
> 
> libgcc/
>   * config/vxcrtstuff.c: Make constructor and destructor
>   functions static when possible.
> ---
> libgcc/config/vxcrtstuff.c | 10 +++++++---
> 1 file changed, 7 insertions(+), 3 deletions(-)
> 
> diff --git a/libgcc/config/vxcrtstuff.c b/libgcc/config/vxcrtstuff.c
> index c146e1be3be..652a65364b0 100644
> --- a/libgcc/config/vxcrtstuff.c
> +++ b/libgcc/config/vxcrtstuff.c
> @@ -58,14 +58,18 @@ __attribute__((section(__LIBGCC_EH_FRAME_SECTION_NAME__), 
> aligned(4)))
> 
> #define EH_CTOR_NAME _crtbe_register_frame
> #define EH_DTOR_NAME _ctrbe_deregister_frame
> +#define EH_LINKAGE static
> 
> #else
> 
> /* No specific sections for constructors or destructors: we thus use a
>symbol naming convention so that the constructors are then recognized
> -   by munch or whatever tool is used for the final link phase.  */
> +   by munch or whatever tool is used for the final link phase.  Since the
> +   pointers to the constructor/destructor functions are not created in this
> +   translation unit, they must have external linkage.  */
> #define EH_CTOR_NAME _GLOBAL__I_00101_0__crtbe_register_frame
> #define EH_DTOR_NAME _GLOBAL__D_00101_1__crtbe_deregister_frame
> +#define EH_LINKAGE
> 
> #endif
> 
> @@ -88,13 +92,13 @@ __attribute__((section(__LIBGCC_EH_FRAME_SECTION_NAME__), 
> aligned(4)))
> 
> #endif /* USE_INITFINI_ARRAY  */
> 
> -EH_CTOR_ATTRIBUTE void EH_CTOR_NAME (void)
> +EH_LINKAGE EH_CTOR_ATTRIBUTE void EH_CTOR_NAME (void)
> {
>   static struct object object;
>   __register_frame_info (__EH_FRAME_BEGIN__, &object);
> }
> 
> -EH_DTOR_ATTRIBUTE void EH_DTOR_NAME (void)
> +EH_LINKAGE EH_DTOR_ATTRIBUTE void EH_DTOR_NAME (void)
> {
>   __deregister_frame_info (__EH_FRAME_BEGIN__);
> }
> -- 
> 2.31.1
>
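
The rationale in the quoted description can be seen with a generic example. This sketch relies on the GCC/Clang `constructor` attribute on an ELF-style target rather than on the vxcrtstuff.c macros, and all names in it are hypothetical.

```c
/* Set by the constructor below; relies on the GCC/Clang `constructor`
   attribute, which is not ISO C.  */
static int frame_registered;

/* Internal linkage is enough: the pointer to this function is placed
   in the init section from within this translation unit, so no other
   object file needs to see the symbol.  */
static void __attribute__ ((constructor))
register_frame_sketch (void)
{
  frame_registered = 1;
}

/* Report whether the constructor ran before normal code started.  */
int
frame_was_registered (void)
{
  return frame_registered;
}
```

When the constructor is instead found by symbol name at link time (the munch-style case in the patch), the function cannot be static, which is why EH_LINKAGE is left empty in that branch.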

Re: [PATCH] [i386] Fix ICE in ix86_attr_length_immediate_default.

2021-11-30 Thread Uros Bizjak via Gcc-patches

On Tue, Nov 30, 2021 at 10:43 AM liuhongt  wrote:
>
> ix86_attr_length_immediate_default assume TYPE ishift only have 1
> constant operand,
> but *x86_64_shld_1/*x86_shld_1/*x86_64_shrd_1/*x86_shrd_1 has 2, with
> condition: INTVAL (operands[3]) == 32 - INTVAL (operands[2]) or
> INTVAL (operands[3]) == 64 - INTVAL (operands[2]), and hit
> gcc_assert.
> Explicitly set_attr length_immediate for these patterns.
>
> Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}.
> Ok for trunk?
>
>
> gcc/ChangeLog:
>
> PR target/103463
> PR target/103484
> * config/i386/i386.md (*x86_64_shld_1): Set_attr
> length_immediate to 1.
> (*x86_shld_1): Ditto.
> (*x86_64_shrd_1): Ditto.
> (*x86_shrd_1): Ditto.
>
> gcc/testsuite/ChangeLog:
>
> * gcc.target/i386/pr103463.c: New test.
> * gcc.target/i386/pr103463-2.c: New test.

OK with two testsuite adjustments.

Thanks,
Uros.

> ---
>  gcc/config/i386/i386.md|  4 
>  gcc/testsuite/gcc.target/i386/pr103463-2.c | 14 ++
>  gcc/testsuite/gcc.target/i386/pr103463.c   | 13 +
>  3 files changed, 31 insertions(+)
>  create mode 100644 gcc/testsuite/gcc.target/i386/pr103463-2.c
>  create mode 100644 gcc/testsuite/gcc.target/i386/pr103463.c
>
> diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
> index c88374c9d2b..4e9fae80479 100644
> --- a/gcc/config/i386/i386.md
> +++ b/gcc/config/i386/i386.md
> @@ -11512,6 +11512,7 @@ (define_insn "*x86_64_shld_1"
>[(set_attr "type" "ishift")
> (set_attr "prefix_0f" "1")
> (set_attr "mode" "DI")
> +   (set_attr "length_immediate" "1")
> (set_attr "athlon_decode" "vector")
> (set_attr "amdfam10_decode" "vector")
> (set_attr "bdver1_decode" "vector")])
> @@ -11573,6 +11574,7 @@ (define_insn "*x86_shld_1"
>"shld{l}\t{%2, %1, %0|%0, %1, %2}"
>[(set_attr "type" "ishift")
> (set_attr "prefix_0f" "1")
> +   (set_attr "length_immediate" "1")
> (set_attr "mode" "SI")
> (set_attr "pent_pair" "np")
> (set_attr "athlon_decode" "vector")
> @@ -12384,6 +12386,7 @@ (define_insn "*x86_64_shrd_1"
>"shrd{q}\t{%2, %1, %0|%0, %1, %2}"
>[(set_attr "type" "ishift")
> (set_attr "prefix_0f" "1")
> +   (set_attr "length_immediate" "1")
> (set_attr "mode" "DI")
> (set_attr "athlon_decode" "vector")
> (set_attr "amdfam10_decode" "vector")
> @@ -12446,6 +12449,7 @@ (define_insn "*x86_shrd_1"
>"shrd{l}\t{%2, %1, %0|%0, %1, %2}"
>[(set_attr "type" "ishift")
> (set_attr "prefix_0f" "1")
> +   (set_attr "length_immediate" "1")
> (set_attr "mode" "SI")
> (set_attr "pent_pair" "np")
> (set_attr "athlon_decode" "vector")
> diff --git a/gcc/testsuite/gcc.target/i386/pr103463-2.c 
> b/gcc/testsuite/gcc.target/i386/pr103463-2.c
> new file mode 100644
> index 000..9c29b70bbd8
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/i386/pr103463-2.c
> @@ -0,0 +1,14 @@
> +/* { dg-do compile { target { ! ia32 } } } */

Please use { target int128 } here ...

> +/* { dg-options "-O2 -fno-tree-bit-ccp" } */
> +
> +int foo_u64_1;
> +unsigned __int128 foo_u128_1;
> +
> +void
> +foo (void)
> +{
> +  foo_u128_1 <<= 127;
> +  foo_u64_1 += __builtin_sub_overflow_p (0, (long) foo_u128_1, 0);
> +  foo_u128_1 =
> +foo_u128_1 >> (foo_u128_1 & 127) | foo_u128_1 << (-foo_u128_1 & 127);
> +}
> diff --git a/gcc/testsuite/gcc.target/i386/pr103463.c 
> b/gcc/testsuite/gcc.target/i386/pr103463.c
> new file mode 100644
> index 000..faae9a858e5
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/i386/pr103463.c
> @@ -0,0 +1,13 @@
> +/* { dg-do compile { target { ! ia32 } } } */

... and here.

> +/* { dg-options "-Os -fno-tree-dominator-opts -fno-tree-vrp" } */
> +
> +int bar0_u8_0, bar0_u16_0, bar0_u32_0, bar0_u16_1, bar0_u32_1;
> +unsigned __int128 bar0_u128_0;
> +
> +int
> +bar0() {
> +  bar0_u16_1 *=
> +  __builtin_add_overflow_p(bar0_u16_0, bar0_u32_1, (long)bar0_u8_0);
> +  bar0_u128_0 = bar0_u128_0 >> bar0_u16_1 | bar0_u128_0 << (-bar0_u16_1 & 
> 127);
> +  bar0_u128_0 += __builtin_mul_overflow_p(bar0_u32_0, 20, 0);
> +}
> --
> 2.18.1
>

[Committed] PR testsuite/103477: Fix big-endian mistake in new test case.

2021-11-30 Thread Roger Sayle

 

I missed a spot when adding the "#if __BYTE_ORDER__ == ..." guards to
the new test case for PR tree-optimization/103345.  Committed as obvious.

2021-11-30  Roger Sayle  

gcc/testsuite/ChangeLog
	PR testsuite/103477
	* gcc.dg/tree-ssa/pr103345.c: Correct xor test for big-endian.

Roger
--

 

diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr103345.c b/gcc/testsuite/gcc.dg/tree-ssa/pr103345.c
index 94388b541c1..dc8810ab5af 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/pr103345.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr103345.c
@@ -42,10 +42,10 @@ uint32_t load_le_32_xor(const uint8_t *ptr)
  ((uint32_t)ptr[2] << 16) ^
  ((uint32_t)ptr[3] << 24);
 #else
-  return ((uint32_t)ptr[0]) ^
- ((uint32_t)ptr[1] << 8) ^
- ((uint32_t)ptr[2] << 16) ^
- ((uint32_t)ptr[3] << 24);
+  return ((uint32_t)ptr[3]) ^
+ ((uint32_t)ptr[2] << 8) ^
+ ((uint32_t)ptr[1] << 16) ^
+ ((uint32_t)ptr[0] << 24);
 #endif
 }
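
For readers following along, here is a minimal standalone sketch of what
the corrected guard is presumably meant to achieve: the shift-and-XOR
combination should equal a native 32-bit load on either byte order.  The
names load_native_32_xor and load_native_32_memcpy are made up for this
illustration (the in-tree function is load_le_32_xor), and the sketch
relies on the GCC/Clang predefined __BYTE_ORDER__ macro.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper modelled on the test case: combine four bytes
   with shifts and XOR so that the result equals a native 32-bit load
   on either byte order.  */
uint32_t
load_native_32_xor (const uint8_t *ptr)
{
#if __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
  /* Little-endian: ptr[0] is the least significant byte.  */
  return ((uint32_t) ptr[0])
	 ^ ((uint32_t) ptr[1] << 8)
	 ^ ((uint32_t) ptr[2] << 16)
	 ^ ((uint32_t) ptr[3] << 24);
#else
  /* Big-endian: ptr[0] is the most significant byte, so the byte
     indices run in the opposite direction.  */
  return ((uint32_t) ptr[3])
	 ^ ((uint32_t) ptr[2] << 8)
	 ^ ((uint32_t) ptr[1] << 16)
	 ^ ((uint32_t) ptr[0] << 24);
#endif
}

/* Reference: the same value obtained with a plain memory copy.  */
uint32_t
load_native_32_memcpy (const uint8_t *ptr)
{
  uint32_t value;
  memcpy (&value, ptr, sizeof value);
  return value;
}
```

With the uncorrected #else branch the two functions would disagree on a
big-endian target, which is the mistake the patch fixes.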

Re: [PATCH] libstdc++: Add [[nodiscard]] to std::byteswap

On Tue, 30 Nov 2021 at 08:58, Jakub Jelinek via Libstdc++
 wrote:
>
> Hi!
>
> This patch adds [[nodiscard]] to std::byteswap, because the function
> template doesn't do anything useful if the result isn't used.
>
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

Yes, thanks.

Re: [Patch 1/8, Arm, AArch64, GCC] Refactor mbranch-protection option parsing and make it common to AArch32 and AArch64 backends. [Was RE: [Patch 2/7, Arm, GCC] Add option -mbranch-protection.]

2021-11-30 Thread Andrea Corallo via Gcc-patches

Tejas Belagod via Gcc-patches  writes:

> Ping for this series.
>
> Thanks,
> Tejas.

Hi all,

pinging this series.

BR

  Andrea

Re: [PATCH] Loop unswitching: support gswitch statements.

On Mon, Nov 29, 2021 at 1:45 PM Martin Liška  wrote:
>
> On 11/26/21 09:12, Richard Biener wrote:
> > On Wed, Nov 24, 2021 at 3:32 PM Martin Liška  wrote:
> >>
> >> On 11/24/21 15:14, Martin Liška wrote:
> >>> It likely miscompiles gcc.dg/loop-unswitch-5.c, working on that..
> >>
> >> Fixed that in the updated version.
> >
> > Function level comments need updating it seems.
>
> I've done that.
>
> >
> > +static unsigned
> > +evaluate_insns (class loop *loop,  basic_block *bbs,
> > +   predicate_vector &predicate_path,
> > +   auto_bb_flag &reachable_flag)
> > +{
> > +  auto_vec worklist (loop->num_nodes);
> > +  worklist.quick_push (bbs[0]);
> > ...
> >
> > so when adding gswitch support the easiest way to make
> >
> > +  FOR_EACH_EDGE (e, ei, bb->succs)
> > +   {
> > ...
> > +   {
> > + worklist.safe_push (dest);
> > + dest->flags |= reachable_flag;
> >
> > work is when the gcond/gswitch simplification would mark
> > outgoing edges as (non-)executable.  For gswitch this
> > could be achieved by iterating over the case labels and
> > intersecting that with the range while for gcond it's a
> > matter of setting an edge flag instead of returning true/false.
>
> Exactly, it can be quite naturally added to the current patch.
>
> > I'd call the common function evaluate_control_stmt_using_entry_checks
> > or so and invoke it on the last stmt of a block with >= 2 outgoing
> > edges.
>
> Yes, I'll do it for the gswitch support patch.
>
> >
> > We still seem to do the simplification work twice, once for costing
> > and once for transform, but that's OK for now I guess.
> >
> > I think you want to clear_aux_for_blocks at the end of the pass.
>
> Called that.
>
> >
> > Otherwise I like it - it seems you have some TODO around cost
> > modeling.  Did you try to do gswitch support ontop as I suggested
> > to see if the general structure keeps working?
>
> I vanished and tested the patch. No, I don't have the gswitch support patch
> as the current patch was reworked a few times.
>
> Can we please progress and have installed the suggested patch?

I'd like to see the gswitch support - that's what was posted before stage3
close; this patch on its own doesn't seem worth pushing for.  That said,
I have some comments below (and the already raised ones about how
things might need to change with gswitch support).  Is it so difficult to
develop gswitch support as a separate change on top of this?

> Patch can bootstrap on x86_64-linux-gnu and survives regression tests.

+#include 

that's included unconditionally by system.h

+/* The type represents a predicate path leading to a basic block.  */
+typedef auto_vec> predicate_vector;

+static bool tree_unswitch_single_loop (class loop *, int,
+  predicate_vector &predicate_path,

I think we don't want to pass auto_vec by reference, instead auto_vec should
decay to vec<> when passed around.

+  unswitch_predicate *predicate = new unswitch_predicate (cond, lhs);
+  if (irange::supports_type_p (TREE_TYPE (lhs)) && CONSTANT_CLASS_P (rhs))
+{
+  ranger->range_on_edge (predicate->true_range, edge_true, lhs);
+  predicate->false_range = predicate->true_range;

-  return cond;
+  if (!predicate->false_range.varying_p ()
+ && !predicate->false_range.undefined_p ())
+   predicate->false_range.invert ();
+}

is that correct?  I would guess range_on_edge, for

   if (a > 10)
     if (a < 15)
       /* true */
     else
       /* false */

figures [11, 14] on the true edge of if (a < 15) (considered the
unswitch predicate); inverting that yields [0, 10] u [15, +INF], but
that's at least sub-optimal for the else range.  I think we want to
call range_on_edge again to determine the range on the else branch?
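
To make the concern concrete, here is a toy sketch using plain closed
intervals on a signed int.  It is not GCC's irange or ranger API; the
struct and every function name are invented for illustration, and the
lower bound is INT_MIN here rather than the 0 used in the text above.

```c
#include <limits.h>
#include <stdbool.h>

/* A closed integer interval [lo, hi]; a toy stand-in for a value
   range, not GCC's irange class.  */
struct interval { int lo, hi; };

/* Range of 'a' known before the inner test, from 'if (a > 10)'.  */
static const struct interval outer = { 11, INT_MAX };

/* Intersect A and B into OUT.  Returns false if the result is empty.  */
bool
intersect (struct interval a, struct interval b, struct interval *out)
{
  out->lo = a.lo > b.lo ? a.lo : b.lo;
  out->hi = a.hi < b.hi ? a.hi : b.hi;
  return out->lo <= out->hi;
}

/* Range on the true edge of 'if (a < 15)': outer combined with a <= 14.  */
bool
true_edge_range (struct interval *out)
{
  struct interval cond = { INT_MIN, 14 };
  return intersect (outer, cond, out);
}

/* Range on the false edge of 'if (a < 15)': outer combined with a >= 15.  */
bool
false_edge_range (struct interval *out)
{
  struct interval cond = { 15, INT_MAX };
  return intersect (outer, cond, out);
}

/* Is V inside R?  */
bool
in_range (struct interval r, int v)
{
  return v >= r.lo && v <= r.hi;
}

/* Is V inside the complement of R, i.e. what inverting R would give?  */
bool
in_inverted (struct interval r, int v)
{
  return v < r.lo || v > r.hi;
}
```

In this model the true edge is [11, 14] and the false edge computed
separately is [15, INT_MAX], whereas the inverted true range also admits
values such as 5 that can never reach the else branch.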

 }

-/* Simplifies COND using checks in front of the entry of the LOOP.  Just very
-   simplish (sufficient to prevent us from duplicating loop in unswitching
-   unnecessarily).  */
+static void
+combine_range (predicate_vector &predicate_path, tree index, irange &path_range)
+{

Unless I misread the patch, combine_range is missing a function comment.

+evaluate_control_stmt_using_entry_checks (gimple *stmt,
+ predicate_vector &predicate_path)
 {

so this function for ranger does combine all predicates on the predicate_path
but for the symbolic evaluation it looks at the last predicate only?  I guess
that's because other predicate simplification opportunities are applied already,
correct?  But doesn't that mean that the combine_range could be done once
when we build the predicate vector instead of for each stmt?  I'm just
looking at the difference in treating both cases - if we first analyze the whole
unswitching path (including all recursions) then we'd have to simplify all
opportunities at once, so iterating over all predicates would make sense.
Still, merging ranges when pushing to the predicate vector rather than
for each stmt wou

Re: [PATCH] PR fortran/101565 - ICE in gfc_simplify_image_index, at fortran/simplify.c:8234


Hello,

On 29/11/2021 at 22:31, Harald Anlauf via Fortran wrote:
> Dear all,
>
> a trivial one: we need to check the type of the SUB argument
> to the coarray IMAGE_INDEX intrinsic.  It has to be an array
> of type integer.
>
> Patch by Steve Kargl.

I hope at some point he’ll finally come to a working git workflow.

> Regtested on x86_64-pc-linux-gnu.  OK for mainline?

Sure.

Re: [PATCH 1v2/3][vect] Add main vectorized loop unrolling

2021-11-30 Thread Andre Vieira (lists) via Gcc-patches




On 25/11/2021 12:46, Richard Biener wrote:
> Oops, my fault, yes, it does.  I would suggest to refactor things so
> that the mode_i = first_loop_i case is there only once.  I also wonder
> if all the argument about starting at 0 doesn't apply to the
> not unrolled LOOP_VINFO_EPIL_USING_PARTIAL_VECTORS_P as well?  So
> what's the reason to differ here?  So in the end I'd just change
> the existing
>
>    if (LOOP_VINFO_EPIL_USING_PARTIAL_VECTORS_P (first_loop_vinfo))
>      {
>
> to
>
>    if (LOOP_VINFO_EPIL_USING_PARTIAL_VECTORS_P (first_loop_vinfo)
>        || first_loop_vinfo->suggested_unroll_factor > 1)
>      {
>
> and maybe revisit this when we have an actual testcase showing that
> doing sth else has a positive effect?
>
> Thanks,
> Richard.


So I had a quick chat with Richard Sandiford and he is suggesting 
resetting mode_i to 0 for all cases.


He pointed out that for some tunings the SVE mode might come after the 
NEON mode, which means that even for not-unrolled loop_vinfos we could 
end up with a suboptimal choice of mode for the epilogue. I.e. it could 
be that we pick V16QI for main vectorization, but that's VNx16QI + 1 in 
the array, so we'd not try VNx16QI for the epilogue.


This would simplify the mode selection, by simply restarting at
mode_i = 0 in all epilogue cases. Is that something you'd be OK with?
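
As a rough sketch of the ordering problem, assuming a hypothetical
three-entry mode list in which the SVE mode comes before the NEON one
(the array contents and both function names below are invented for
illustration and are not the vectorizer's actual data structures):

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-in for the target's ordered list of
   vector modes; the real list comes from the backend and its tuning.  */
static const char *const modes[] = { "VNx16QI", "V16QI", "V8QI" };
enum { NUM_MODES = sizeof modes / sizeof modes[0] };

/* Collect the modes an epilogue search would visit when it starts at
   index START.  Returns the number of candidates written to OUT.  */
size_t
epilogue_candidates (size_t start, const char *out[NUM_MODES])
{
  size_t n = 0;
  for (size_t i = start; i < NUM_MODES; i++)
    out[n++] = modes[i];
  return n;
}

/* Return nonzero if NAME is among the first N entries of CANDS.  */
int
contains_mode (const char *cands[], size_t n, const char *name)
{
  for (size_t i = 0; i < n; i++)
    if (strcmp (cands[i], name) == 0)
      return 1;
  return 0;
}
```

If the main loop picked V16QI (index 1 in this toy list), a search that
resumes from that index never considers VNx16QI, while a search that
restarts at 0 does.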


Regards,
Andre

Re: [PATCH] PR fortran/103473 - [11/12 Regression] ICE in simplify_minmaxloc_nodim, at fortran/simplify.c:5287


On 29/11/2021 at 23:01, Harald Anlauf via Fortran wrote:
> Dear all,
>
> another trivial and obvious one, discovered by Gerhard.
>
> We can have a NULL pointer dereference simplifying MINLOC/MAXLOC
> on an array that was not properly declared.
>
> OK for mainline / affected 11-branch after regtesting completes?

Yes, fine as well.

I have the impression that there are quite a number of bugs of this
kind, and that maybe we could take a more systematic approach to avoid
simplifying anything that already contains errors.

Thanks.

Re: [PATCH] [og10] Fix goacc/routine-4-extern.c test

Hi!

On 2020-07-28T10:44:29+0200, I wrote:
> On 2020-07-26T14:05:32+0100, Kwok Cheung Yeung  wrote:
>> On 24/07/2020 8:27 am, Thomas Schwinge wrote:
>>> [proposed patch] however completely defeats what we're intending to test 
>>> here, which
>>> is to "Test invalid intra-routine parallelism".  The same problem has
>>> been introduced in og10 commit 6a0b5806b24bfdefe0b0f3ccbcc51299e5195dca
>>> "Various OpenACC reduction enhancements - test cases" for
>>> 'gcc/testsuite/c-c++-common/goacc/routine-4.c', which throughout changed:
>>>
>>>  -#pragma acc loop gang reduction (+:red) // { dg-error "disallowed by 
>>> containing routine" }
>>>  +#pragma acc loop seq reduction (+:red)
>>>
>>> Please revert that, and instead replace 'reduction (+:red)' with a
>>> different "dummy loop operation" (just an empty loop body?), and in the
>>> commit log state that this should've been included in the respective og10
>>> commit adding the "gang reduction on an orphan loop" checking.
>>
>> I have reverted all the previous changes and replaced the orphan loop gang
>> reductions with empty loops as suggested, and checked that the tests now 
>> pass.
>>
>> Is this version okay for OG10?
>
> Yes, thanks.

... which I've now adapted and pushed to master branch in
commit a83a07557085f6da83c63e86c1cd2e719a39b8b2
"Fix c-c++-common/goacc/routine-4.c and
c-c++-common/goacc/routine-4-extern.c testcases", see attached.


Regards,
 Thomas


-
Siemens Electronic Design Automation GmbH; Address: Arnulfstraße 201, 80634
München; limited liability company; Managing directors: Thomas
Heurung, Frank Thürauf; Registered office: München; Commercial register:
München, HRB 106955
From a83a07557085f6da83c63e86c1cd2e719a39b8b2 Mon Sep 17 00:00:00 2001
From: Kwok Cheung Yeung 
Date: Tue, 28 Jul 2020 05:41:14 -0700
Subject: [PATCH] Fix c-c++-common/goacc/routine-4.c and
 c-c++-common/goacc/routine-4-extern.c testcases

... in preparation for checks that we're introducing for OpenACC gang
reductions on orphan loops.

	gcc/testsuite/
	* c-c++-common/goacc/routine-4.c (seq, vector, worker, gang):
	Remove loop reductions.
	* c-c++-common/goacc/routine-4-extern.c (seq, vector, worker, gang):
	Likewise.

Co-Authored-By: Thomas Schwinge 
---
 .../c-c++-common/goacc/routine-4-extern.c | 72 +--
 gcc/testsuite/c-c++-common/goacc/routine-4.c  | 72 +--
 2 files changed, 64 insertions(+), 80 deletions(-)

diff --git a/gcc/testsuite/c-c++-common/goacc/routine-4-extern.c b/gcc/testsuite/c-c++-common/goacc/routine-4-extern.c
index ec21db1c319..ec4475818ad 100644
--- a/gcc/testsuite/c-c++-common/goacc/routine-4-extern.c
+++ b/gcc/testsuite/c-c++-common/goacc/routine-4-extern.c
@@ -26,23 +26,21 @@ void seq (void)
   extern_vector ();  /* { dg-error "routine call uses" } */
   extern_seq ();
 
-  int red;
-
-#pragma acc loop reduction (+:red) // { dg-warning "insufficient partitioning" }
+#pragma acc loop // { dg-warning "insufficient partitioning" }
   for (int i = 0; i < 10; i++)
-red ++;
+;
 
-#pragma acc loop gang reduction (+:red) // { dg-error "disallowed by containing routine" }
+#pragma acc loop gang // { dg-error "disallowed by containing routine" }
   for (int i = 0; i < 10; i++)
-red ++;
+;
 
-#pragma acc loop worker reduction (+:red) // { dg-error "disallowed by containing routine" }
+#pragma acc loop worker // { dg-error "disallowed by containing routine" }
   for (int i = 0; i < 10; i++)
-red ++;
+;
 
-#pragma acc loop vector reduction (+:red) // { dg-error "disallowed by containing routine" }
+#pragma acc loop vector // { dg-error "disallowed by containing routine" }
   for (int i = 0; i < 10; i++)
-red ++;
+;
 }
 
 void vector (void)
@@ -52,23 +50,21 @@ void vector (void)
   extern_vector ();
   extern_seq ();
 
-  int red;
-
-#pragma acc loop reduction (+:red)
+#pragma acc loop
   for (int i = 0; i < 10; i++)
-red ++;
+;
 
-#pragma acc loop gang reduction (+:red) // { dg-error "disallowed by containing routine" }
+#pragma acc loop gang // { dg-error "disallowed by containing routine" }
   for (int i = 0; i < 10; i++)
-red ++;
+;
 
-#pragma acc loop worker reduction (+:red) // { dg-error "disallowed by containing routine" }
+#pragma acc loop worker // { dg-error "disallowed by containing routine" }
   for (int i = 0; i < 10; i++)
-red ++;
+;
 
-#pragma acc loop vector reduction (+:red)
+#pragma acc loop vector
   for (int i = 0; i < 10; i++)
-red ++;
+;
 }
 
 void worker (void)
@@ -78,23 +74,21 @@ void worker (void)
   extern_vector ();
   extern_seq ();
 
-  int red;
-
-#pragma acc loop reduction (+:red)
+#pragma acc loop
   for (int i = 0; i < 10; i++)
-red ++;
+;
 
-#pragma acc loop gang reduction (+:red) // { dg-error "disallowed by containing routine" }
+#pragma acc loop gang // { dg-error "disallowed by containing routine" }
   for (int i = 0; i < 10; i++)
-red ++;
+;
 
-#pragma acc loop worker reduc

Re: [gomp4] Make OpenACC orphan gang reductions errors

Hi!

On 2017-05-01T18:27:59-0700, Cesar Philippidis  wrote:
> This patch promotes all OpenACC gang reductions on orphan loops as
> errors. Accord to the spec, orphan loops are those which are not
> lexically nested inside an OpenACC parallel or kernels regions. I.e.,
> acc loops inside acc routines.
>
> At first I thought this could be a warning because the gang reduction
> finalizer uses an atomic update. However, because there is no
> synchronization between gangs, there is way to guarantee that reduction
> will have completed once a single gang entity returns from the acc
> routine call.
>
> I've applied this patch to gomp-4_0-branch.

... which I've now adapted (with several things to be fixed in follow-up
commits) and pushed to master branch in
commit 2b7dac2c0dcb087da9e4018943c023c0678234a3
"Make OpenACC orphan gang reductions errors", see attached.


Regards,
 Thomas


From 2b7dac2c0dcb087da9e4018943c023c0678234a3 Mon Sep 17 00:00:00 2001
From: Cesar Philippidis 
Date: Mon, 1 May 2017 18:27:59 -0700
Subject: [PATCH] Make OpenACC orphan gang reductions errors

This patch promotes all OpenACC gang reductions on orphan loops to
errors.  According to the spec, orphan loops are those which are not
lexically nested inside an OpenACC parallel or kernels region, i.e.,
acc loops inside acc routines.

At first I thought this could be a warning because the gang reduction
finalizer uses an atomic update.  However, because there is no
synchronization between gangs, there is no way to guarantee that the
reduction will have completed once a single gang entity returns from
the acc routine call.
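
To make the terminology concrete, a minimal sketch follows.  The function
names orphan_sum and enclosed_sum are hypothetical; the second function
mirrors the shape of the non-orphaned loops exercised in the
orphan-reductions tests.  Built without -fopenacc the pragmas are ignored
and both functions are ordinary C.

```c
/* The 'routine' directive makes this function callable from device
   code.  The loop inside it is an orphan loop: no parallel or kernels
   construct lexically encloses it.  */
#pragma acc routine gang
int
orphan_sum (const int *a, int n)
{
  int sum = 0;
  /* Writing '#pragma acc loop gang reduction (+:sum)' here is the case
     diagnosed as "gang reduction on an orphan loop".  A worker-level
     reduction is used instead; this assumes, based on the check shown
     in the diff, that only the gang clause triggers the error.  */
#pragma acc loop worker reduction (+:sum)
  for (int i = 0; i < n; i++)
    sum += a[i];
  return sum;
}

/* The same reduction at gang level, but lexically nested inside a
   parallel construct, so the loop is not orphaned.  */
int
enclosed_sum (const int *a, int n)
{
  int sum = 0;
#pragma acc parallel copyin (a[0:n])
  {
#pragma acc loop gang reduction (+:sum)
    for (int i = 0; i < n; i++)
      sum += a[i];
  }
  return sum;
}
```

In the second form the end of the parallel region gives a point at which
the reduction result is complete, which is what the orphan form lacks.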

	gcc/c/
	* c-typeck.c (c_finish_omp_clauses): Emit an error on orphan
	OpenACC gang reductions.
	gcc/cp/
	* semantics.c (finish_omp_clauses): Emit an error on orphan
	OpenACC gang reductions.
	gcc/fortran/
	* openmp.c (oacc_is_parallel, oacc_is_kernels): New 'static'
	functions.
	(resolve_oacc_loop_blocks): Emit an error on orphan OpenACC gang
	reductions.
	gcc/
	* omp-general.h (enum oacc_loop_flags): Add OLF_REDUCTION enum.
	* omp-low.c (lower_oacc_head_mark): Use it to mark OpenACC
	reductions.
	* omp-offload.c (oacc_loop_auto_partitions): Don't assign gang
	level parallelism to orphan reductions.
	gcc/testsuite/
	* c-c++-common/goacc/nested-reductions-1-routine.c: Adjust.
	* c-c++-common/goacc/nested-reductions-2-routine.c: Likewise.
	* gcc.dg/goacc/loop-processing-1.c: Likewise.
	* gfortran.dg/goacc/nested-reductions-1-routine.f90: Likewise.
	* gfortran.dg/goacc/nested-reductions-2-routine.f90: Likewise.
	* c-c++-common/goacc/orphan-reductions-1.c: New test.
	* c-c++-common/goacc/orphan-reductions-2.c: New test.
	* gfortran.dg/goacc/orphan-reductions-1.f90: New test.
	* gfortran.dg/goacc/orphan-reductions-2.f90: New test.
	libgomp/
	* testsuite/libgomp.oacc-fortran/parallel-dims.f90: Temporarily
	skip.

Co-Authored-By: Thomas Schwinge 
---
 gcc/c/c-typeck.c  |   8 +
 gcc/cp/semantics.c|   8 +
 gcc/fortran/openmp.c  |  24 ++
 gcc/omp-general.h |   3 +-
 gcc/omp-low.c |   4 +
 gcc/omp-offload.c |   7 +
 .../goacc/nested-reductions-1-routine.c   |   3 +
 .../goacc/nested-reductions-2-routine.c   |   9 +
 .../c-c++-common/goacc/orphan-reductions-1.c  |  56 +
 .../c-c++-common/goacc/orphan-reductions-2.c  |  87 
 .../gcc.dg/goacc/loop-processing-1.c  |   2 +-
 .../goacc/nested-reductions-1-routine.f90 |   3 +
 .../goacc/nested-reductions-2-routine.f90 |   9 +
 .../gfortran.dg/goacc/orphan-reductions-1.f90 | 206 ++
 .../gfortran.dg/goacc/orphan-reductions-2.f90 |  89 
 .../libgomp.oacc-fortran/parallel-dims.f90|   1 +
 16 files changed, 517 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/c-c++-common/goacc/orphan-reductions-1.c
 create mode 100644 gcc/testsuite/c-c++-common/goacc/orphan-reductions-2.c
 create mode 100644 gcc/testsuite/gfortran.dg/goacc/orphan-reductions-1.f90
 create mode 100644 gcc/testsuite/gfortran.dg/goacc/orphan-reductions-2.f90

diff --git a/gcc/c/c-typeck.c b/gcc/c/c-typeck.c
index 7524304f2bd..a025740e618 100644
--- a/gcc/c/c-typeck.c
+++ b/gcc/c/c-typeck.c
@@ -14135,6 +14135,14 @@ c_finish_omp_clauses (tree clauses, enum c_omp_region_type ort)
 	  goto check_dup_generic;
 
 	case OMP_CLAUSE_REDUCTION:
+	  if (ort == C_ORT_ACC && oacc_get_fn_attrib (current_function_decl)
+	  && omp_find_clause (clauses, OMP_CLAUSE_GANG))
+	{
+	  error_at (OMP_CLAUSE_LOCATION (c),
+			"gang reduction on an orphan loop");
+	  remove = true;
+	  break;
+	}
 	  if (reduction_seen == 0)
 	reduction_seen =

Re: [gomp4] Make OpenACC orphan gang reductions errors

Hi!

On 2017-05-01T18:27:59-0700, Cesar Philippidis  wrote:
> --- a/gcc/fortran/openmp.c
> +++ b/gcc/fortran/openmp.c
> @@ -6090,6 +6090,18 @@ resolve_oacc_loop_blocks (gfc_code *code)

> +  if (code->op == EXEC_OACC_LOOP
> +  && code->ext.omp_clauses->lists[OMP_LIST_REDUCTION]
> +  && code->ext.omp_clauses->gang)
> +{
> +  for (c = omp_current_ctx; c; c = c->previous)
> + if (!oacc_is_loop (c->code))
> +   break;
> +  if (c == NULL || !(oacc_is_parallel (c->code)
> +  || oacc_is_kernels (c->code)))
> +  gfc_error ("gang reduction on an orphan loop at %L", &code->loc);
> +}

To avoid erroneous diagnostics, we also need to handle the OpenACC
'serial' construct here.  I've adapted Kwok's relevant patch, and pushed
to master branch commit f1a58ab0db20c0862e8b5039bd448fc8c9799cac
"[OpenACC] Allow gang reductions inside serial constructs", see attached.


Regards,
 Thomas


From f1a58ab0db20c0862e8b5039bd448fc8c9799cac Mon Sep 17 00:00:00 2001
From: Kwok Cheung Yeung 
Date: Fri, 13 Mar 2020 11:13:49 -0700
Subject: [PATCH] [OpenACC] Allow gang reductions inside serial constructs

... fixing a regression introduced in the preceding
commit 2b7dac2c0dcb087da9e4018943c023c0678234a3
"Make OpenACC orphan gang reductions errors".

	gcc/fortran/
	* openmp.c (oacc_is_serial, oacc_is_parallel_or_serial): New.
	(resolve_oacc_loop_blocks): Use oacc_is_parallel_or_serial instead of
	oacc_is_parallel.
	libgomp/
	* testsuite/libgomp.oacc-fortran/parallel-dims.f90: Remove
	temporary skip.

Co-Authored-By: Thomas Schwinge 
---
 gcc/fortran/openmp.c   | 14 +-
 .../libgomp.oacc-fortran/parallel-dims.f90 |  1 -
 2 files changed, 13 insertions(+), 2 deletions(-)

diff --git a/gcc/fortran/openmp.c b/gcc/fortran/openmp.c
index 4fa38691c01..b4100577e51 100644
--- a/gcc/fortran/openmp.c
+++ b/gcc/fortran/openmp.c
@@ -8334,6 +8334,18 @@ oacc_is_kernels (gfc_code *code)
   return code->op == EXEC_OACC_KERNELS || code->op == EXEC_OACC_KERNELS_LOOP;
 }
 
+static bool
+oacc_is_serial (gfc_code *code)
+{
+  return code->op == EXEC_OACC_SERIAL || code->op == EXEC_OACC_SERIAL_LOOP;
+}
+
+static bool
+oacc_is_parallel_or_serial (gfc_code *code)
+{
+  return oacc_is_parallel (code) || oacc_is_serial (code);
+}
+
 static gfc_statement
 omp_code_to_statement (gfc_code *code)
 {
@@ -8644,7 +8656,7 @@ resolve_oacc_loop_blocks (gfc_code *code)
   for (c = omp_current_ctx; c; c = c->previous)
 	if (!oacc_is_loop (c->code))
 	  break;
-  if (c == NULL || !(oacc_is_parallel (c->code)
+  if (c == NULL || !(oacc_is_parallel_or_serial (c->code)
 			 || oacc_is_kernels (c->code)))
 	gfc_error ("gang reduction on an orphan loop at %L", &code->loc);
 }
diff --git a/libgomp/testsuite/libgomp.oacc-fortran/parallel-dims.f90 b/libgomp/testsuite/libgomp.oacc-fortran/parallel-dims.f90
index 80d64030414..fad3d9d6a80 100644
--- a/libgomp/testsuite/libgomp.oacc-fortran/parallel-dims.f90
+++ b/libgomp/testsuite/libgomp.oacc-fortran/parallel-dims.f90
@@ -3,7 +3,6 @@
 
 ! { dg-additional-sources parallel-dims-aux.c }
 ! { dg-do run }
-  ! { dg-skip-if TODO { *-*-* } }
 ! { dg-prune-output "command-line option '-fintrinsic-modules-path=.*' is valid for Fortran but not for C" }
 
 ! { dg-additional-options "-fopt-info-note-omp" }
-- 
2.33.0

Re: [PATCH] [og10] libgomp, Fortran: Fix OpenACC "gang reduction on an orphan loop" error message

Hi!

On 2020-07-20T12:26:48+0200, Frederik Harwath  wrote:
> Thomas Schwinge  writes:
>>> Can I include the patch in OG10?

> This has been delayed a bit by my vacation, but I have now committed
> the patch.

>> (Ideally, we'd also test 'serial' construct in addition to 'kernels',
>> 'parallel'

> I have included the test cases for the "serial construct".

I've adapted the remaining relevant changes and pushed to master branch
commit c4f4c60457d1657cbd72015de3d818eb6462a0e9
'Re OpenACC "gang reduction on an orphan loop" error message', see
attached.


Regards,
 Thomas


From c4f4c60457d1657cbd72015de3d818eb6462a0e9 Mon Sep 17 00:00:00 2001
From: Frederik Harwath 
Date: Mon, 20 Jul 2020 11:24:21 +0200
Subject: [PATCH] Re OpenACC "gang reduction on an orphan loop" error message

Follow-up to preceding commit 2b7dac2c0dcb087da9e4018943c023c0678234a3
"Make OpenACC orphan gang reductions errors".

	gcc/fortran/
	* openmp.c (oacc_is_parallel_or_serial): Evolve into...
	(oacc_is_compute_construct): ... this function.
	(resolve_oacc_loop_blocks): Use "oacc_is_compute_construct"
	instead of "oacc_is_parallel_or_serial" for checking that a
	loop is not orphaned.
	gcc/testsuite/
	* gfortran.dg/goacc/orphan-reductions-3.f90: New test
	verifying that the "gang reduction on an orphan loop" error message
	is not emitted for non-orphaned loops.
	* c-c++-common/goacc/orphan-reductions-3.c: Likewise for C and C++.

Co-Authored-By: Thomas Schwinge 
---
 gcc/fortran/openmp.c  |   9 +-
 .../c-c++-common/goacc/orphan-reductions-3.c  | 102 ++
 .../gfortran.dg/goacc/orphan-reductions-3.f90 |  89 +++
 3 files changed, 196 insertions(+), 4 deletions(-)
 create mode 100644 gcc/testsuite/c-c++-common/goacc/orphan-reductions-3.c
 create mode 100644 gcc/testsuite/gfortran.dg/goacc/orphan-reductions-3.f90

diff --git a/gcc/fortran/openmp.c b/gcc/fortran/openmp.c
index b4100577e51..7950c7fb43d 100644
--- a/gcc/fortran/openmp.c
+++ b/gcc/fortran/openmp.c
@@ -8341,9 +8341,11 @@ oacc_is_serial (gfc_code *code)
 }
 
 static bool
-oacc_is_parallel_or_serial (gfc_code *code)
+oacc_is_compute_construct (gfc_code *code)
 {
-  return oacc_is_parallel (code) || oacc_is_serial (code);
+  return (oacc_is_parallel (code)
+	  || oacc_is_kernels (code)
+	  || oacc_is_serial (code));
 }
 
 static gfc_statement
@@ -8656,8 +8658,7 @@ resolve_oacc_loop_blocks (gfc_code *code)
   for (c = omp_current_ctx; c; c = c->previous)
 	if (!oacc_is_loop (c->code))
 	  break;
-  if (c == NULL || !(oacc_is_parallel_or_serial (c->code)
-			 || oacc_is_kernels (c->code)))
+  if (c == NULL || !(oacc_is_compute_construct (c->code)))
 	gfc_error ("gang reduction on an orphan loop at %L", &code->loc);
 }
 
diff --git a/gcc/testsuite/c-c++-common/goacc/orphan-reductions-3.c b/gcc/testsuite/c-c++-common/goacc/orphan-reductions-3.c
new file mode 100644
index 000..cd8ad274ebb
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/goacc/orphan-reductions-3.c
@@ -0,0 +1,102 @@
+/* Verify that the error message for gang reduction on orphaned OpenACC loops
+   is not reported for non-orphaned loops. */
+
+/* { dg-additional-options "-Wopenacc-parallelism" } */
+
+int
+kernels (int n)
+{
+  int i, s1 = 0, s2 = 0;
+#pragma acc kernels
+  {
+#pragma acc loop gang reduction(+:s1) /* { dg-bogus "gang reduction on an orphan loop" } */
+  for (i = 0; i < n; i++)
+s1 = s1 + 2;
+
+#pragma acc loop gang reduction(+:s2) /* { dg-bogus "gang reduction on an orphan loop" } */
+  for (i = 0; i < n; i++)
+s2 = s2 + 2;
+  }
+  return s1 + s2;
+}
+
+int
+parallel (int n)
+{
+  int i, s1 = 0, s2 = 0;
+#pragma acc parallel
+  {
+#pragma acc loop gang reduction(+:s1) /* { dg-bogus "gang reduction on an orphan loop" } */
+  for (i = 0; i < n; i++)
+s1 = s1 + 2;
+
+#pragma acc loop gang reduction(+:s2) /* { dg-bogus "gang reduction on an orphan loop" } */
+  for (i = 0; i < n; i++)
+s2 = s2 + 2;
+  }
+  return s1 + s2;
+}
+
+int
+serial (int n)
+{
+  int i, s1 = 0, s2 = 0;
+#pragma acc serial /* { dg-warning "region contains gang partitioned code but is not gang partitioned" } */
+  {
+#pragma acc loop gang reduction(+:s1) /* { dg-bogus "gang reduction on an orphan loop" } */
+  for (i = 0; i < n; i++)
+s1 = s1 + 2;
+
+#pragma acc loop gang reduction(+:s2) /* { dg-bogus "gang reduction on an orphan loop" } */
+  for (i = 0; i < n; i++)
+s2 = s2 + 2;
+  }
+  return s1 + s2;
+}
+
+int
+serial_combined (int n)
+{
+  int i, s1 = 0, s2 = 0;
+#pragma acc serial loop gang reduction(+:s1) /* { dg-bogus "gang reduction on an orphan loop" } */
+  /* { dg-warning "region contains gang partitioned code but is not gang partitioned" "" { target *-*-* } .-1 } */
+  for (i

Re: [gomp4] Make OpenACC orphan gang reductions errors

Hi!

On 2017-05-01T18:27:59-0700, Cesar Philippidis  wrote:
>   gcc/c/
>   * c-typeck.c (c_finish_omp_clauses): Emit an error on orphan OpenACC
>   gang reductions.
>
>   gcc/cp/
>   * semantics.c (finish_omp_clauses): Emit an error on orphan OpenACC
>   gang reductions.
>
>   gcc/fortran/
>   * openmp.c (resolve_oacc_loop_blocks): Emit an error on orphan OpenACC
>   gang reductions.

As a follow-up, I've pushed to master branch
commit 77d24d43644909852998043335b5a0e09d1e8f02
'Consolidate OpenACC "gang reduction on an orphan loop" checking',
see attached.


Regards,
 Thomas


From 77d24d43644909852998043335b5a0e09d1e8f02 Mon Sep 17 00:00:00 2001
From: Thomas Schwinge 
Date: Fri, 26 Nov 2021 12:29:26 +0100
Subject: [PATCH] Consolidate OpenACC "gang reduction on an orphan loop"
 checking

No need to implement separately in all front ends what we may implement in the
middle end, once for all.

Follow-up to preceding commit 2b7dac2c0dcb087da9e4018943c023c0678234a3
"Make OpenACC orphan gang reductions errors".

	gcc/
	* omp-offload.c (oacc_loop_process): Implement "gang reduction on
	an orphan loop" checking.
	gcc/c/
	* c-typeck.c (c_finish_omp_clauses): Remove "gang reduction on an
	orphan loop" checking.
	gcc/cp/
	* semantics.c (finish_omp_clauses): Remove "gang reduction on an
	orphan loop" checking.
	gcc/fortran/
	* openmp.c (resolve_oacc_loop_blocks): Remove "gang reduction on
	an orphan loop" checking.
	(oacc_is_parallel, oacc_is_kernels, oacc_is_serial)
	(oacc_is_compute_construct): Remove.
	gcc/testsuite/
	* gfortran.dg/goacc/orphan-reductions-1.f90: Adjust.
---
 gcc/c/c-typeck.c  |  8 
 gcc/cp/semantics.c|  8 
 gcc/fortran/openmp.c  | 37 ---
 gcc/omp-offload.c | 20 --
 .../gfortran.dg/goacc/orphan-reductions-1.f90 |  8 ++--
 5 files changed, 20 insertions(+), 61 deletions(-)

diff --git a/gcc/c/c-typeck.c b/gcc/c/c-typeck.c
index a025740e618..7524304f2bd 100644
--- a/gcc/c/c-typeck.c
+++ b/gcc/c/c-typeck.c
@@ -14135,14 +14135,6 @@ c_finish_omp_clauses (tree clauses, enum c_omp_region_type ort)
 	  goto check_dup_generic;
 
 	case OMP_CLAUSE_REDUCTION:
-	  if (ort == C_ORT_ACC && oacc_get_fn_attrib (current_function_decl)
-	  && omp_find_clause (clauses, OMP_CLAUSE_GANG))
-	{
-	  error_at (OMP_CLAUSE_LOCATION (c),
-			"gang reduction on an orphan loop");
-	  remove = true;
-	  break;
-	}
 	  if (reduction_seen == 0)
 	reduction_seen = OMP_CLAUSE_REDUCTION_INSCAN (c) ? -1 : 1;
 	  else if (reduction_seen != -2
diff --git a/gcc/cp/semantics.c b/gcc/cp/semantics.c
index c84caf43251..cd1956497f8 100644
--- a/gcc/cp/semantics.c
+++ b/gcc/cp/semantics.c
@@ -6667,14 +6667,6 @@ finish_omp_clauses (tree clauses, enum c_omp_region_type ort)
 	  field_ok = ((ort & C_ORT_OMP_DECLARE_SIMD) == C_ORT_OMP);
 	  goto check_dup_generic;
 	case OMP_CLAUSE_REDUCTION:
-	  if (ort == C_ORT_ACC && oacc_get_fn_attrib (current_function_decl)
-	  && omp_find_clause (clauses, OMP_CLAUSE_GANG))
-	{
-	  error_at (OMP_CLAUSE_LOCATION (c),
-			"gang reduction on an orphan loop");
-	  remove = true;
-	  break;
-	}
 	  if (reduction_seen == 0)
 	reduction_seen = OMP_CLAUSE_REDUCTION_INSCAN (c) ? -1 : 1;
 	  else if (reduction_seen != -2
diff --git a/gcc/fortran/openmp.c b/gcc/fortran/openmp.c
index 7950c7fb43d..d120be81467 100644
--- a/gcc/fortran/openmp.c
+++ b/gcc/fortran/openmp.c
@@ -8322,31 +8322,6 @@ resolve_omp_do (gfc_code *code)
 }
 }
 
-static bool
-oacc_is_parallel (gfc_code *code)
-{
-  return code->op == EXEC_OACC_PARALLEL || code->op == EXEC_OACC_PARALLEL_LOOP;
-}
-
-static bool
-oacc_is_kernels (gfc_code *code)
-{
-  return code->op == EXEC_OACC_KERNELS || code->op == EXEC_OACC_KERNELS_LOOP;
-}
-
-static bool
-oacc_is_serial (gfc_code *code)
-{
-  return code->op == EXEC_OACC_SERIAL || code->op == EXEC_OACC_SERIAL_LOOP;
-}
-
-static bool
-oacc_is_compute_construct (gfc_code *code)
-{
-  return (oacc_is_parallel (code)
-	  || oacc_is_kernels (code)
-	  || oacc_is_serial (code));
-}
 
 static gfc_statement
 omp_code_to_statement (gfc_code *code)
@@ -8650,18 +8625,6 @@ resolve_oacc_loop_blocks (gfc_code *code)
   if (!oacc_is_loop (code))
 return;
 
-  if (code->op == EXEC_OACC_LOOP
-  && code->ext.omp_clauses->lists[OMP_LIST_REDUCTION]
-  && code->ext.omp_clauses->gang)
-{
-  fortran_omp_context *c;
-  for (c = omp_current_ctx; c; c = c->previous)
-	if (!oacc_is_loop (c->code))
-	  break;
-  if (c == NULL || !(oacc_is_compute_construct (c->code)))
-	gfc_error ("gang reduction on an orphan loop at %L", &code->

[PATCH] c++, v2: Allow indeterminate unsigned char or std::byte in bit_cast - P1272R4

On Mon, Nov 29, 2021 at 10:25:58PM -0500, Jason Merrill wrote:
> It's a DR.  Really, it was intended to be part of C++20; at the Cologne
> meeting in 2019 CWG thought byteswap was going to make C++20, so this bugfix
> could go in as part of that paper.

Ok, changed to be done unconditionally now.

> Also, allowing indeterminate values that are never read was in C++20
> (P1331).

Reading P1331R2 again, I'm still puzzled.
Our current behavior (both before and after this patch) is that if
some variable is scalar and has indeterminate value or if an aggregate
variable has some members (possibly nested) with indeterminate values,
in constexpr contexts we allow copying those into other vars of the
same type (e.g. the testcases in the patch below test mere copying
of the whole structures or unsigned char result of __builtin_bit_cast),
but we reject if we actually use them in some other way (e.g. try to
read a member from a variable that has that member indeterminate,
see e.g. bit-cast14.C (f5, f6, f7)), even when reading it into an
unsigned char variable.

Then there is P1331R2 which makes the UB on
"an lvalue-to-rvalue conversion that is applied to an object with
indeterminate value ([basic.indet]);"
but isn't even the
  unsigned char a = __builtin_bit_cast (unsigned char, u);
  unsigned char b = a;
case non-constant then when __builtin_bit_cast returns indeterminate value?
__builtin_bit_cast returns rvalue, so no lvalue-to-rvalue conversion happens
in that case, so supposedly
  unsigned char a = __builtin_bit_cast (unsigned char, u);
is fine, but on
  unsigned char b = a;
a is lvalue and is converted to rvalue.
Similarly
  T t = { 1, 2 };
  S s = __builtin_bit_cast (S, t);
  S u = s;
where S s = __builtin_bit_cast (S, t); could be ok even when some or all
members are indeterminate, but u = s; does lvalue-to-rvalue conversion?

Or there is http://eel.is/c++draft/basic.indet that has quite clear rules
what is and isn't UB and if C++ wanted to go further and allow all those
valid cases in there as constant...

Anyway, I hope this can be dealt with incrementally.

> I think in all of them the result of the cast has (some) indeterminate
> value.  So f1-3 are OK because the indeterminate value has unsigned char
> type and is never used; f4() is non-constant because S::f has
> non-byte-access type and so the new wording says it's undefined.

Ok, implemented the bitfield handling then.

Here is an updated patch, so far lightly tested.

2021-11-30  Jakub Jelinek 

* constexpr.c (clear_uchar_or_std_byte_in_mask): New function.
(cxx_eval_bit_cast): Don't error about padding bits if target
type is unsigned char or std::byte, instead return no clearing
ctor.  Use clear_uchar_or_std_byte_in_mask.

* g++.dg/cpp2a/bit-cast11.C: New test.
* g++.dg/cpp2a/bit-cast12.C: New test.
* g++.dg/cpp2a/bit-cast13.C: New test.
* g++.dg/cpp2a/bit-cast14.C: New test.

--- gcc/cp/constexpr.c.jj   2021-11-30 09:44:46.531607444 +0100
+++ gcc/cp/constexpr.c  2021-11-30 12:20:29.105251443 +0100
@@ -4268,6 +4268,121 @@ check_bit_cast_type (const constexpr_ctx
   return false;
 }

+/* Helper function for cxx_eval_bit_cast.  For unsigned char or
+   std::byte members of CONSTRUCTOR (recursively) if they contain
+   some indeterminate bits (as set in MASK), remove the ctor elts,
+   mark the CONSTRUCTOR as CONSTRUCTOR_NO_CLEARING and clear the
+   bits in MASK.  */
+
+static void
+clear_uchar_or_std_byte_in_mask (location_t loc, tree t, unsigned char *mask)
+{
+  if (TREE_CODE (t) != CONSTRUCTOR)
+return;
+
+  unsigned i, j = 0;
+  tree index, value;
+  FOR_EACH_CONSTRUCTOR_ELT (CONSTRUCTOR_ELTS (t), i, index, value)
+{
+  tree type = TREE_TYPE (value);
+  if (TREE_CODE (TREE_TYPE (t)) != ARRAY_TYPE
+ && DECL_BIT_FIELD_TYPE (index) != NULL_TREE)
+   {
+ if (is_byte_access_type (DECL_BIT_FIELD_TYPE (index))
+ && (TYPE_MAIN_VARIANT (DECL_BIT_FIELD_TYPE (index))
+ != char_type_node))
+   {
+ HOST_WIDE_INT fldsz = TYPE_PRECISION (TREE_TYPE (index));
+ gcc_assert (fldsz != 0);
+ HOST_WIDE_INT pos = int_byte_position (index);
+ HOST_WIDE_INT bpos
+   = tree_to_uhwi (DECL_FIELD_BIT_OFFSET (index));
+ bpos %= BITS_PER_UNIT;
+ HOST_WIDE_INT end
+   = ROUND_UP (bpos + fldsz, BITS_PER_UNIT) / BITS_PER_UNIT;
+ gcc_assert (end == 1 || end == 2);
+ unsigned char *p = mask + pos;
+ unsigned char mask_save[2];
+ mask_save[0] = mask[pos];
+ mask_save[1] = end == 2 ? mask[pos + 1] : 0;
+ if (BYTES_BIG_ENDIAN != WORDS_BIG_ENDIAN)
+   sorry_at (loc, "PDP11 bit-field handling unsupported"
+  " in %qs", "__builtin_bit_cast");
+ else if (BYTES_BIG_ENDIAN)
+   {
+ /* Big endian.  */
+

Re: Gang-level reductions in OpenACC routine

Hi!

On 2020-03-19T17:12:02+, Kwok Cheung Yeung  wrote:
> On 18/03/2020 11:34 pm, Kwok Cheung Yeung wrote:
>> I was looking at the regression in c-c++-common/goacc/nested-reductions.c, 
>> which
>> has the following excess warnings in acc_routine:
>>
>> /scratch/kyeung/openacc/og10/nvidia/src/gcc-og10-branch/gcc/testsuite/c-c++-common/goacc/nested-reductions.c:360:15:
>> warning: insufficient partitioning available to parallelize loop
>> /scratch/kyeung/openacc/og10/nvidia/src/gcc-og10-branch/gcc/testsuite/c-c++-common/goacc/nested-reductions.c:369:17:
>> warning: insufficient partitioning available to parallelize loop
>> /scratch/kyeung/openacc/og10/nvidia/src/gcc-og10-branch/gcc/testsuite/c-c++-common/goacc/nested-reductions.c:375:17:
>> warning: insufficient partitioning available to parallelize loop
>> /scratch/kyeung/openacc/og10/nvidia/src/gcc-og10-branch/gcc/testsuite/c-c++-common/goacc/nested-reductions.c:320:6:
>> warning: region is gang partitioned but does not contain gang partitioned 
>> code
>>
>> It is caused by the following code in the patch 'Make OpenACC orphan
>> gang reductions errors' (originally by Cesar):
>>
>> +  /* Orphan reductions cannot have gang partitioning.  */
>> +  if ((loop->flags & OLF_REDUCTION)
>> + && oacc_get_fn_attrib (current_function_decl)
>> + && !lookup_attribute ("omp target entrypoint",
>> +   DECL_ATTRIBUTES (current_function_decl)))
>> +   this_mask = GOMP_DIM_MASK (GOMP_DIM_WORKER);

Right.  However, that code doesn't implement what the OpenACC
specification actually says.  ;-)

>> The problem is that acc_routine is not declared with 'omp target entrypoint',
>> but it does have '#pragma acc_routine gang' applied to it. From what I
>> understand of the OpenACC spec, this means that the function can be called 
>> from
>> the accelerator, and may contain a loop at the gang-level.

Right.

>> So is allowing gang
>> reductions for functions with '#pragma acc_routine gang' (but not for worker 
>> or
>> vector) the right thing to do here?

No, that's precisely the thing that the compiler needs to diagnose.  See
OpenACC 2.6, 2.9.11. "reduction clause", which places a restriction such
that "The 'reduction' clause may not be specified on an orphaned 'loop'
construct with the 'gang' clause, or on an orphaned 'loop' construct that
will generate gang parallelism in a procedure that is compiled with the
'routine gang' clause."

Cesar apparently read the last part to mean that inside a 'routine gang',
a 'loop reduction' with implicit 'gang' level of parallelism should be
demoted to 'worker' level of parallelism.  But what actually is meant,
simply, is that in such cases we raise the same "gang reduction on an
orphan loop" error diagnostic that we raise for explicit 'gang' level of
parallelism.  (..., and adjust our offending test cases).
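
The restriction quoted above can be illustrated with a minimal sketch. The function names are hypothetical, and the claim about which form is rejected is based only on the wording of the restriction and the diagnostic named in this thread, not on a tested compiler run. Without -fopenacc the pragmas are ignored and the code runs serially.

```c
/* Accepted: the 'loop' construct is nested inside a 'parallel' compute
   construct, so its gang reduction is not orphaned.  */
int
sum_in_parallel (const int *a, int n)
{
  int sum = 0;
#pragma acc parallel copyin(a[0:n]) reduction(+:sum)
  {
#pragma acc loop gang reduction(+:sum)
    for (int i = 0; i < n; i++)
      sum += a[i];
  }
  return sum;
}

/* Expected to be diagnosed with "gang reduction on an orphan loop" when
   built with -fopenacc, because the 'loop' is not inside any compute
   construct (kept in a comment so this file still compiles):

   #pragma acc routine gang
   int
   sum_orphan (const int *a, int n)
   {
     int sum = 0;
   #pragma acc loop gang reduction(+:sum)
     for (int i = 0; i < n; i++)
       sum += a[i];
     return sum;
   }  */
```

The same diagnostic is meant to apply when the orphaned loop has no explicit 'gang' clause but would get gang parallelism implicitly inside a 'routine gang' procedure.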

Now, re your og10 etc. change:

> Allow gang-level reductions in OpenACC routines with gang-level 
> parallelism

>   gcc/
>   * omp-offload.c (oacc_loop_auto_partitions): Check for 'omp declare
>   target' attributes with a gang clause attached.

> --- a/gcc/omp-offload.c
> +++ b/gcc/omp-offload.c
> @@ -1374,14 +1374,32 @@ oacc_loop_auto_partitions (oacc_loop *loop, unsigned 
> outer_mask,

>/* Orphan reductions cannot have gang partitioning.  */
>if ((loop->flags & OLF_REDUCTION)
> -   && oacc_get_fn_attrib (current_function_decl)
> -   && !lookup_attribute ("omp target entrypoint",
> +   && oacc_get_fn_attrib (current_function_decl))
> + {
> +   bool gang_p = false;
> +   tree attr
> +   = lookup_attribute ("omp declare target",
> +   DECL_ATTRIBUTES (current_function_decl));
> +
> +   if (attr)
> + for (tree c = TREE_VALUE (attr); c; c = OMP_CLAUSE_CHAIN (c))
> +   if (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_GANG)
> + {
> +   gang_p = true;
> +   break;
> + }
> +
> +   if (lookup_attribute ("omp target entrypoint",
>   DECL_ATTRIBUTES (current_function_decl)))
> - this_mask = GOMP_DIM_MASK (GOMP_DIM_WORKER);
> + gang_p = true;
> +
> +   if (!gang_p)
> + this_mask = GOMP_DIM_MASK (GOMP_DIM_WORKER);
> + }

..., I don't understand what exactly that is meant to do: as far as I can
tell, we always get 'gang_p == true' from that code?

Instead, I've pushed to master branch
commit 365cd5f9ba812c389b404a53d99ab5dded5097f4 '[OpenACC] Remove
erroneous "Orphan reductions cannot have gang partitioning" handling',
see attached.  This implements the desired "gang reduction on an orphan
loop" error diagnostics also for these implicit 'gang' cases, via the
middle-end checking that I've just added in
commit 77d24d43644909852998043335b5a0e09d1e8f02
'Consolidate OpenACC "gang reduction on an orphan loop" checking'.

Regards
 Thomas


gender-agnostic pronouns

2021-11-30 Thread Nathan Sidwell

I've committed this change to use gender-agnostic pronouns on the 
non-historical web documents.


and if you're upset that Those Are Plural!, assemble this URL and watch 
youtube  /watch?v=46ehrFk-gLk&t=87s at about the 2 minute mark


nathan
--
Nathan Sidwell
From b5a0f250f0f05364a51c331d040d78bf15057884 Mon Sep 17 00:00:00 2001
From: Nathan Sidwell 
Date: Tue, 30 Nov 2021 07:12:44 -0500
Subject: [PATCH] Use gender-agnostic pronouns

Use they/them/their in non-historical documents
---
 htdocs/bugs/management.html | 6 +++---
 htdocs/contribute.html  | 2 +-
 htdocs/develop.html | 2 +-
 htdocs/fortran/index.html   | 4 ++--
 htdocs/gitwrite.html| 2 +-
 5 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/htdocs/bugs/management.html b/htdocs/bugs/management.html
index 18fee991..97ef8299 100644
--- a/htdocs/bugs/management.html
+++ b/htdocs/bugs/management.html
@@ -203,7 +203,7 @@ fixing (the rationale is that a patch will have to go to the newest
 release branch before any other release branch).
 The priority of a regression should initially be set to P3.
 The milestone and the priority can
-be changed by the release manager and his/her delegates.
+be changed by the release manager and their delegates.
 
 If a patch fixing a PR has been submitted, a link
 to the message with the patch should be added to the PR, as well as the
@@ -224,8 +224,8 @@ release versions) should get "minor" severity and the additional keyword
 
 Bugs in component "bootstrap" that refer to older
 releases or snapshots/CVS versions should be put into state "WAITING",
-asking the reporter whether she can still reproduce the problem and to
-report her findings in any case (whether positive or negative).
+asking the reporter whether they can still reproduce the problem and to
+report their findings in any case (whether positive or negative).
 
 
 If the response is "works now", close the report,
diff --git a/htdocs/contribute.html b/htdocs/contribute.html
index 423ce9de..c0223738 100644
--- a/htdocs/contribute.html
+++ b/htdocs/contribute.html
@@ -397,7 +397,7 @@ to point out lack of write access in your initial submission, too.
 
 Announcing Changes (to our Users)
 
-Everything that requires a user to edit his Makefiles or his source code
+Everything that requires a user to edit their Makefiles or source code
 is a good candidate for being mentioned in the release notes.
 
 Larger accomplishments, either as part of a specific project, or long
diff --git a/htdocs/develop.html b/htdocs/develop.html
index 4b1f9468..9880ad42 100644
--- a/htdocs/develop.html
+++ b/htdocs/develop.html
@@ -60,7 +60,7 @@ branch in the publicly accessible GCC development tree.)
 
 
 There is no firm guideline for what constitutes a "major change"
-and what does not.  If a developer is unsure, he or she should ask for
+and what does not.  If a developer is unsure, they should ask for
 guidance on the GCC mailing lists.  In general, a change that has the
 potential to be extremely destabilizing should be done on a branch.
 
diff --git a/htdocs/fortran/index.html b/htdocs/fortran/index.html
index 1d140b3a..1984a297 100644
--- a/htdocs/fortran/index.html
+++ b/htdocs/fortran/index.html
@@ -117,11 +117,11 @@ changes.
 Approval should be necessary for
 patches which don't fall under the obvious rule. So, with the approver list
 put in place, everybody (except maintainers) should still seek approval for 
-his/her patches.  We have found the mutual peer review process really 
+their patches.  We have found the mutual peer review process really 
 works well.
 Patches should only be reviewed by
 people who know the affected parts of the compiler. (i.e. the
-reviewer has to be sure he/she knows stuff well enough to make a
+reviewer has to be sure they know stuff well enough to make a
 good judgment.)
 Large/complicated patches should
 still go by one of our maintainers, or team consensus.
diff --git a/htdocs/gitwrite.html b/htdocs/gitwrite.html
index 92740209..9de5de27 100644
--- a/htdocs/gitwrite.html
+++ b/htdocs/gitwrite.html
@@ -37,7 +37,7 @@ is not sufficient).
 
 If you already have an account on sourceware.org / gcc.gnu.org, ask
 overse...@gcc.gnu.org to add access to the GCC repository.
-Include the name of your sponsor and CC: her.
+Include the name of your sponsor and CC: them.
 Otherwise use this form (https://sourceware.org/cgi-bin/pdw/ps_form.cgi),
 again specifying your sponsor.
-- 
2.31.1

Re: [PATCH] Avoid some -Wunreachable-code-ctrl


On 29/11/2021 at 16:03, Richard Biener via Gcc-patches wrote:

diff --git a/gcc/fortran/frontend-passes.c b/gcc/fortran/frontend-passes.c
index f5ba7cecd54..16ee2afc9c0 100644
--- a/gcc/fortran/frontend-passes.c
+++ b/gcc/fortran/frontend-passes.c
@@ -5229,7 +5229,6 @@ gfc_expr_walker (gfc_expr **e, walk_expr_fn_t exprfn, 
void *data)
  case EXPR_OP:
WALK_SUBEXPR ((*e)->value.op.op1);
WALK_SUBEXPR_TAIL ((*e)->value.op.op2);
-   break;
  case EXPR_FUNCTION:
for (a = (*e)->value.function.actual; a; a = a->next)
  WALK_SUBEXPR (a->expr);


I’m uncomfortable with the above change.
It makes it look like there is a fall through, but there is not.
Maybe inline the macro to make the continue explicit, or use 
WALK_SUBEXPR instead of WALK_SUBEXPR_TAIL and hope the compiler will do 
the tail call optimization.
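
The concern can be reproduced in a small standalone sketch. The macro and types below are hypothetical stand-ins, not the definitions from frontend-passes.c; the only assumption carried over from the discussion is that the tail macro ends in a 'continue'.

```c
#include <stddef.h>

/* Hypothetical expression node; not GCC's gfc_expr.  */
enum expr_kind { KIND_OP, KIND_LEAF };

struct expr
{
  enum expr_kind kind;
  int value;               /* used by KIND_LEAF */
  struct expr *op1, *op2;  /* used by KIND_OP */
};

/* Hypothetical tail macro: re-target the loop variable and jump back to
   the loop head.  Control never reaches the statement after it.  */
#define WALK_TAIL(NODE) e = (NODE); continue

/* Sum all leaf values: recurse into op1, iterate into op2.  */
int
sum_leaves (const struct expr *e)
{
  int total = 0;
  while (e != NULL)
    {
      switch (e->kind)
	{
	case KIND_OP:
	  total += sum_leaves (e->op1);
	  WALK_TAIL (e->op2);
	  /* No 'break' here: it looks like a fall through into
	     KIND_LEAF, but the hidden 'continue' prevents it.  */
	case KIND_LEAF:
	  total += e->value;
	  e = NULL;
	  break;
	}
    }
  return total;
}
```

A reader who does not expand WALK_TAIL sees two adjacent case labels with nothing separating them, which is the readability problem raised above.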


Mikael

[committed] libstdc++: Use gender-agnostic pronoun in docs

I've pushed this change for the libstdc++ docs (should be "their"), but
didn't notice the typo in the changelog, so I'll fix that tomorrow after
the file is regenerated.



libstdc++-v3/ChangeLog:

* doc/xml/manual/debug_mode.xml: Replace "his or her" with "they".
* doc/html/manual/debug_mode_design.html: Regenerate.
---
 libstdc++-v3/doc/html/manual/debug_mode_design.html | 10 +-
 libstdc++-v3/doc/xml/manual/debug_mode.xml  | 10 +-
 2 files changed, 10 insertions(+), 10 deletions(-)

diff --git a/libstdc++-v3/doc/xml/manual/debug_mode.xml 
b/libstdc++-v3/doc/xml/manual/debug_mode.xml
index dbd5c2b7775..988c4a93601 100644
--- a/libstdc++-v3/doc/xml/manual/debug_mode.xml
+++ b/libstdc++-v3/doc/xml/manual/debug_mode.xml
@@ -393,14 +393,14 @@ That alias is deprecated and may be removed in a future 
release.
 less recompilation) but are more complicated to implement than
 the lower-numbered conformance levels.
   
-   Full recompilation: The user must 
recompile his or
-   her entire application and all C++ libraries it depends on,
+   Full recompilation: The user must 
recompile
+   their entire application and all C++ libraries it depends on,
including the C++ standard library that ships with the
compiler. This must be done even if only a small part of the
program can use debugging features.
 
Full user recompilation: The user 
must recompile
-   his or her entire application and all C++ libraries it depends
+   their entire application and all C++ libraries it depends
on, but not the C++ standard library itself. This must be done
even if only a small part of the program can use debugging
features. This can be achieved given a full recompilation
@@ -409,7 +409,7 @@ That alias is deprecated and may be removed in a future 
release.
one, e.g., a multilibs approach.
 
Partial recompilation: The user 
must recompile the
-   parts of his or her application and the C++ libraries it
+   parts of their application and the C++ libraries it
depends on that will use the debugging facilities
directly. This means that any code that uses the debuggable
standard containers would need to be recompiled, but code
@@ -417,7 +417,7 @@ That alias is deprecated and may be removed in a future 
release.
would not have to be recompiled.
 
Per-use recompilation: The user 
must recompile the
-   parts of his or her application and the C++ libraries it
+   parts of their application and the C++ libraries it
depends on where debugging should occur, and any other code
that interacts with those containers. This means that a set of
translation units that accesses a particular standard
-- 
2.31.1

Re: [PATCH] Avoid some -Wunreachable-code-ctrl

On Tue, 30 Nov 2021, Mikael Morin wrote:

> On 29/11/2021 at 16:03, Richard Biener via Gcc-patches wrote:
> > diff --git a/gcc/fortran/frontend-passes.c b/gcc/fortran/frontend-passes.c
> > index f5ba7cecd54..16ee2afc9c0 100644
> > --- a/gcc/fortran/frontend-passes.c
> > +++ b/gcc/fortran/frontend-passes.c
> > @@ -5229,7 +5229,6 @@ gfc_expr_walker (gfc_expr **e, walk_expr_fn_t exprfn,
> > void *data)
> >  case EXPR_OP:
> >WALK_SUBEXPR ((*e)->value.op.op1);
> >WALK_SUBEXPR_TAIL ((*e)->value.op.op2);
> > -   break;
> >  case EXPR_FUNCTION:
> >for (a = (*e)->value.function.actual; a; a = a->next)
> >  WALK_SUBEXPR (a->expr);
> 
> I’m uncomfortable with the above change.
> It makes it look like there is a fall through, but there is not.
> Maybe inline the macro to make the continue explicit, or use WALK_SUBEXPR
> instead of WALK_SUBEXPR_TAIL and hope the compiler will do the tail call
> optimization.

Ah, it follows the style in tree.c:walk_tree_1, where break was used
inconsistently after WALK_SUBTREE_TAIL, which made it the more obvious
thing for me to clean up.  I didn't realize the Fortran FE only had a
single WALK_SUBEXPR_TAIL.

I'm not sure inlining will make the situation clearer; using
WALK_SUBEXPR certainly would, but it might lose the tail call.

Would you accept an additional comment after WALK_SUBEXPR_TAIL like

  case EXPR_OP:
WALK_SUBEXPR ((*e)->value.op.op1);
WALK_SUBEXPR_TAIL ((*e)->value.op.op2);
/* tail-recurse  */

?  Btw, a fallthru would be diagnosed by GCC unless we put

/* Fallthru  */

here.  Maybe renaming WALK_SUBEXPR_TAIL to WALK_SUBEXPR_WITH_CONTINUE
or WALK_SUBEXPR_BY_TAIL_RECURSING or WALK_SUBEXPR_TAILRECURSE would
be more obvious?

Thanks,
Richard.

Re: [PATCH] libcpp: Implement -Wbidi-chars for CVE-2021-42574 [PR103026]

2021-11-30 Thread Marek Polacek via Gcc-patches

On Tue, Nov 30, 2021 at 09:38:57AM +0100, Stephan Bergmann wrote:
> On 15/11/2021 18:28, Marek Polacek via Gcc-patches wrote:
> > On Mon, Nov 08, 2021 at 04:33:43PM -0500, Marek Polacek wrote:
> > > Ping, can we conclude on the name?   IMHO, -Wbidirectional is just fine,
> > > but changing the name is a trivial operation.
> > 
> > Here's a patch with a better name (suggested by Jonathan W.).  Otherwise no
> > changes.
> > 
> > Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?
> > 
> > -- >8 --
> >  From a link below:
> > "An issue was discovered in the Bidirectional Algorithm in the Unicode
> > Specification through 14.0. It permits the visual reordering of
> > characters via control sequences, which can be used to craft source code
> > that renders different logic than the logical ordering of tokens
> > ingested by compilers and interpreters. Adversaries can leverage this to
> > encode source code for compilers accepting Unicode such that targeted
> > vulnerabilities are introduced invisibly to human reviewers."
> > 
> > More info:
> > https://nvd.nist.gov/vuln/detail/CVE-2021-42574
> > https://trojansource.codes/
> > 
> > This is not a compiler bug.  However, to mitigate the problem, this patch
> > implements -Wbidi-chars=[none|unpaired|any] to warn about possibly
> > misleading Unicode bidirectional characters the preprocessor may encounter.
> > 
> > The default is =unpaired, which warns about improperly terminated
> > bidirectional characters; e.g. a LRE without its appertaining PDF.  The
> > level =any warns about any use of bidirectional characters.
> > 
> > This patch handles both UCNs and UTF-8 characters.  UCNs designating
> > bidi characters in identifiers are accepted since r204886.  Then r217144
> > enabled -fextended-identifiers by default.  Extended characters in C/C++
> > identifiers have been accepted since r275979.  However, this patch still
> > warns about mixing UTF-8 and UCN bidi characters; there seems to be no
> > good reason to allow mixing them.
> 
> I wonder what the rationale is to warn about UCNs, like in
> 
> > aText = u"\u202D" + aText;
> 
> (as found in the LibreOffice source code).

Is this line mixing a UCN and a UTF-8?  Or is it just that you're
prepending a LRO to aText?  We warn because the LRO is not "closed"
in the context of its string literal, which was part of the Trojan
source attack.  So "\u202D ... \u202C" would not warn.

I'm not sure what workaround I could offer.  Maybe provide an option not to
warn about UCNs at all, though even that is potentially dangerous -- while
you can see UCNs in the source code, if you print strings containing them,
they won't be visible anymore.
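
To make the "unpaired" notion concrete, here is a minimal sketch of such a check over a UTF-8 string. It is not libcpp's implementation: the function name is hypothetical, and it uses two simple counters (embeddings and overrides closed by PDF, isolates closed by PDI) rather than the full pairing rules of the Unicode Bidirectional Algorithm.

```c
#include <stddef.h>

/* Return nonzero if S (NUL-terminated UTF-8) opens a bidi embedding,
   override, or isolate that is still open at the end of S.  */
int
has_unpaired_bidi (const char *s)
{
  const unsigned char *p = (const unsigned char *) s;
  int embeddings = 0;  /* LRE/RLE/LRO/RLO, closed by PDF (U+202C).  */
  int isolates = 0;    /* LRI/RLI/FSI, closed by PDI (U+2069).  */

  while (*p)
    {
      /* U+202A..U+202E encode as E2 80 AA..AE.  */
      if (p[0] == 0xE2 && p[1] == 0x80 && p[2] >= 0xAA && p[2] <= 0xAE)
	{
	  if (p[2] == 0xAC)
	    {
	      if (embeddings > 0)
		embeddings--;
	    }
	  else
	    embeddings++;
	  p += 3;
	}
      /* U+2066..U+2069 encode as E2 81 A6..A9.  */
      else if (p[0] == 0xE2 && p[1] == 0x81 && p[2] >= 0xA6 && p[2] <= 0xA9)
	{
	  if (p[2] == 0xA9)
	    {
	      if (isolates > 0)
		isolates--;
	    }
	  else
	    isolates++;
	  p += 3;
	}
      else
	p++;
    }
  return embeddings > 0 || isolates > 0;
}
```

Under this sketch a string holding only an LRO, like the LibreOffice literal, counts as unpaired, while the same text followed by a PDF does not.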

Marek

Re: [PATCH] Modify combine pattern by anding a pseudo with its nonzero bits

2021-11-30 Thread David Edelsohn via Gcc-patches

On Tue, Nov 30, 2021 at 3:46 AM HAO CHEN GUI  wrote:
>
> Hi,
>
> This patch modifies the combine pattern with a helper - 
> change_pseudo_and_mask when recog fails. The helper converts a single pseudo 
> to the pseudo and with a mask if the outer operator is IOR/XOR/PLUS and the 
> inner operator is ASHIFT/LSHIFTRT/AND. The conversion helps match shift + ior 
> pattern.
>
> Bootstrapped and tested on powerpc64-linux BE and LE with no regressions. 
> Is this okay for trunk? Any recommendations? Thanks a lot.
>
> ChangeLog
>
> 2021-11-30 Haochen Gui 
>
> gcc/
> * combine.c (change_pseudo_and_mask): New.
> (recog_for_combine): If recog fails, try again with the pattern
> modified by change_pseudo_and_mask.
>
> gcc/testsuite/
> * gcc.target/powerpc/20050603-3.c: Modify the dump check conditions.
> * gcc.target/powerpc/rlwimi-2.c: Likewise.
>
> patch.diff
>
> diff --git a/gcc/combine.c b/gcc/combine.c
> index 03e9a780919..c83c0aceb57 100644
> --- a/gcc/combine.c
> +++ b/gcc/combine.c
> @@ -11539,6 +11539,42 @@ change_zero_ext (rtx pat)
>return changed;
>  }
>
> +/* When the outer code of set_src is IOR/XOR/PLUS and the inner code is
> +   ASHIFT/LSHIFTRT/AND, convert a psuedo to psuedo AND with a mask if its

^^^ spelling mistake in comment: pseudo not psuedo

Thanks, David
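
For readers following the quoted function, the identity it relies on can be shown outside of RTL. This is only a sketch with hypothetical function names: when the upper bits of a value are known to be zero, ANDing it with its nonzero-bits mask does not change the result, but it gives the expression the explicit "shift | (reg & mask)" shape that an insert pattern can match.

```c
#include <stdint.h>

/* REG stands in for a pseudo whose nonzero bits are known to be 0xff.  */
uint32_t
insert_low_byte_plain (uint32_t hi, uint8_t lo)
{
  uint32_t reg = lo;
  return (hi << 8) | reg;
}

/* Same computation with the mask made explicit, as the helper does by
   wrapping the pseudo in an AND with its nonzero bits.  */
uint32_t
insert_low_byte_masked (uint32_t hi, uint8_t lo)
{
  uint32_t reg = lo;
  return (hi << 8) | (reg & 0xffu);
}
```

Both functions return the same value for every input; only the form of the expression differs.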

> +   nonzero_bits is less than its mode mask.  */
> +static bool
> +change_pseudo_and_mask (rtx pat)
> +{
> +  bool changed = false;
> +
> +  rtx src = SET_SRC (pat);
> +  if ((GET_CODE (src) == IOR
> +   || GET_CODE (src) == XOR
> +   || GET_CODE (src) == PLUS)
> +  && (((GET_CODE (XEXP (src, 0)) == ASHIFT
> +   || GET_CODE (XEXP (src, 0)) == LSHIFTRT
> +   || GET_CODE (XEXP (src, 0)) == AND)
> +  && REG_P (XEXP (src, 1)))
> + || ((GET_CODE (XEXP (src, 1)) == ASHIFT
> +  || GET_CODE (XEXP (src, 1)) == LSHIFTRT
> +  || GET_CODE (XEXP (src, 1)) == AND)
> + && REG_P (XEXP (src, 0)))))
> +{
> +  rtx *reg = REG_P (XEXP (src, 0))
> +? &XEXP (SET_SRC (pat), 0)
> +: &XEXP (SET_SRC (pat), 1);
> +  machine_mode mode = GET_MODE (*reg);
> +  unsigned HOST_WIDE_INT nonzero = nonzero_bits (*reg, mode);
> +  if (nonzero < GET_MODE_MASK (mode))
> +   {
> + rtx x = gen_rtx_AND (mode, *reg, GEN_INT (nonzero));
> + SUBST (*reg, x);
> + changed = true;
> +   }
> + }
> +  return changed;
> +}
> +
>  /* Like recog, but we receive the address of a pointer to a new pattern.
> We try to match the rtx that the pointer points to.
> If that fails, we may try to modify or replace the pattern,
> @@ -11586,7 +11622,14 @@ recog_for_combine (rtx *pnewpat, rtx_insn *insn, rtx 
> *pnotes)
> }
> }
>else
> -   changed = change_zero_ext (pat);
> +   {
> + if (change_pseudo_and_mask (pat))
> +   {
> + maybe_swap_commutative_operands (SET_SRC (pat));
> + changed = true;
> +   }
> + changed |= change_zero_ext (pat);
> +   }
>  }
>else if (GET_CODE (pat) == PARALLEL)
>  {
> diff --git a/gcc/testsuite/gcc.target/powerpc/20050603-3.c 
> b/gcc/testsuite/gcc.target/powerpc/20050603-3.c
> index 4017d34f429..e628be11532 100644
> --- a/gcc/testsuite/gcc.target/powerpc/20050603-3.c
> +++ b/gcc/testsuite/gcc.target/powerpc/20050603-3.c
> @@ -12,7 +12,7 @@ void rotins (unsigned int x)
>b.y = (x<<12) | (x>>20);
>  }
>
> -/* { dg-final { scan-assembler-not {\mrlwinm} } } */
> +/* { dg-final { scan-assembler-not {\mrlwinm} { target ilp32 } } } */
>  /* { dg-final { scan-assembler-not {\mrldic} } } */
>  /* { dg-final { scan-assembler-not {\mrot[lr]} } } */
>  /* { dg-final { scan-assembler-not {\ms[lr][wd]} } } */
> diff --git a/gcc/testsuite/gcc.target/powerpc/rlwimi-2.c 
> b/gcc/testsuite/gcc.target/powerpc/rlwimi-2.c
> index bafa371db73..ffb5f9e450f 100644
> --- a/gcc/testsuite/gcc.target/powerpc/rlwimi-2.c
> +++ b/gcc/testsuite/gcc.target/powerpc/rlwimi-2.c
> @@ -2,14 +2,14 @@
>  /* { dg-options "-O2" } */
>
>  /* { dg-final { scan-assembler-times {(?n)^\s+[a-z]} 14121 { target ilp32 } 
> } } */
> -/* { dg-final { scan-assembler-times {(?n)^\s+[a-z]} 20217 { target lp64 } } 
> } */
> +/* { dg-final { scan-assembler-times {(?n)^\s+[a-z]} 21279 { target lp64 } } 
> } */
>  /* { dg-final { scan-assembler-times {(?n)^\s+blr} 6750 } } */
>  /* { dg-final { scan-assembler-times {(?n)^\s+mr} 643 { target ilp32 } } } */
>  /* { dg-final { scan-assembler-times {(?n)^\s+mr} 11 { target lp64 } } } */
>  /* { dg-final { scan-assembler-times {(?n)^\s+rldicl} 7790 { target lp64 } } 
> } */
>
>  /* { dg-final { scan-assembler-times {(?n)^\s+rlwimi} 1692 { target ilp32 } 
> } } */
> -/* { dg-final { scan-assembler-times {(?n)^\s+rlwimi} 1666 { target lp64 } } 
> } */
> +/* { dg-final { scan-assembler-times {(?n)^\s+rlwimi} 1692 { target lp64 } } 
> } */
>
>  /* { dg-fina

[PATCH][pushed] Change if-to-switch-conversion test.

2021-11-30 Thread Martin Liška


Small update of the test-case, approved by Richi.

Martin

PR tree-optimization/103278

gcc/testsuite/ChangeLog:

* gcc.dg/tree-ssa/if-to-switch-5.c: Make the test acceptable by
targets with no jump-tables.
---
 gcc/testsuite/gcc.dg/tree-ssa/if-to-switch-5.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/gcc/testsuite/gcc.dg/tree-ssa/if-to-switch-5.c 
b/gcc/testsuite/gcc.dg/tree-ssa/if-to-switch-5.c
index ceeae908821..54771e64e59 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/if-to-switch-5.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/if-to-switch-5.c
@@ -4,8 +4,8 @@
 int crud (unsigned char c)
 {
   return (((((((((((int) c == 46) || (int) c == 44)
-|| (int) c == 58) || (int) c == 59) || (int) c == 60)
- || (int) c == 62) || (int) c == 34) || (int) c == 92)
+|| (int) c == 58) || (int) c == 60) || (int) c == 62)
+ || (int) c == 64) || (int) c == 34) || (int) c == 92)
   || (int) c == 39) != 0);
 }
 
--

2.34.0

Re: [PATCH 1v2/3][vect] Add main vectorized loop unrolling

On Tue, 30 Nov 2021, Andre Vieira (lists) wrote:

> 
> On 25/11/2021 12:46, Richard Biener wrote:
> > Oops, my fault, yes, it does.  I would suggest to refactor things so
> > that the mode_i = first_loop_i case is there only once.  I also wonder
> > if all the argument about starting at 0 doesn't apply to the
> > not unrolled LOOP_VINFO_EPIL_USING_PARTIAL_VECTORS_P as well?  So
> > what's the reason to differ here?  So in the end I'd just change
> > the existing
> >
> >if (LOOP_VINFO_EPIL_USING_PARTIAL_VECTORS_P (first_loop_vinfo))
> >  {
> >
> > to
> >
> >if (LOOP_VINFO_EPIL_USING_PARTIAL_VECTORS_P (first_loop_vinfo)
> >|| first_loop_vinfo->suggested_unroll_factor > 1)
> >  {
> >
> > and maybe revisit this when we have an actual testcase showing that
> > doing sth else has a positive effect?
> >
> > Thanks,
> > Richard.
> 
> So I had a quick chat with Richard Sandiford and he is suggesting resetting
> mode_i to 0 for all cases.
> 
> He pointed out that for some tunings the SVE mode might come after the NEON
> mode, which means that even for not-unrolled loop_vinfos we could end up with
> a suboptimal choice of mode for the epilogue. I.e. it could be that we pick
> V16QI for main vectorization, but that's VNx16QI + 1 in the array, so we'd not
> try VNx16QI for the epilogue.
> 
> This would simplify the mode-selection cases by simply restarting at
> mode_i = 0 in all epilogue cases. Is that something you'd be OK with?

Works for me with an updated comment.  Even better with showing a
testcase exercising such tuning.
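
The ordering problem described in the quoted text can be sketched with a toy mode list. Everything here is hypothetical (the enum, the usable table, the function name); it only illustrates why continuing the search after the main loop's index can skip an earlier entry.

```c
/* Hypothetical candidate order for a tuning where the SVE mode is
   listed before the Advanced SIMD modes.  */
enum toy_mode { MODE_VNx16QI, MODE_V16QI, MODE_V8QI, N_MODES };

/* Return the first mode at or after START that the caller-supplied
   USABLE table accepts for the epilogue, or -1 if there is none.  */
int
pick_epilogue_mode (const int *usable, int n_modes, int start)
{
  for (int i = start; i < n_modes; i++)
    if (usable[i])
      return i;
  return -1;
}
```

If the main loop picked MODE_V16QI, starting the epilogue search at MODE_V16QI + 1 can only ever return MODE_V8QI, whereas starting at 0 also considers MODE_VNx16QI.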

Richard.

[PATCH] tree-optimization/103489 - fix ICE when bool pattern recog fails

bool pattern recog currently does not handle cycles correctly
and when it fails we can ICE later vectorizing PHIs with
mismatched bool and non-bool vector types.  The following avoids
blindly trusting bool pattern recog here and verifies things
more thoroughly in vectorizable_phi.  A bool pattern recog fix
is for GCC 13.

Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed.

2021-11-30  Richard Biener  

PR tree-optimization/103489
* tree-vect-loop.c (vectorizable_phi): Verify argument
vector type compatibility to mitigate bool pattern recog
bug.

* gcc.dg/torture/pr103489.c: New testcase.
---
 gcc/testsuite/gcc.dg/torture/pr103489.c | 12 
 gcc/tree-vect-loop.c| 18 ++
 2 files changed, 30 insertions(+)
 create mode 100644 gcc/testsuite/gcc.dg/torture/pr103489.c

diff --git a/gcc/testsuite/gcc.dg/torture/pr103489.c 
b/gcc/testsuite/gcc.dg/torture/pr103489.c
new file mode 100644
index 000..cd62623ece2
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/torture/pr103489.c
@@ -0,0 +1,12 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-ftree-vectorize" } */
+
+_Bool a[80];
+short b, f;
+void g(short h[][8][16])
+{
+  for (_Bool c = 0; c < b;)
+for (_Bool d = 0; d < (_Bool)f; d = 1)
+  for (short e = 0; e < 16; e++)
+a[e] = h[b][1][e];
+}
diff --git a/gcc/tree-vect-loop.c b/gcc/tree-vect-loop.c
index 841da78f1fd..7f544ba1fd5 100644
--- a/gcc/tree-vect-loop.c
+++ b/gcc/tree-vect-loop.c
@@ -7846,6 +7846,24 @@ vectorizable_phi (vec_info *,
   "incompatible vector types for invariants\n");
return false;
  }
+   else if (SLP_TREE_DEF_TYPE (child) == vect_internal_def
+&& !useless_type_conversion_p (vectype,
+   SLP_TREE_VECTYPE (child)))
+ {
+   /* With bools we can have mask and non-mask precision vectors,
+  while pattern recog is supposed to guarantee consistency here
+  bugs in it can cause mismatches (PR103489 for example).
+  Deal with them here instead of ICEing later.  */
+   if (dump_enabled_p ())
+ dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
+  "incompatible vector type setup from "
+  "bool pattern detection\n");
+   gcc_checking_assert
+ (VECTOR_BOOLEAN_TYPE_P (SLP_TREE_VECTYPE (child))
+  != VECTOR_BOOLEAN_TYPE_P (vectype));
+   return false;
+ }
+
   /* For single-argument PHIs assume coalescing which means zero cost
 for the scalar and the vector PHIs.  This avoids artificially
 favoring the vector path (but may pessimize it in some cases).  */
-- 
2.31.1

[PATCH] simplify-rtx, v2: Punt on simplify_associative_operation with large operands [PR102356]

On Tue, Nov 30, 2021 at 10:43:28AM +0100, Richard Biener wrote:
> I wonder given we now have 'simplify_context' whether we can
> track a re-association budget we can eat from.  At least your
> code to determine whether the expression is too large is
> quadratic as well (but bound to 64, so just a very large constant
> overhead for an outermost expression of size 63).  We already
> have a mem_depth there,

Makes sense.

> so just have reassoc_times and punt
> if that reaches --param max-simplify-reassoc-times, incrementing
> it each time simplify_associative_operation is entered?

Though, is a --param worth it?  There is IMO no way the 64 limit
can trigger for non-debug insns (I can certainly gather how many times
it triggers when > 20 and in which pass during bootstrap/regtest
to verify).

2021-11-30  Jakub Jelinek  

PR rtl-optimization/102356
* rtl.h (simplify_context): Add assoc_count member.
* simplify-rtx.c (simplify_associative_operation): Don't reassociate
more than 64 times within one outermost simplify_* call.
* dwarf2out.c (mem_loc_descriptor): Optimize binary operation
with both operands the same using DW_OP_dup.

* gcc.dg/pr102356.c: New test.

--- gcc/rtl.h.jj2021-11-02 09:06:05.904396581 +0100
+++ gcc/rtl.h   2021-11-30 14:55:39.701257736 +0100
@@ -3433,6 +3433,10 @@ public:
  inside a MEM than outside.  */
   unsigned int mem_depth = 0;
 
+  /* Tracks number of simplify_associative_operation calls performed during
+ outermost simplify* call.  */
+  unsigned int assoc_count = 0;
+
 private:
   rtx simplify_truncation (machine_mode, rtx, machine_mode);
   rtx simplify_byte_swapping_operation (rtx_code, machine_mode, rtx, rtx);
--- gcc/simplify-rtx.c.jj   2021-11-30 09:44:46.619606170 +0100
+++ gcc/simplify-rtx.c  2021-11-30 14:59:00.251321577 +0100
@@ -2263,6 +2263,16 @@ simplify_context::simplify_associative_o
 {
   rtx tem;
 
+  /* Normally expressions simplified by simplify-rtx.c are combined
+ at most from a few machine instructions and therefore the
+ expressions should be fairly small.  During var-tracking
+ we can see arbitrarily large expressions though and reassociating
+ those can be quadratic, so punt after encountering 64
+ simplify_associative_operation calls during outermost simplify_*
+ call.  */
+  if (++assoc_count >= 64)
+return NULL_RTX;
+
   /* Linearize the operator to the left.  */
   if (GET_CODE (op1) == code)
 {
--- gcc/dwarf2out.c.jj  2021-11-30 09:44:46.568606908 +0100
+++ gcc/dwarf2out.c 2021-11-30 14:53:28.779174490 +0100
@@ -16363,6 +16363,15 @@ mem_loc_descriptor (rtx rtl, machine_mod
 do_binop:
   op0 = mem_loc_descriptor (XEXP (rtl, 0), mode, mem_mode,
VAR_INIT_STATUS_INITIALIZED);
+  if (XEXP (rtl, 0) == XEXP (rtl, 1))
+   {
+ if (op0 == 0)
+   break;
+ mem_loc_result = op0;
+ add_loc_descr (&mem_loc_result, new_loc_descr (DW_OP_dup, 0, 0));
+ add_loc_descr (&mem_loc_result, new_loc_descr (op, 0, 0));
+ break;
+   }
   op1 = mem_loc_descriptor (XEXP (rtl, 1), mode, mem_mode,
VAR_INIT_STATUS_INITIALIZED);
 
--- gcc/testsuite/gcc.dg/pr102356.c.jj  2021-11-30 14:53:28.779174490 +0100
+++ gcc/testsuite/gcc.dg/pr102356.c 2021-11-30 14:53:28.779174490 +0100
@@ -0,0 +1,33 @@
+/* PR rtl-optimization/102356 */
+/* { dg-do compile { target int32plus } } */
+/* { dg-options "-O3 -g" } */
+
+signed char a = 0;
+unsigned char b = 9;
+unsigned long long c = 0xF1FBFC17225F7A57ULL;
+int d = 0x3A6667C6;
+
+unsigned char
+foo (unsigned int x)
+{
+  unsigned int *e = &x;
+  if ((c /= ((0 * (*e *= b)) <= 0)))
+;
+  for (d = 9; d > 2; d -= 2)
+{
+  c = -2;
+  do
+   if ((*e *= *e))
+ {
+   a = 4;
+   do
+ {
+   a -= 3;
+   if ((*e *= *e))
+ b = 9;
+ }
+   while (a > 2);
+ }
+  while (c++);
+}
+}


Jakub
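
For readers unfamiliar with the DW_OP_dup idea in the dwarf2out.c hunk, here is a small sketch of a toy stack machine in C. The `OP_*` names, `struct insn` and `eval` are hypothetical and only mimic the behaviour of DWARF stack operations; the point is that "push x, dup, op" computes the same value as "push x, push x, op" without describing the operand twice.

```c
#include <assert.h>

/* Hypothetical opcodes loosely modelled on DWARF stack operations.  */
enum op { OP_PUSH, OP_DUP, OP_MUL, OP_PLUS };

struct insn
{
  enum op op;
  long arg;   /* Only used by OP_PUSH.  */
};

/* Evaluate N instructions of PROG and return the top of the stack.  */
static long eval (const struct insn *prog, int n)
{
  long stack[16];
  int sp = 0;

  for (int i = 0; i < n; i++)
    switch (prog[i].op)
      {
      case OP_PUSH:
        assert (sp < 16);
        stack[sp++] = prog[i].arg;
        break;
      case OP_DUP:
        /* Duplicate the top entry instead of recomputing it.  */
        assert (sp > 0 && sp < 16);
        stack[sp] = stack[sp - 1];
        sp++;
        break;
      case OP_MUL:
        assert (sp >= 2);
        sp--;
        stack[sp - 1] *= stack[sp];
        break;
      case OP_PLUS:
        assert (sp >= 2);
        sp--;
        stack[sp - 1] += stack[sp];
        break;
      }
  assert (sp > 0);
  return stack[sp - 1];
}
```

In the real location expression the saving matters more than in this toy, because the repeated operand can itself be a large sub-expression rather than a single push.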

Re: [PATCH] simplify-rtx, v2: Punt on simplify_associative_operation with large operands [PR102356]

On Tue, 30 Nov 2021, Jakub Jelinek wrote:

> On Tue, Nov 30, 2021 at 10:43:28AM +0100, Richard Biener wrote:
> > I wonder given we now have 'simplify_context' whether we can
> > track a re-association budget we can eat from.  At least your
> > code to determine whether the expression is too large is
> > quadratic as well (but bound to 64, so just a very large constant
> > overhead for an outermost expression of size 63).  We already
> > have a mem_depth there,
> 
> Makes sense.
> 
> > so just have reassoc_times and punt
> > if that reaches --param max-simplify-reassoc-times, incrementing
> > it each time simplify_associative_operation is entered?
> 
> Though, is a --param worth it?  There is IMO no way the 64 limit
> can trigger for non-debug insns (I can certainly gather how many times
> it triggers when > 20 and in which pass during bootstrap/regtest
> to verify).

Probably not - but maybe use a (static) const unsigned int max_assoc_count
in the class then?

OK either way I guess.
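
A rough sketch of the budget with a named limit, written in plain C for brevity. `struct simplify_ctx`, `MAX_ASSOC_COUNT` and `assoc_budget_ok` are hypothetical names; in the C++ class the limit would presumably be a static const member as suggested above.

```c
/* Hypothetical named limit replacing the literal 64.  */
enum { MAX_ASSOC_COUNT = 64 };

/* Hypothetical stand-in for the per-simplification context.  */
struct simplify_ctx
{
  unsigned int assoc_count;
};

/* Return nonzero while the reassociation budget is not exhausted.
   Mirrors the "++assoc_count >= 64" test from the patch: the counter
   is bumped on every entry and never reset within one outermost call.  */
static int assoc_budget_ok (struct simplify_ctx *ctx)
{
  return ++ctx->assoc_count < MAX_ASSOC_COUNT;
}
```

With this formulation the first 63 entries proceed and every later entry punts, which matches the `>=` comparison in the posted hunk.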

Thanks,
Richard.

> 2021-11-30  Jakub Jelinek  
> 
>   PR rtl-optimization/102356
>   * rtl.h (simplify_context): Add assoc_count member.
>   * simplify-rtx.c (simplify_associative_operation): Don't reassociate
>   more than 64 times within one outermost simplify_* call.
>   * dwarf2out.c (mem_loc_descriptor): Optimize binary operation
>   with both operands the same using DW_OP_dup.
> 
>   * gcc.dg/pr102356.c: New test.
> 
> --- gcc/rtl.h.jj  2021-11-02 09:06:05.904396581 +0100
> +++ gcc/rtl.h 2021-11-30 14:55:39.701257736 +0100
> @@ -3433,6 +3433,10 @@ public:
>   inside a MEM than outside.  */
>unsigned int mem_depth = 0;
>  
> +  /* Tracks number of simplify_associative_operation calls performed during
> + outermost simplify* call.  */
> +  unsigned int assoc_count = 0;
> +
>  private:
>rtx simplify_truncation (machine_mode, rtx, machine_mode);
>rtx simplify_byte_swapping_operation (rtx_code, machine_mode, rtx, rtx);
> --- gcc/simplify-rtx.c.jj 2021-11-30 09:44:46.619606170 +0100
> +++ gcc/simplify-rtx.c2021-11-30 14:59:00.251321577 +0100
> @@ -2263,6 +2263,16 @@ simplify_context::simplify_associative_o
>  {
>rtx tem;
>  
> +  /* Normally expressions simplified by simplify-rtx.c are combined
> + at most from a few machine instructions and therefore the
> + expressions should be fairly small.  During var-tracking
> + we can see arbitrarily large expressions though and reassociating
> + those can be quadratic, so punt after encountering 64
> + simplify_associative_operation calls during outermost simplify_*
> + call.  */
> +  if (++assoc_count >= 64)
> +return NULL_RTX;
> +
>/* Linearize the operator to the left.  */
>if (GET_CODE (op1) == code)
>  {
> --- gcc/dwarf2out.c.jj2021-11-30 09:44:46.568606908 +0100
> +++ gcc/dwarf2out.c   2021-11-30 14:53:28.779174490 +0100
> @@ -16363,6 +16363,15 @@ mem_loc_descriptor (rtx rtl, machine_mod
>  do_binop:
>op0 = mem_loc_descriptor (XEXP (rtl, 0), mode, mem_mode,
>   VAR_INIT_STATUS_INITIALIZED);
> +  if (XEXP (rtl, 0) == XEXP (rtl, 1))
> + {
> +   if (op0 == 0)
> + break;
> +   mem_loc_result = op0;
> +   add_loc_descr (&mem_loc_result, new_loc_descr (DW_OP_dup, 0, 0));
> +   add_loc_descr (&mem_loc_result, new_loc_descr (op, 0, 0));
> +   break;
> + }
>op1 = mem_loc_descriptor (XEXP (rtl, 1), mode, mem_mode,
>   VAR_INIT_STATUS_INITIALIZED);
>  
> --- gcc/testsuite/gcc.dg/pr102356.c.jj2021-11-30 14:53:28.779174490 
> +0100
> +++ gcc/testsuite/gcc.dg/pr102356.c   2021-11-30 14:53:28.779174490 +0100
> @@ -0,0 +1,33 @@
> +/* PR rtl-optimization/102356 */
> +/* { dg-do compile { target int32plus } } */
> +/* { dg-options "-O3 -g" } */
> +
> +signed char a = 0;
> +unsigned char b = 9;
> +unsigned long long c = 0xF1FBFC17225F7A57ULL;
> +int d = 0x3A6667C6;
> +
> +unsigned char
> +foo (unsigned int x)
> +{
> +  unsigned int *e = &x;
> +  if ((c /= ((0 * (*e *= b)) <= 0)))
> +;
> +  for (d = 9; d > 2; d -= 2)
> +{
> +  c = -2;
> +  do
> + if ((*e *= *e))
> +   {
> + a = 4;
> + do
> +   {
> + a -= 3;
> + if ((*e *= *e))
> +   b = 9;
> +   }
> + while (a > 2);
> +   }
> +  while (c++);
> +}
> +}
> 
> 
>   Jakub
> 
> 

-- 
Richard Biener 
SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
Germany; GF: Ivo Totev; HRB 36809 (AG Nuernberg)

[PATCH] libcpp, v2: Fix up #__VA_OPT__ handling [PR103415]

On Mon, Nov 29, 2021 at 07:28:10PM -0500, Jason Merrill wrote:
> Please add some of this explanation to the "paste any tokens" comment in the
> code.

Ok.

> > + while (rhs->flags & PASTE_LEFT);
> > + if ((flags & PREV_WHITE)
> > + && (token->flags & PREV_WHITE) == 0)
> > +   const_cast<cpp_token *>(token)->flags
> > + |= PREV_WHITE;
> 
> Hmm, shouldn't paste_tokens handle copying PREV_WHITE?

Copying PREV_FALLTHROUGH there fixes the new Wimplicit-fallthrough-38.c
testcase.  I couldn't find a case where copying PREV_WHITE makes an
observable difference outside of __VA_OPT__, e.g.
#define F(x) #x
#define G(x) F(x)
#define H G({a##b)
#define I G({ a##b)
const char *h = H;
const char *i = I;
results in "{ab" and "{ ab" both before and after the patch.  But copying it
in paste_tokens looks cleaner...
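
The macros above can be turned into a small self-checking C fragment. The definitions are taken from the message; the expected strings are the ones reported there, so this is only a sketch of how one might check the stringification of a pasted token, not a testsuite entry.

```c
#include <string.h>

#define F(x) #x
#define G(x) F(x)
#define H G({a##b)
#define I G({ a##b)

/* a##b is pasted into the single token ab before stringification; only
   the whitespace in front of the pasted token differs between H and I.  */
static const char *h = H;
static const char *i = I;
```

Comparing `h` against `"{ab"` and `i` against `"{ ab"` with `strcmp` shows whether the whitespace in front of the pasted token is preserved.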

2021-11-30  Jakub Jelinek  

PR preprocessor/103415
libcpp/
* macro.c (stringify_arg): Remove va_opt argument and va_opt handling.
(paste_tokens): On successful paste or in PREV_WHITE and
PREV_FALLTHROUGH flags from the *plhs token to the new token.
(replace_args): Adjust stringify_arg callers.  For #__VA_OPT__,
perform token pasting in a separate loop before stringify_arg call.
gcc/testsuite/
* c-c++-common/cpp/va-opt-8.c: New test.
* c-c++-common/Wimplicit-fallthrough-38.c: New test.

--- libcpp/macro.c.jj   2021-11-26 10:09:50.278020239 +0100
+++ libcpp/macro.c  2021-11-30 14:05:25.274132482 +0100
@@ -295,7 +295,7 @@ static cpp_context *next_context (cpp_re
 static const cpp_token *padding_token (cpp_reader *, const cpp_token *);
 static const cpp_token *new_string_token (cpp_reader *, uchar *, unsigned int);
 static const cpp_token *stringify_arg (cpp_reader *, const cpp_token **,
-  unsigned int, bool);
+  unsigned int);
 static void paste_all_tokens (cpp_reader *, const cpp_token *);
 static bool paste_tokens (cpp_reader *, location_t,
  const cpp_token **, const cpp_token *);
@@ -834,8 +834,7 @@ cpp_quote_string (uchar *dest, const uch
 /* Convert a token sequence FIRST to FIRST+COUNT-1 to a single string token
according to the rules of the ISO C #-operator.  */
 static const cpp_token *
-stringify_arg (cpp_reader *pfile, const cpp_token **first, unsigned int count,
-  bool va_opt)
+stringify_arg (cpp_reader *pfile, const cpp_token **first, unsigned int count)
 {
   unsigned char *dest;
   unsigned int i, escape_it, backslash_count = 0;
@@ -852,24 +851,6 @@ stringify_arg (cpp_reader *pfile, const
 {
   const cpp_token *token = first[i];
 
-  if (va_opt && (token->flags & PASTE_LEFT))
-   {
- location_t virt_loc = pfile->invocation_location;
- const cpp_token *rhs;
- do
-   {
- if (i == count)
-   abort ();
- rhs = first[++i];
- if (!paste_tokens (pfile, virt_loc, &token, rhs))
-   {
- --i;
- break;
-   }
-   }
- while (rhs->flags & PASTE_LEFT);
-   }
-
   if (token->type == CPP_PADDING)
{
  if (source == NULL
@@ -1003,6 +984,7 @@ paste_tokens (cpp_reader *pfile, locatio
   return false;
 }
 
+  lhs->flags |= (*plhs)->flags & (PREV_WHITE | PREV_FALLTHROUGH);
   *plhs = lhs;
   _cpp_pop_buffer (pfile);
   return true;
@@ -1945,8 +1927,7 @@ replace_args (cpp_reader *pfile, cpp_has
if (src->flags & STRINGIFY_ARG)
  {
if (!arg->stringified)
- arg->stringified = stringify_arg (pfile, arg->first, arg->count,
-   false);
+ arg->stringified = stringify_arg (pfile, arg->first, arg->count);
  }
else if ((src->flags & PASTE_LEFT)
 || (src != macro->exp.tokens && (src[-1].flags & PASTE_LEFT)))
@@ -2066,11 +2047,46 @@ replace_args (cpp_reader *pfile, cpp_has
{
  unsigned int count
= start ? paste_flag - start : tokens_buff_count (buff);
- const cpp_token *t
-   = stringify_arg (pfile,
-start ? start + 1
-: (const cpp_token **) (buff->base),
-count, true);
+ const cpp_token **first
+   = start ? start + 1
+   : (const cpp_token **) (buff->base);
+ unsigned int i, j;
+
+ /* Paste any tokens that need to be pasted before calling
+stringify_arg, because stringify_arg uses pfile->u_buff
+which paste_tokens can use as well.  */
+ for (i = 0, j = 0; i < count; i++, j++)
+   {
+

Re: [PATCH] Avoid some -Wunreachable-code-ctrl


On 30/11/2021 14:25, Richard Biener wrote:
> On Tue, 30 Nov 2021, Mikael Morin wrote:
>
>> On 29/11/2021 at 16:03, Richard Biener via Gcc-patches wrote:
>>> diff --git a/gcc/fortran/frontend-passes.c b/gcc/fortran/frontend-passes.c
>>> index f5ba7cecd54..16ee2afc9c0 100644
>>> --- a/gcc/fortran/frontend-passes.c
>>> +++ b/gcc/fortran/frontend-passes.c
>>> @@ -5229,7 +5229,6 @@ gfc_expr_walker (gfc_expr **e, walk_expr_fn_t exprfn,
>>> void *data)
>>>   case EXPR_OP:
>>> WALK_SUBEXPR ((*e)->value.op.op1);
>>> WALK_SUBEXPR_TAIL ((*e)->value.op.op2);
>>> -   break;
>>>   case EXPR_FUNCTION:
>>> for (a = (*e)->value.function.actual; a; a = a->next)
>>>   WALK_SUBEXPR (a->expr);
>>
>> I’m uncomfortable with the above change.
>> It makes it look like there is a fall through, but there is not.
>> Maybe inline the macro to make the continue explicit, or use WALK_SUBEXPR
>> instead of WALK_SUBEXPR_TAIL and hope the compiler will do the tail call
>> optimization.
>
> Ah, it follows the style in tree.c:walk_tree_1 where break was used
> inconsistently after WALK_SUBTREE_TAIL which was then more obvious
> to me to clean up.  I didn't realize the fortran FE only had a
> single WALK_SUBEXPR_TAIL.
>
> I'm not sure inlining will make the situation more clear, for
> sure using WALK_SUBEXPR would but it might lose the tailcall.
>
> Would you accept an additional comment after WALK_SUBEXPR_TAIL like
>
>    case EXPR_OP:
>      WALK_SUBEXPR ((*e)->value.op.op1);
>      WALK_SUBEXPR_TAIL ((*e)->value.op.op2);
>      /* tail-recurse  */
>
My preference would be a gcc_unreachable() or something similar, but I
understand it would get a warning as well?

Without a better idea, I’m fine with an even more explicit comment:

/* No fallthru because of the tail recursion above.  */

> ?  Btw, a fallthru would be diagnosed by GCC unless we put
>
>  /* Fallthru  */
>
> here.
Sure, but my main concern was misreading from programmers (including
me), which is not diagnosed by compilers.

> Maybe renaming WALK_SUBEXPR_TAIL to WALK_SUBEXPR_WITH_CONTINUE
> or WALK_SUBEXPR_BY_TAIL_RECURSING or WALK_SUBEXPR_TAILRECURSE would
> be more obvious?
>
I think the comment above would be enough.

Thanks.

[PATCH] libsanitizer: Use SSE to save and restore XMM registers

2021-11-30, H.J. Lu via Gcc-patches

Use SSE, instead of AVX, to save and restore XMM registers to support
processors without AVX.  The affected code has been unused upstream since

https://github.com/llvm/llvm-project/commit/66d4ce7e26a5

and will be removed in

https://reviews.llvm.org/D112604

This fixes

FAIL: g++.dg/tsan/pthread_cond_clockwait.C   -O0  execution test
FAIL: g++.dg/tsan/pthread_cond_clockwait.C   -O2  execution test

on machines without AVX.

PR sanitizer/103466
* tsan/tsan_rtl_amd64.S (__tsan_trace_switch_thunk): Replace
vmovdqu with movdqu.
(__tsan_report_race_thunk): Likewise.
---
 libsanitizer/tsan/tsan_rtl_amd64.S | 128 ++---
 1 file changed, 64 insertions(+), 64 deletions(-)

diff --git a/libsanitizer/tsan/tsan_rtl_amd64.S 
b/libsanitizer/tsan/tsan_rtl_amd64.S
index 632b19d1815..c15b01e49e5 100644
--- a/libsanitizer/tsan/tsan_rtl_amd64.S
+++ b/libsanitizer/tsan/tsan_rtl_amd64.S
@@ -45,22 +45,22 @@ ASM_SYMBOL(__tsan_trace_switch_thunk):
   # All XMM registers are caller-saved.
   sub $0x100, %rsp
   CFI_ADJUST_CFA_OFFSET(0x100)
-  vmovdqu %xmm0, 0x0(%rsp)
-  vmovdqu %xmm1, 0x10(%rsp)
-  vmovdqu %xmm2, 0x20(%rsp)
-  vmovdqu %xmm3, 0x30(%rsp)
-  vmovdqu %xmm4, 0x40(%rsp)
-  vmovdqu %xmm5, 0x50(%rsp)
-  vmovdqu %xmm6, 0x60(%rsp)
-  vmovdqu %xmm7, 0x70(%rsp)
-  vmovdqu %xmm8, 0x80(%rsp)
-  vmovdqu %xmm9, 0x90(%rsp)
-  vmovdqu %xmm10, 0xa0(%rsp)
-  vmovdqu %xmm11, 0xb0(%rsp)
-  vmovdqu %xmm12, 0xc0(%rsp)
-  vmovdqu %xmm13, 0xd0(%rsp)
-  vmovdqu %xmm14, 0xe0(%rsp)
-  vmovdqu %xmm15, 0xf0(%rsp)
+  movdqu %xmm0, 0x0(%rsp)
+  movdqu %xmm1, 0x10(%rsp)
+  movdqu %xmm2, 0x20(%rsp)
+  movdqu %xmm3, 0x30(%rsp)
+  movdqu %xmm4, 0x40(%rsp)
+  movdqu %xmm5, 0x50(%rsp)
+  movdqu %xmm6, 0x60(%rsp)
+  movdqu %xmm7, 0x70(%rsp)
+  movdqu %xmm8, 0x80(%rsp)
+  movdqu %xmm9, 0x90(%rsp)
+  movdqu %xmm10, 0xa0(%rsp)
+  movdqu %xmm11, 0xb0(%rsp)
+  movdqu %xmm12, 0xc0(%rsp)
+  movdqu %xmm13, 0xd0(%rsp)
+  movdqu %xmm14, 0xe0(%rsp)
+  movdqu %xmm15, 0xf0(%rsp)
   # Align stack frame.
   push %rbx  # non-scratch
   CFI_ADJUST_CFA_OFFSET(8)
@@ -78,22 +78,22 @@ ASM_SYMBOL(__tsan_trace_switch_thunk):
   pop %rbx
   CFI_ADJUST_CFA_OFFSET(-8)
   # Restore scratch registers.
-  vmovdqu 0x0(%rsp), %xmm0
-  vmovdqu 0x10(%rsp), %xmm1
-  vmovdqu 0x20(%rsp), %xmm2
-  vmovdqu 0x30(%rsp), %xmm3
-  vmovdqu 0x40(%rsp), %xmm4
-  vmovdqu 0x50(%rsp), %xmm5
-  vmovdqu 0x60(%rsp), %xmm6
-  vmovdqu 0x70(%rsp), %xmm7
-  vmovdqu 0x80(%rsp), %xmm8
-  vmovdqu 0x90(%rsp), %xmm9
-  vmovdqu 0xa0(%rsp), %xmm10
-  vmovdqu 0xb0(%rsp), %xmm11
-  vmovdqu 0xc0(%rsp), %xmm12
-  vmovdqu 0xd0(%rsp), %xmm13
-  vmovdqu 0xe0(%rsp), %xmm14
-  vmovdqu 0xf0(%rsp), %xmm15
+  movdqu 0x0(%rsp), %xmm0
+  movdqu 0x10(%rsp), %xmm1
+  movdqu 0x20(%rsp), %xmm2
+  movdqu 0x30(%rsp), %xmm3
+  movdqu 0x40(%rsp), %xmm4
+  movdqu 0x50(%rsp), %xmm5
+  movdqu 0x60(%rsp), %xmm6
+  movdqu 0x70(%rsp), %xmm7
+  movdqu 0x80(%rsp), %xmm8
+  movdqu 0x90(%rsp), %xmm9
+  movdqu 0xa0(%rsp), %xmm10
+  movdqu 0xb0(%rsp), %xmm11
+  movdqu 0xc0(%rsp), %xmm12
+  movdqu 0xd0(%rsp), %xmm13
+  movdqu 0xe0(%rsp), %xmm14
+  movdqu 0xf0(%rsp), %xmm15
   add $0x100, %rsp
   CFI_ADJUST_CFA_OFFSET(-0x100)
   pop %r11
@@ -163,22 +163,22 @@ ASM_SYMBOL(__tsan_report_race_thunk):
   # All XMM registers are caller-saved.
   sub $0x100, %rsp
   CFI_ADJUST_CFA_OFFSET(0x100)
-  vmovdqu %xmm0, 0x0(%rsp)
-  vmovdqu %xmm1, 0x10(%rsp)
-  vmovdqu %xmm2, 0x20(%rsp)
-  vmovdqu %xmm3, 0x30(%rsp)
-  vmovdqu %xmm4, 0x40(%rsp)
-  vmovdqu %xmm5, 0x50(%rsp)
-  vmovdqu %xmm6, 0x60(%rsp)
-  vmovdqu %xmm7, 0x70(%rsp)
-  vmovdqu %xmm8, 0x80(%rsp)
-  vmovdqu %xmm9, 0x90(%rsp)
-  vmovdqu %xmm10, 0xa0(%rsp)
-  vmovdqu %xmm11, 0xb0(%rsp)
-  vmovdqu %xmm12, 0xc0(%rsp)
-  vmovdqu %xmm13, 0xd0(%rsp)
-  vmovdqu %xmm14, 0xe0(%rsp)
-  vmovdqu %xmm15, 0xf0(%rsp)
+  movdqu %xmm0, 0x0(%rsp)
+  movdqu %xmm1, 0x10(%rsp)
+  movdqu %xmm2, 0x20(%rsp)
+  movdqu %xmm3, 0x30(%rsp)
+  movdqu %xmm4, 0x40(%rsp)
+  movdqu %xmm5, 0x50(%rsp)
+  movdqu %xmm6, 0x60(%rsp)
+  movdqu %xmm7, 0x70(%rsp)
+  movdqu %xmm8, 0x80(%rsp)
+  movdqu %xmm9, 0x90(%rsp)
+  movdqu %xmm10, 0xa0(%rsp)
+  movdqu %xmm11, 0xb0(%rsp)
+  movdqu %xmm12, 0xc0(%rsp)
+  movdqu %xmm13, 0xd0(%rsp)
+  movdqu %xmm14, 0xe0(%rsp)
+  movdqu %xmm15, 0xf0(%rsp)
   # Align stack frame.
   push %rbx  # non-scratch
   CFI_ADJUST_CFA_OFFSET(8)
@@ -196,22 +196,22 @@ ASM_SYMBOL(__tsan_report_race_thunk):
   pop %rbx
   CFI_ADJUST_CFA_OFFSET(-8)
   # Restore scratch registers.
-  vmovdqu 0x0(%rsp), %xmm0
-  vmovdqu 0x10(%rsp), %xmm1
-  vmovdqu 0x20(%rsp), %xmm2
-  vmovdqu 0x30(%rsp), %xmm3
-  vmovdqu 0x40(%rsp), %xmm4
-  vmovdqu 0x50(%rsp), %xmm5
-  vmovdqu 0x60(%rsp), %xmm6
-  vmovdqu 0x70(%rsp), %xmm7
-  vmovdqu 0x80(%rsp), %xmm8
-  vmovdqu 0x90(%rsp), %xmm9
-  vmovdqu 0xa0(%rsp), %xmm10
-  vmovdqu 0xb0(%rsp), %xmm11
-  vmovdqu 0xc0(%rsp), %xmm12
-  vmovdqu 0xd0(%rsp), %xmm13
-  vmovdqu 0xe0(%rsp), %xmm14
-  vmovdqu 0xf0(%rsp), %xmm15
+  movdqu 0x0(%rsp), %xmm0
+  movd

[PATCH] ipa-sra: Check also ECF_LOOPING_CONST_OR_PURE when evaluating calls

2021-11-30, Martin Jambor

Hi,

in PR 103267 Honza found out that IPA-SRA does not look at
ECF_LOOPING_CONST_OR_PURE when evaluating if a call can have side
effects.  Fixed with this patch.  The testcase infinitely loops in a
const function, so it would not make a good addition to the testsuite.

Bootstrapped and tested on x86_64-linux.  OK for trunk?

Thanks,

Martin


gcc/ChangeLog:

2021-11-29  Martin Jambor  

PR ipa/103267
* ipa-sra.c (scan_function): Also check ECF_LOOPING_CONST_OR_PURE flag.
---
 gcc/ipa-sra.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/gcc/ipa-sra.c b/gcc/ipa-sra.c
index cb0e30507a1..12ccd049552 100644
--- a/gcc/ipa-sra.c
+++ b/gcc/ipa-sra.c
@@ -1925,7 +1925,8 @@ scan_function (cgraph_node *node, struct function *fun)
if (lhs)
  scan_expr_access (lhs, stmt, ISRA_CTX_STORE, bb);
int flags = gimple_call_flags (stmt);
-   if ((flags & (ECF_CONST | ECF_PURE)) == 0)
+   if (((flags & (ECF_CONST | ECF_PURE)) == 0)
+   || (flags & ECF_LOOPING_CONST_OR_PURE))
  bitmap_set_bit (final_bbs, bb->index);
  }
  break;
-- 
2.33.1
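
The condition in the hunk above can be read as a standalone predicate. The sketch below uses the same flag names, but the bit values are made up for illustration (GCC defines the real ECF_* values elsewhere) and `call_may_have_side_effects` is a hypothetical helper, not a function in ipa-sra.c.

```c
#include <stdbool.h>

/* Illustrative bit values only; not GCC's actual definitions.  */
enum
{
  ECF_CONST = 1 << 0,
  ECF_PURE = 1 << 1,
  ECF_LOOPING_CONST_OR_PURE = 1 << 2
};

/* A call is treated as possibly having side effects if it is neither
   const nor pure, or if it is const/pure but may loop forever.  */
static bool call_may_have_side_effects (int flags)
{
  return (flags & (ECF_CONST | ECF_PURE)) == 0
         || (flags & ECF_LOOPING_CONST_OR_PURE) != 0;
}
```

A plain pure call yields false, while a const call that also carries the looping flag yields true, which is the case the old check missed.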

Re: [PATCH] Avoid some -Wunreachable-code-ctrl

On Tue, 30 Nov 2021, Mikael Morin wrote:

> On 30/11/2021 14:25, Richard Biener wrote:
> > On Tue, 30 Nov 2021, Mikael Morin wrote:
> > 
> >> On 29/11/2021 at 16:03, Richard Biener via Gcc-patches wrote:
> >>> diff --git a/gcc/fortran/frontend-passes.c b/gcc/fortran/frontend-passes.c
> >>> index f5ba7cecd54..16ee2afc9c0 100644
> >>> --- a/gcc/fortran/frontend-passes.c
> >>> +++ b/gcc/fortran/frontend-passes.c
> >>> @@ -5229,7 +5229,6 @@ gfc_expr_walker (gfc_expr **e, walk_expr_fn_t
> >>> exprfn,
> >>> void *data)
> >>>   case EXPR_OP:
> >>> WALK_SUBEXPR ((*e)->value.op.op1);
> >>> WALK_SUBEXPR_TAIL ((*e)->value.op.op2);
> >>> - break;
> >>>   case EXPR_FUNCTION:
> >>> for (a = (*e)->value.function.actual; a; a = a->next)
> >>>   WALK_SUBEXPR (a->expr);
> >>
> >> I’m uncomfortable with the above change.
> >> It makes it look like there is a fall through, but there is not.
> >> Maybe inline the macro to make the continue explicit, or use WALK_SUBEXPR
> >> instead of WALK_SUBEXPR_TAIL and hope the compiler will do the tail call
> >> optimization.
> > 
> > Ah, it follows the style in tree.c:walk_tree_1 where break was used
> > inconsistently after WALK_SUBTREE_TAIL which was then more obvious
> > to me to clean up.  I didn't realize the fortran FE only had a
> > single WALK_SUBEXPR_TAIL.
> > 
> > I'm not sure inlining will make the situation more clear, for
> > sure using WALK_SUBEXPR would but it might loose the tailcall.
> > 
> > Would you accept an additional comment after WALK_SUBEXPR_TAIL like
> > 
> >case EXPR_OP:
> >  WALK_SUBEXPR ((*e)->value.op.op1);
> >  WALK_SUBEXPR_TAIL ((*e)->value.op.op2);
> >  /* tail-recurse  */
> > 
> My preference would be a gcc_unreachable() or something similar, but I
> understand it would get a warning as well?
> 
> Without better idea, I’m fine with an even more explicit comment:
> 
> /* No fallthru because of the tail recursion above.  */
> 
> > ?  Btw, a fallthru would be diagnosed by GCC unless we put
> > 
> >  /* Fallthru  */
> > 
> > here.
> Sure, but my main concern was misreading from programmers (including me),
> which is not diagnosed by compilers.
> 
> > Maybe renaming WALK_SUBEXPR_TAIL to WALK_SUBEXPR_WITH_CONTINUE
> > or WALK_SUBEXPR_BY_TAIL_RECURSING or WALK_SUBEXPR_TAILRECURSE would
> > be more obvious?
> > 
> I think the comment above would be enough.

Installed as follows.

Richard.

>From e5c2a436ef7596d254ffefd279742382b1ff546b Mon Sep 17 00:00:00 2001
From: Richard Biener 
Date: Tue, 30 Nov 2021 15:25:17 +0100
Subject: [PATCH] Add comment to indicate tail recursion
To: gcc-patches@gcc.gnu.org

My previous change removed an unreachable break; there (an
unreachable continue; would have been more to the point).  The
following re-adds a comment explaining that WALK_SUBEXPR_TAIL
does not fall through but tail recurses.

2021-11-30  Richard Biener  

gcc/fortran/
* frontend-passes.c (gfc_expr_walker): Add comment to
indicate tail recursion.
---
 gcc/fortran/frontend-passes.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/gcc/fortran/frontend-passes.c b/gcc/fortran/frontend-passes.c
index 16ee2afc9c0..4764c834f4f 100644
--- a/gcc/fortran/frontend-passes.c
+++ b/gcc/fortran/frontend-passes.c
@@ -5229,6 +5229,7 @@ gfc_expr_walker (gfc_expr **e, walk_expr_fn_t exprfn, 
void *data)
  case EXPR_OP:
WALK_SUBEXPR ((*e)->value.op.op1);
WALK_SUBEXPR_TAIL ((*e)->value.op.op2);
+   /* No fallthru because of the tail recursion above.  */
  case EXPR_FUNCTION:
for (a = (*e)->value.function.actual; a; a = a->next)
  WALK_SUBEXPR (a->expr);
-- 
2.31.1
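
To make the control flow behind that comment concrete, here is a self-contained sketch of a walker that uses the same trick. The `expr` type, `WALK_TAIL` and `count_nodes` are hypothetical and much simpler than gfc_expr_walker; the macro is assumed to retarget the loop variable and `continue`, which is why nothing after it in the case is reached.

```c
#include <stddef.h>

enum expr_kind { EXPR_LEAF, EXPR_BINOP };

struct expr
{
  enum expr_kind kind;
  struct expr *op1, *op2;   /* Only used by EXPR_BINOP.  */
};

/* Hypothetical stand-in for WALK_SUBEXPR_TAIL: instead of recursing into
   NODE, retarget the enclosing loop and iterate again.  */
#define WALK_TAIL(NODE) { e = (NODE); continue; }

/* Count the nodes reachable from E.  */
static int count_nodes (struct expr *e)
{
  int count = 0;

  while (e)
    {
      count++;
      switch (e->kind)
        {
        case EXPR_BINOP:
          count += count_nodes (e->op1);   /* Ordinary recursion.  */
          WALK_TAIL (e->op2);
          /* No fallthru because of the tail recursion above.  */
        case EXPR_LEAF:
          break;
        }
      return count;
    }
  return count;
}
```

Because `continue` inside the `switch` applies to the enclosing `while`, a `break` placed after `WALK_TAIL` would be dead code, yet without the comment the case reads as if it fell through.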

Re: [PATCH] ipa-sra: Check also ECF_LOOPING_CONST_OR_PURE when evaluating calls

On Tue, Nov 30, 2021 at 3:24 PM Martin Jambor  wrote:
>
> Hi,
>
> in PR 103267 Honza found out that IPA-SRA does not look at
> ECF_LOOPING_CONST_OR_PURE when evaluating if a call can have side
> effects.  Fixed with this patch.  The testcase infinitely loops in a
> const function, so it would not make a good addition to the testsuite.
>
> Bootstrapped and tested on x86_64-linux.  OK for trunk?

OK.

> Thanks,
>
> Martin
>
>
> gcc/ChangeLog:
>
> 2021-11-29  Martin Jambor  
>
> PR ipa/103267
> * ipa-sra.c (scan_function): Also check ECF_LOOPING_CONST_OR_PURE 
> flag.
> ---
>  gcc/ipa-sra.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/gcc/ipa-sra.c b/gcc/ipa-sra.c
> index cb0e30507a1..12ccd049552 100644
> --- a/gcc/ipa-sra.c
> +++ b/gcc/ipa-sra.c
> @@ -1925,7 +1925,8 @@ scan_function (cgraph_node *node, struct function *fun)
> if (lhs)
>   scan_expr_access (lhs, stmt, ISRA_CTX_STORE, bb);
> int flags = gimple_call_flags (stmt);
> -   if ((flags & (ECF_CONST | ECF_PURE)) == 0)
> +   if (((flags & (ECF_CONST | ECF_PURE)) == 0)
> +   || (flags & ECF_LOOPING_CONST_OR_PURE))
>   bitmap_set_bit (final_bbs, bb->index);
>   }
>   break;
> --
> 2.33.1
>

[PATCH] tree-optimization/103464 - Also pre-process PHIs in range-of-stmt.

2021-11-30, Andrew MacLeod via Gcc-patches

When I flattened the call stack for range_of_stmt in PR 103231 (
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103231 ), I mentioned that I
was only flattening it for chains of statements with range handlers. If
it turned out that PHI chaining was also a problem, we could also do PHIs.


The cost of doing PHIs is quite nominal, and it resolves this testcase, so
we might as well do PHIs too.


Bootstrapped on x86_64-pc-linux-gnu with no regressions.  OK?

Andrew
commit 99b0f5f03a04fd342461a67287d81250f86f0586
Author: Andrew MacLeod 
Date:   Mon Nov 29 12:00:26 2021 -0500

Also pre-process PHIs in range-of-stmt.

PR tree-optimization/103464
* gimple-range.cc (gimple_ranger::prefill_name): Process phis also.
(gimple_ranger::prefill_stmt_dependencies): Ditto.

diff --git a/gcc/gimple-range.cc b/gcc/gimple-range.cc
index 178a470a419..c8431a7180b 100644
--- a/gcc/gimple-range.cc
+++ b/gcc/gimple-range.cc
@@ -333,7 +333,7 @@ gimple_ranger::prefill_name (irange &r, tree name)
   if (!gimple_range_ssa_p (name))
 return;
   gimple *stmt = SSA_NAME_DEF_STMT (name);
-  if (!gimple_range_handler (stmt))
+  if (!gimple_range_handler (stmt) && !is_a<gphi *> (stmt))
 return;
 
   bool current;
@@ -356,8 +356,8 @@ gimple_ranger::prefill_stmt_dependencies (tree ssa)
   gimple *stmt = SSA_NAME_DEF_STMT (ssa);
   gcc_checking_assert (stmt && gimple_bb (stmt));
 
-  // Only pre-process range-ops.
-  if (!gimple_range_handler (stmt))
+  // Only pre-process range-ops and phis.
+  if (!gimple_range_handler (stmt) && !is_a<gphi *> (stmt))
 return;
 
   // Mark where on the stack we are starting.
@@ -401,13 +401,22 @@ gimple_ranger::prefill_stmt_dependencies (tree ssa)
 	  print_gimple_stmt (dump_file, stmt, 0, TDF_SLIM);
 	}
 
-  gcc_checking_assert (gimple_range_handler (stmt));
-  tree op = gimple_range_operand2 (stmt);
-  if (op)
-	prefill_name (r, op);
-  op = gimple_range_operand1 (stmt);
-  if (op)
-	prefill_name (r, op);
+  gphi *phi = dyn_cast <gphi *> (stmt);
+  if (phi)
+	{
+	  for (unsigned x = 0; x < gimple_phi_num_args (phi); x++)
+	prefill_name (r, gimple_phi_arg_def (phi, x));
+	}
+  else
+	{
+	  gcc_checking_assert (gimple_range_handler (stmt));
+	  tree op = gimple_range_operand2 (stmt);
+	  if (op)
+	prefill_name (r, op);
+	  op = gimple_range_operand1 (stmt);
+	  if (op)
+	prefill_name (r, op);
+	}
 }
   if (idx)
 tracer.trailer (idx, "ROS ", false, ssa, r);
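
The general technique of flattening such a dependency walk can be sketched as follows. This is a generic iterative post-order traversal with an explicit stack, not the ranger's implementation; `struct node`, `prefill_order` and the fixed-size limits are assumptions made for the example. Nodes with several arguments play the role of PHIs, and already-visited nodes are skipped, so a cyclic dependency does not loop forever.

```c
#include <assert.h>
#include <stdbool.h>

enum { MAX_NODES = 64, MAX_ARGS = 4 };

/* Hypothetical statement: indices of the statements it depends on.  */
struct node
{
  int nargs;
  int args[MAX_ARGS];
  bool visited;
};

/* Store in ORDER the sequence in which ROOT and its dependencies would
   be evaluated, dependencies first, without recursion.  Returns the
   number of entries written.  Assumes at most MAX_NODES nodes.  */
static int prefill_order (struct node *nodes, int root, int *order)
{
  int stack[MAX_NODES], next_arg[MAX_NODES];
  int sp = 0, n = 0;

  nodes[root].visited = true;
  stack[sp] = root;
  next_arg[sp] = 0;
  sp++;

  while (sp > 0)
    {
      int cur = stack[sp - 1];
      if (next_arg[sp - 1] < nodes[cur].nargs)
        {
          /* Look at the next dependency; push it if it is new.  */
          int arg = nodes[cur].args[next_arg[sp - 1]++];
          if (!nodes[arg].visited)
            {
              assert (sp < MAX_NODES);
              nodes[arg].visited = true;
              stack[sp] = arg;
              next_arg[sp] = 0;
              sp++;
            }
        }
      else
        {
          /* All dependencies handled: evaluate this node and pop.  */
          order[n++] = cur;
          sp--;
        }
    }
  return n;
}
```

The stack depth is bounded by the number of nodes because each node is pushed at most once, which is the property that keeps long chains from exhausting the call stack.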

[committed 10/19] libphobos: Update libgdruntime to build with latest version

Updates the make files, and the gdc-specific modules of druntime.

Bootstrapped, regression tested, and committed to mainline.

Regards,
Iain.

---
libphobos/ChangeLog:

* libdruntime/Makefile.am (D_EXTRA_FLAGS): Build libdruntime with
-fpreview=dip1000, -fpreview=fieldwise, and -fpreview=dtorfields.
(ALL_DRUNTIME_SOURCES): Add DRUNTIME_DSOURCES_STDCXX.
(DRUNTIME_DSOURCES): Update list of C binding modules.
(DRUNTIME_DSOURCES_STDCXX): Likewise.
(DRUNTIME_DSOURCES_LINUX): Likewise.
(DRUNTIME_DSOURCES_OPENBSD): Likewise.
(DRUNTIME_DISOURCES): Remove __entrypoint.di.
* libdruntime/Makefile.in: Regenerated.
* libdruntime/__entrypoint.di: Removed.
* libdruntime/gcc/backtrace.d (FIRSTFRAME): Remove.
(LibBacktrace.MaxAlignment): Remove.
(LibBacktrace.this): Remove default initialization of firstFrame.
(UnwindBacktrace.this): Likewise.
* libdruntime/gcc/deh.d (_d_isbaseof): Update signature.
(_d_createTrace): Likewise.
(__gdc_begin_catch): Remove reference to the exception.
(_d_throw): Increment reference count of thrown object before unwind.
(__gdc_personality): Chain exceptions with  Throwable.chainTogether.
* libdruntime/gcc/emutls.d: Update imports.
* libdruntime/gcc/sections/elf.d: Update imports.
(DSO.moduleGroup): Update signature.
* libdruntime/gcc/sections/macho.d: Update imports.
(DSO.moduleGroup): Update signature.
* libdruntime/gcc/sections/pecoff.d: Update imports.
(DSO.moduleGroup): Update signature.
* libdruntime/gcc/unwind/generic.d (__aligned__): Define.
---
 libphobos/libdruntime/Makefile.am   |   6 +-
 libphobos/libdruntime/Makefile.in   | 148 
 libphobos/libdruntime/__entrypoint.di   |  56 
 libphobos/libdruntime/gcc/deh.d |  22 +--
 libphobos/libdruntime/gcc/emutls.d  |   3 +-
 libphobos/libdruntime/gcc/sections/elf.d|   6 +-
 libphobos/libdruntime/gcc/sections/macho.d  |   6 +-
 libphobos/libdruntime/gcc/sections/pecoff.d |   6 +-
 8 files changed, 116 insertions(+), 137 deletions(-)
 delete mode 100644 libphobos/libdruntime/__entrypoint.di

diff --git a/libphobos/libdruntime/Makefile.am b/libphobos/libdruntime/Makefile.am
index 80fc0badcff..80c7567079a 100644
--- a/libphobos/libdruntime/Makefile.am
+++ b/libphobos/libdruntime/Makefile.am
@@ -19,7 +19,8 @@
 include $(top_srcdir)/d_rules.am
 
 # Make sure GDC can find libdruntime include files
-D_EXTRA_DFLAGS=-nostdinc -I $(srcdir) -I .
+D_EXTRA_DFLAGS=-fpreview=dip1000 -fpreview=fieldwise -fpreview=dtorfields \
+  -nostdinc -I $(srcdir) -I .
 
 # D flags for compilation
 AM_DFLAGS= \
@@ -119,6 +120,7 @@ endif
 DRUNTIME_DSOURCES_GENERATED = gcc/config.d gcc/libbacktrace.d
 
 ALL_DRUNTIME_SOURCES = $(DRUNTIME_DSOURCES) $(DRUNTIME_CSOURCES) \
+   $(DRUNTIME_DSOURCES_STDCXX) \
$(DRUNTIME_SOURCES_CONFIGURED) $(DRUNTIME_DSOURCES_GENERATED)
 
 # Need this library to both be part of libgphobos.a, and installed separately.
@@ -422,4 +424,4 @@ DRUNTIME_DSOURCES_WINDOWS = core/sys/windows/accctrl.d \
core/sys/windows/winuser.d core/sys/windows/winver.d \
core/sys/windows/wtsapi32.d core/sys/windows/wtypes.d
 
-DRUNTIME_DISOURCES = __entrypoint.di __main.di
+DRUNTIME_DISOURCES = __main.di
diff --git a/libphobos/libdruntime/Makefile.in b/libphobos/libdruntime/Makefile.in
index cdb1fe3cc18..b5f29da8540 100644
--- a/libphobos/libdruntime/Makefile.in
+++ b/libphobos/libdruntime/Makefile.in
@@ -245,7 +245,13 @@ am__objects_1 = core/atomic.lo core/attribute.lo core/bitop.lo \
rt/monitor_.lo rt/profilegc.lo rt/sections.lo rt/tlsgc.lo \
rt/util/typeinfo.lo rt/util/utility.lo
 am__objects_2 = core/stdc/libgdruntime_la-errno_.lo
-am__objects_3 = core/sys/posix/aio.lo core/sys/posix/arpa/inet.lo \
+am__objects_3 = core/stdcpp/allocator.lo core/stdcpp/array.lo \
+   core/stdcpp/exception.lo core/stdcpp/memory.lo \
+   core/stdcpp/new_.lo core/stdcpp/string.lo \
+   core/stdcpp/string_view.lo core/stdcpp/type_traits.lo \
+   core/stdcpp/typeinfo.lo core/stdcpp/utility.lo \
+   core/stdcpp/vector.lo core/stdcpp/xutility.lo
+am__objects_4 = core/sys/posix/aio.lo core/sys/posix/arpa/inet.lo \
core/sys/posix/config.lo core/sys/posix/dirent.lo \
core/sys/posix/dlfcn.lo core/sys/posix/fcntl.lo \
core/sys/posix/grp.lo core/sys/posix/iconv.lo \
@@ -272,8 +278,8 @@ am__objects_3 = core/sys/posix/aio.lo core/sys/posix/arpa/inet.lo \
core/sys/posix/syslog.lo core/sys/posix/termios.lo \
core/sys/posix/time.lo core/sys/posix/ucontext.lo \
core/sys/posix/unistd.lo core/sys/posix/utime.lo
-@DRUNTIME_OS_POSIX_TRUE@am__objects_4 = $(am__objects_3)
-am__objects_5 = core/sys/darwin/config.lo \
+@DRUNTIME_OS_POSIX_TRUE@am__objects_5 = $(am__objects_4)
+am__objects_6 = core/sys/d

Re: [PATCH] libcpp: Implement -Wbidi-chars for CVE-2021-42574 [PR103026]

2021-11-30 Thread Stephan Bergmann via Gcc-patches


On 30/11/2021 14:26, Marek Polacek wrote:

> On Tue, Nov 30, 2021 at 09:38:57AM +0100, Stephan Bergmann wrote:
> > On 15/11/2021 18:28, Marek Polacek via Gcc-patches wrote:
> > > On Mon, Nov 08, 2021 at 04:33:43PM -0500, Marek Polacek wrote:
> > > > Ping, can we conclude on the name?   IMHO, -Wbidirectional is just fine,
> > > > but changing the name is a trivial operation.
> > >
> > > Here's a patch with a better name (suggested by Jonathan W.).  Otherwise no
> > > changes.
> > >
> > > Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?
> > >
> > > -- >8 --
> > >   From a link below:
> > > "An issue was discovered in the Bidirectional Algorithm in the Unicode
> > > Specification through 14.0. It permits the visual reordering of
> > > characters via control sequences, which can be used to craft source code
> > > that renders different logic than the logical ordering of tokens
> > > ingested by compilers and interpreters. Adversaries can leverage this to
> > > encode source code for compilers accepting Unicode such that targeted
> > > vulnerabilities are introduced invisibly to human reviewers."
> > >
> > > More info:
> > > https://nvd.nist.gov/vuln/detail/CVE-2021-42574
> > > https://trojansource.codes/
> > >
> > > This is not a compiler bug.  However, to mitigate the problem, this patch
> > > implements -Wbidi-chars=[none|unpaired|any] to warn about possibly
> > > misleading Unicode bidirectional characters the preprocessor may encounter.
> > >
> > > The default is =unpaired, which warns about improperly terminated
> > > bidirectional characters; e.g. a LRE without its appertaining PDF.  The
> > > level =any warns about any use of bidirectional characters.
> > >
> > > This patch handles both UCNs and UTF-8 characters.  UCNs designating
> > > bidi characters in identifiers are accepted since r204886.  Then r217144
> > > enabled -fextended-identifiers by default.  Extended characters in C/C++
> > > identifiers have been accepted since r275979.  However, this patch still
> > > warns about mixing UTF-8 and UCN bidi characters; there seems to be no
> > > good reason to allow mixing them.
> >
> > I wonder what the rationale is to warn about UCNs, like in
> >
> >  aText = u"\u202D" + aText;
> >
> > (as found in the LibreOffice source code).
>
> Is this line mixing a UCN and a UTF-8?  Or is it just that you're
> prepending a LRO to aText?  We warn because the LRO is not "closed"
> in the context of its string literal, which was part of the Trojan
> source attack.  So "\u202D ... \u202C" would not warn.
>
> I'm not sure what workaround I could offer.  Maybe provide an option not to
> warn about UCNs at all, though even that is potentially dangerous -- while
> you can see UCNs in the source code, if you print strings containing them,
> they won't be visible anymore.


I'm not sure what you mean by "mixing a UCN and a UTF-8", but what the
code apparently does is programmatically construct a larger piece of
text by prepending LRO to an existing piece of text.


My understanding is that Trojan Source is concerned with presentation of 
program source code and not with properties of Unicode text constructed 
during the execution of such a program, and from the documentation 
quoted above I understand that -Wbidi-chars is meant to address Trojan 
Source, so I don't understand why you're concerned here with what 
happens "if you print strings containing [UCNs in the source code]".


Short of a source code viewer that interprets UCNs in C/C++ source code 
and renders them in the same way as their corresponding Unicode 
characters, I don't think that UCNs are relevant for Trojan Source, and 
don't understand why -Wbidi-chars would warn about them.


(Also, I noticed that it doesn't work to silence -Werror=bidi-chars= with a


#pragma GCC diagnostic ignored "-Wbidi-chars"


?)
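
For readers following the "unpaired" discussion above, here is a minimal
sketch in C of what "not closed in the context of its string literal" can
mean.  It is not the libcpp implementation: the function name
bidi_unclosed_count, the fixed stack size, and the simplified pairing rules
(PDF closes the innermost embedding or override, PDI closes the innermost
isolate plus anything opened inside it) are assumptions made for
illustration only.  It scans UTF-8 bytes, so a UCN such as \u202D would have
to be translated to its UTF-8 form first.

```c
#include <stddef.h>
#include <string.h>

/* Classes of bidirectional control characters. */
enum bidi_kind {
  BIDI_NONE,
  BIDI_EMBED_OPEN,    /* LRE U+202A, RLE U+202B, LRO U+202D, RLO U+202E */
  BIDI_EMBED_CLOSE,   /* PDF U+202C */
  BIDI_ISOLATE_OPEN,  /* LRI U+2066, RLI U+2067, FSI U+2068 */
  BIDI_ISOLATE_CLOSE  /* PDI U+2069 */
};

/* Classify the UTF-8 sequence starting at P, with LEFT bytes remaining. */
static enum bidi_kind
bidi_classify (const unsigned char *p, size_t left)
{
  if (left < 3 || p[0] != 0xE2)
    return BIDI_NONE;
  if (p[1] == 0x80)
    {
      if (p[2] == 0xAA || p[2] == 0xAB || p[2] == 0xAD || p[2] == 0xAE)
        return BIDI_EMBED_OPEN;
      if (p[2] == 0xAC)
        return BIDI_EMBED_CLOSE;
    }
  else if (p[1] == 0x81)
    {
      if (p[2] >= 0xA6 && p[2] <= 0xA8)
        return BIDI_ISOLATE_OPEN;
      if (p[2] == 0xA9)
        return BIDI_ISOLATE_CLOSE;
    }
  return BIDI_NONE;
}

/* Hypothetical helper: return how many bidi contexts are still open at
   the end of the NUL-terminated UTF-8 string S.  Nesting deeper than 64
   is ignored in this sketch.  */
int
bidi_unclosed_count (const char *s)
{
  enum bidi_kind stack[64];
  int depth = 0;
  const unsigned char *p = (const unsigned char *) s;
  size_t n = strlen (s);

  for (size_t i = 0; i < n; )
    {
      enum bidi_kind k = bidi_classify (p + i, n - i);
      switch (k)
        {
        case BIDI_EMBED_OPEN:
        case BIDI_ISOLATE_OPEN:
          if (depth < 64)
            stack[depth++] = k;
          break;
        case BIDI_EMBED_CLOSE:
          /* PDF only closes an embedding/override on top of the stack.  */
          if (depth > 0 && stack[depth - 1] == BIDI_EMBED_OPEN)
            depth--;
          break;
        case BIDI_ISOLATE_CLOSE:
          {
            /* PDI closes the innermost isolate and everything inside it.  */
            int j = depth;
            while (j > 0 && stack[j - 1] != BIDI_ISOLATE_OPEN)
              j--;
            if (j > 0)
              depth = j - 1;
          }
          break;
        default:
          break;
        }
      /* All recognized controls are three bytes long in UTF-8.  */
      i += (k == BIDI_NONE) ? 1 : 3;
    }
  return depth;
}
```

With this sketch, a string consisting of LRO followed by text reports one
open context, while the same string with a trailing PDF reports none, which
mirrors the "\u202D ... \u202C" case quoted above.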

[committed 14/19] libphobos: Update libphobos to build latest version

Updates the make files that build phobos.

Bootstrapped, regression tested, and committed to mainline.

Regards,
Iain.

---
libphobos/ChangeLog:

* src/Makefile.am (D_EXTRA_DFLAGS): Add -fpreview=dip1000 and
-fpreview=dtorfields flags.
(PHOBOS_DSOURCES): Update list of std modules.
* src/Makefile.in: Regenerate.
---
 libphobos/src/Makefile.am |  47 +++-
 libphobos/src/Makefile.in | 145 ++
 2 files changed, 142 insertions(+), 50 deletions(-)

diff --git a/libphobos/src/Makefile.am b/libphobos/src/Makefile.am
index 9f6251009f6..ba1579da8d7 100644
--- a/libphobos/src/Makefile.am
+++ b/libphobos/src/Makefile.am
@@ -19,7 +19,7 @@
 include $(top_srcdir)/d_rules.am
 
 # Make sure GDC can find libdruntime and libphobos include files
-D_EXTRA_DFLAGS=-nostdinc -I $(srcdir) \
+D_EXTRA_DFLAGS=-fpreview=dip1000 -fpreview=dtorfields -nostdinc -I $(srcdir) \
-I $(top_srcdir)/libdruntime -I ../libdruntime -I .
 
 # D flags for compilation
@@ -83,12 +83,12 @@ PHOBOS_DSOURCES =
 
 else
 
-PHOBOS_DSOURCES = etc/c/curl.d etc/c/sqlite3.d etc/c/zlib.d \
-   std/algorithm/comparison.d std/algorithm/internal.d \
-   std/algorithm/iteration.d std/algorithm/mutation.d \
-   std/algorithm/package.d std/algorithm/searching.d \
-   std/algorithm/setops.d std/algorithm/sorting.d std/array.d std/ascii.d \
-   std/base64.d std/bigint.d std/bitmanip.d std/compiler.d std/complex.d \
+PHOBOS_DSOURCES = etc/c/curl.d etc/c/zlib.d std/algorithm/comparison.d \
+   std/algorithm/internal.d std/algorithm/iteration.d \
+   std/algorithm/mutation.d std/algorithm/package.d \
+   std/algorithm/searching.d std/algorithm/setops.d \
+   std/algorithm/sorting.d std/array.d std/ascii.d std/base64.d \
+   std/bigint.d std/bitmanip.d std/compiler.d std/complex.d \
std/concurrency.d std/container/array.d std/container/binaryheap.d \
std/container/dlist.d std/container/package.d std/container/rbtree.d \
std/container/slist.d std/container/util.d std/conv.d std/csv.d \
@@ -99,7 +99,9 @@ PHOBOS_DSOURCES = etc/c/curl.d etc/c/sqlite3.d etc/c/zlib.d \
std/digest/murmurhash.d std/digest/package.d std/digest/ripemd.d \
std/digest/sha.d std/encoding.d std/exception.d \
std/experimental/allocator/building_blocks/affix_allocator.d \
+   std/experimental/allocator/building_blocks/aligned_block_list.d \
std/experimental/allocator/building_blocks/allocator_list.d \
+   std/experimental/allocator/building_blocks/ascending_page_allocator.d \
std/experimental/allocator/building_blocks/bitmapped_block.d \
std/experimental/allocator/building_blocks/bucketizer.d \
std/experimental/allocator/building_blocks/fallback_allocator.d \
@@ -123,27 +125,34 @@ PHOBOS_DSOURCES = etc/c/curl.d etc/c/sqlite3.d etc/c/zlib.d \
std/experimental/logger/core.d std/experimental/logger/filelogger.d \
std/experimental/logger/multilogger.d \
std/experimental/logger/nulllogger.d std/experimental/logger/package.d \
-   std/experimental/typecons.d std/file.d std/format.d std/functional.d \
-   std/getopt.d std/internal/cstring.d std/internal/math/biguintcore.d \
-   std/internal/math/biguintnoasm.d std/internal/math/errorfunction.d \
-   std/internal/math/gammafunction.d std/internal/scopebuffer.d \
+   std/experimental/typecons.d std/file.d std/format/internal/floats.d \
+   std/format/internal/read.d std/format/internal/write.d \
+   std/format/package.d std/format/read.d std/format/spec.d \
+   std/format/write.d std/functional.d std/getopt.d \
+   std/internal/attributes.d std/internal/cstring.d \
+   std/internal/math/biguintcore.d std/internal/math/biguintnoasm.d \
+   std/internal/math/errorfunction.d std/internal/math/gammafunction.d \
+   std/internal/memory.d std/internal/scopebuffer.d \
std/internal/test/dummyrange.d std/internal/test/range.d \
std/internal/test/uda.d std/internal/unicode_comp.d \
std/internal/unicode_decomp.d std/internal/unicode_grapheme.d \
std/internal/unicode_norm.d std/internal/unicode_tables.d \
-   std/internal/windows/advapi32.d std/json.d std/math.d \
+   std/internal/windows/advapi32.d std/json.d std/math/algebraic.d \
+   std/math/constants.d std/math/exponential.d std/math/hardware.d \
+   std/math/operations.d std/math/package.d std/math/remainder.d \
+   std/math/rounding.d std/math/traits.d std/math/trigonometry.d \
std/mathspecial.d std/meta.d std/mmfile.d std/net/curl.d \
-   std/net/isemail.d std/numeric.d std/outbuffer.d std/parallelism.d \
-   std/path.d std/process.d std/random.d std/range/interfaces.d \
-   std/range/package.d std/range/primitives.d \
+   std/net/isemail.d std/numeric.d std/outbuffer.d std/package.d \
+   std/parallelism.d std/path.d std/process.d std/random.d \
+   std/rang

[committed 17/19] libphobos: Import druntime testsuite v2.098.0-beta.1 (e6caaab9)

This is the updated D runtime library testsuite.

Bootstrapped, regression tested, and committed to mainline.

Regards,
Iain.

---
libphobos/ChangeLog:

* testsuite/libphobos.aa/test_aa.d: Update test.
* testsuite/libphobos.exceptions/unknown_gc.d: Likewise.
* testsuite/libphobos.hash/test_hash.d: Likewise.
* testsuite/libphobos.shared/host.c: Likewise.
* testsuite/libphobos.shared/load.d: Likewise.
* testsuite/libphobos.shared/load_13414.d: Likewise.
* testsuite/libphobos.thread/fiber_guard_page.d: Likewise.
* testsuite/libphobos.thread/tlsgc_sections.d: Likewise.
* testsuite/libphobos.shared/link_mod_collision.d: Removed.
* testsuite/libphobos.shared/load_mod_collision.d: Removed.
* testsuite/libphobos.allocations/alloc_from_assert.d: New test.
* testsuite/libphobos.betterc/test18828.d: New test.
* testsuite/libphobos.betterc/test19416.d: New test.
* testsuite/libphobos.betterc/test19421.d: New test.
* testsuite/libphobos.betterc/test19561.d: New test.
* testsuite/libphobos.betterc/test19924.d: New test.
* testsuite/libphobos.betterc/test20088.d: New test.
* testsuite/libphobos.betterc/test20613.d: New test.
* testsuite/libphobos.config/test19433.d: New test.
* testsuite/libphobos.config/test20459.d: New test.
* testsuite/libphobos.exceptions/assert_fail.d: New test.
* testsuite/libphobos.exceptions/catch_in_finally.d: New test.
* testsuite/libphobos.exceptions/future_message.d: New test.
* testsuite/libphobos.exceptions/long_backtrace_trunc.d: New test.
* testsuite/libphobos.exceptions/refcounted.d: New test.
* testsuite/libphobos.exceptions/rt_trap_exceptions.d: New test.
* testsuite/libphobos.exceptions/rt_trap_exceptions_drt.d: New test.
* testsuite/libphobos.gc/attributes.d: New test.
* testsuite/libphobos.gc/forkgc.d: New test.
* testsuite/libphobos.gc/forkgc2.d: New test.
* testsuite/libphobos.gc/nocollect.d: New test.
* testsuite/libphobos.gc/precisegc.d: New test.
* testsuite/libphobos.gc/recoverfree.d: New test.
* testsuite/libphobos.gc/sigmaskgc.d: New test.
* testsuite/libphobos.gc/startbackgc.d: New test.
* testsuite/libphobos.imports/bug18193.d: New test.
* testsuite/libphobos.init_fini/custom_gc.d: New test.
* testsuite/libphobos.init_fini/test18996.d: New test.
* testsuite/libphobos.lifetime/large_aggregate_destroy_21097.d: New test.
* testsuite/libphobos.thread/external_threads.d: New test.
* testsuite/libphobos.thread/join_detach.d: New test.
* testsuite/libphobos.thread/test_import.d: New test.
* testsuite/libphobos.thread/tlsstack.d: New test.
* testsuite/libphobos.typeinfo/enum_.d: New test.
* testsuite/libphobos.typeinfo/isbaseof.d: New test.
* testsuite/libphobos.unittest/customhandler.d: New test.
---
 libphobos/testsuite/libphobos.aa/test_aa.d|  79 ++-
 .../libphobos.allocations/alloc_from_assert.d |  25 +
 .../testsuite/libphobos.betterc/test18828.d   |  10 +
 .../testsuite/libphobos.betterc/test19416.d   |  14 +
 .../testsuite/libphobos.betterc/test19421.d   |  13 +
 .../testsuite/libphobos.betterc/test19561.d   |  16 +
 .../testsuite/libphobos.betterc/test19924.d   |  15 +
 .../testsuite/libphobos.betterc/test20088.d   |  14 +
 .../testsuite/libphobos.betterc/test20613.d   |  18 +
 .../testsuite/libphobos.config/test19433.d|   7 +
 .../testsuite/libphobos.config/test20459.d|   5 +
 .../libphobos.exceptions/assert_fail.d| 564 ++
 .../libphobos.exceptions/catch_in_finally.d   | 191 ++
 .../libphobos.exceptions/future_message.d |  71 +++
 .../long_backtrace_trunc.d|  37 ++
 .../libphobos.exceptions/refcounted.d |  96 +++
 .../libphobos.exceptions/rt_trap_exceptions.d |  15 +
 .../rt_trap_exceptions_drt.d  |  11 +
 .../libphobos.exceptions/unknown_gc.d |   4 +
 libphobos/testsuite/libphobos.gc/attributes.d |  30 +
 libphobos/testsuite/libphobos.gc/forkgc.d |  36 ++
 libphobos/testsuite/libphobos.gc/forkgc2.d|  22 +
 libphobos/testsuite/libphobos.gc/nocollect.d  |  15 +
 libphobos/testsuite/libphobos.gc/precisegc.d  | 126 
 .../testsuite/libphobos.gc/recoverfree.d  |  13 +
 libphobos/testsuite/libphobos.gc/sigmaskgc.d  |  42 ++
 .../testsuite/libphobos.gc/startbackgc.d  |  22 +
 .../testsuite/libphobos.hash/test_hash.d  | 140 -
 .../testsuite/libphobos.imports/bug18193.d|   4 +
 .../testsuite/libphobos.init_fini/custom_gc.d | 203 +++
 .../testsuite/libphobos.init_fini/test18996.d |  13 +
 .../large_aggregate_destroy_21097.d   |  78 +++
 libphobos/testsuite/libphobos.shared/host.c   |   8 +
 .../libphobos.shared/link_mod_collision.d |   5 -
 libphobos/testsuite/libphobos.shared/l

[committed 18/19] testsuite: Update gdc testsuite to pass on latest version

This updates the GDC testsuite parts to be compatible with the current
language features/deprecations.  The dejagnu gdc-utils helper has also
been updated to handle the new options and directives added to the D2
testsuite tests.

Bootstrapped, regression tested, and committed to mainline.

Regards,
Iain.

---
gcc/testsuite/ChangeLog:

* gdc.dg/Wcastresult2.d: Update test.
* gdc.dg/asm1.d: Likewise.
* gdc.dg/asm2.d: Likewise.
* gdc.dg/asm3.d: Likewise.
* gdc.dg/gdc282.d: Likewise.
* gdc.dg/imports/gdc170.d: Likewise.
* gdc.dg/intrinsics.d: Likewise.
* gdc.dg/pr101672.d: Likewise.
* gdc.dg/pr90650a.d: Likewise.
* gdc.dg/pr90650b.d: Likewise.
* gdc.dg/pr94777a.d: Likewise.
* gdc.dg/pr95250.d: Likewise.
* gdc.dg/pr96869.d: Likewise.
* gdc.dg/pr98277.d: Likewise.
* gdc.dg/pr98457.d: Likewise.
* gdc.dg/simd1.d: Likewise.
* gdc.dg/simd2a.d: Likewise.
* gdc.dg/simd2b.d: Likewise.
* gdc.dg/simd2c.d: Likewise.
* gdc.dg/simd2d.d: Likewise.
* gdc.dg/simd2e.d: Likewise.
* gdc.dg/simd2f.d: Likewise.
* gdc.dg/simd2g.d: Likewise.
* gdc.dg/simd2h.d: Likewise.
* gdc.dg/simd2i.d: Likewise.
* gdc.dg/simd2j.d: Likewise.
* gdc.dg/simd7951.d: Likewise.
* gdc.dg/torture/gdc309.d: Likewise.
* gdc.dg/torture/pr94424.d: Likewise.
* gdc.dg/torture/pr94777b.d: Likewise.
* lib/gdc-utils.exp (gdc-convert-args): Handle new compiler options.
(gdc-convert-test): Handle CXXFLAGS, EXTRA_OBJC_SOURCES, and ARG_SETS
test directives.
(gdc-do-test): Only import modules in the test run directory.
* gdc.dg/pr94777c.d: New test.
* gdc.dg/pr96156b.d: New test.
* gdc.dg/pr96157c.d: New test.
* gdc.dg/simd_ctfe.d: New test.
* gdc.dg/torture/simd17344.d: New test.
* gdc.dg/torture/simd20052.d: New test.
* gdc.dg/torture/simd6.d: New test.
* gdc.dg/torture/simd7.d: New test.
---
 gcc/testsuite/gdc.dg/Wcastresult2.d  |   2 +-
 gcc/testsuite/gdc.dg/asm1.d  |  18 +--
 gcc/testsuite/gdc.dg/asm2.d  |   2 +-
 gcc/testsuite/gdc.dg/asm3.d  |  10 +-
 gcc/testsuite/gdc.dg/gdc282.d|   6 +-
 gcc/testsuite/gdc.dg/imports/gdc170.d|   8 +-
 gcc/testsuite/gdc.dg/intrinsics.d|  36 +++---
 gcc/testsuite/gdc.dg/pr101672.d  |   2 +-
 gcc/testsuite/gdc.dg/pr90650a.d  |   2 +-
 gcc/testsuite/gdc.dg/pr90650b.d  |   2 +-
 gcc/testsuite/gdc.dg/pr94777a.d  |   2 +-
 gcc/testsuite/gdc.dg/pr94777c.d  |  62 +++
 gcc/testsuite/gdc.dg/pr95250.d   |   2 +-
 gcc/testsuite/gdc.dg/pr96156b.d  |  17 +++
 gcc/testsuite/gdc.dg/pr96157c.d  |  40 +++
 gcc/testsuite/gdc.dg/pr96869.d   |  26 ++---
 gcc/testsuite/gdc.dg/pr98277.d   |   2 +-
 gcc/testsuite/gdc.dg/pr98457.d   |   6 +-
 gcc/testsuite/gdc.dg/simd1.d |   8 --
 gcc/testsuite/gdc.dg/simd2a.d|   8 --
 gcc/testsuite/gdc.dg/simd2b.d|   8 --
 gcc/testsuite/gdc.dg/simd2c.d|   8 --
 gcc/testsuite/gdc.dg/simd2d.d|   8 --
 gcc/testsuite/gdc.dg/simd2e.d|   8 --
 gcc/testsuite/gdc.dg/simd2f.d|   8 --
 gcc/testsuite/gdc.dg/simd2g.d|   8 --
 gcc/testsuite/gdc.dg/simd2h.d|   8 --
 gcc/testsuite/gdc.dg/simd2i.d|   8 --
 gcc/testsuite/gdc.dg/simd2j.d|   8 --
 gcc/testsuite/gdc.dg/simd7951.d  |   1 +
 gcc/testsuite/gdc.dg/simd_ctfe.d |  87 +++
 gcc/testsuite/gdc.dg/torture/gdc309.d|   1 +
 gcc/testsuite/gdc.dg/torture/pr94424.d   |  16 +++
 gcc/testsuite/gdc.dg/torture/pr94777b.d  | 135 ---
 gcc/testsuite/gdc.dg/torture/simd17344.d |  11 ++
 gcc/testsuite/gdc.dg/torture/simd20052.d |  17 +++
 gcc/testsuite/gdc.dg/torture/simd6.d |  26 +
 gcc/testsuite/gdc.dg/torture/simd7.d |  18 +++
 gcc/testsuite/lib/gdc-utils.exp  |  81 --
 39 files changed, 435 insertions(+), 291 deletions(-)
 create mode 100644 gcc/testsuite/gdc.dg/pr94777c.d
 create mode 100644 gcc/testsuite/gdc.dg/pr96156b.d
 create mode 100644 gcc/testsuite/gdc.dg/pr96157c.d
 create mode 100644 gcc/testsuite/gdc.dg/simd_ctfe.d
 create mode 100644 gcc/testsuite/gdc.dg/torture/simd17344.d
 create mode 100644 gcc/testsuite/gdc.dg/torture/simd20052.d
 create mode 100644 gcc/testsuite/gdc.dg/torture/simd6.d
 create mode 100644 gcc/testsuite/gdc.dg/torture/simd7.d

diff --git a/gcc/testsuite/gdc.dg/Wcastresult2.d b/gcc/testsuite/gdc.dg/Wcastresult2.d
index 56d2dd20e82..83d189a6adf 100644
--- a/gcc/testsuite/gdc.dg/Wcastresult2.d
+++ b/gcc/testsuite/gdc.dg/Wcastresult2.d
@@ -1,5 +1,5 @@
 // { dg-do compile }
-// { dg-options "-Wcast-result" }
+// { dg-options "-Wcast-result -Wno-deprecated" }
 
 vo

Re: [PATCH] middle-end: Skip initialization of opaque type register variables [PR103127]

On 11/30/21 2:37 AM, Richard Biener wrote:
> On Mon, Nov 29, 2021 at 11:56 PM Qing Zhao  wrote:
> I think that's inconsistent indeed.  Peter, what are "opaque"
> registers?  rs6000-modes.def suggests
> that there's __vector_pair and __vector_quad, what's the GIMPLE types
> for those?  It seems they
> are either SSA names or expanded to pseudo registers but there's no
> constants for them.

The __vector_pair and __vector_quad types are target specific types
for use with our Matrix-Math-Assist (MMA) unit and they are only
usable with our associated MMA built-in functions.  What they hold
is really dependent on which MMA built-ins you use on them.
You can think of them as a generic (and large) vector type where the
subtype is undefined...or defined by which built-in function you
happen to be using.

We do not have any constants defined for them.  How we initialize them
is either by loading values from memory into them or by zeroing them
out using the xxsetaccz instruction (only for __vector_quads).

> Can they be initialized?  I see they can be copied at least.

__vector_quads can be zero initialized using the __builtin_mma_xxsetaccz()
built-in function.  We don't have a method (or use case) for zero initializing
__vector_pairs.

> If such "things" cannot be initialized they should indeed be exempt
> from auto-init.  The
> documentation suggests that they act as bit-bucket but even bit-buckets should
> be initializable, thus why exactly does CONST0_RTX not exist for them?

We used to have CONST0_RTX defined (but nothing else), but we had problems
with the compiler CSEing the initialization for multiple __vector_quads and
then copying the values around.  We'd end up with one xxsetaccz instruction
and copies out of that accumulator register into the other accumulator
registers.  Copies are VERY expensive, while xxsetaccz's are cheap, so we
don't want that.  That said, I think a fix I put in to disable fwprop on
these types may have been the culprit for that problem, so maybe we could
add the CONST0_RTX back?  I'd have to verify that.  If so, then we'd at least
be able to support -ftrivial-auto-var-init=zero.  The =pattern version
would be more problematical...unless the value for pattern was loaded from
memory.

Peter

[ping^6] Make sure that we get unique test names if several DejaGnu directives refer to the same line [PR102735]

Hi!

I know I'm late this week ;-\ -- but here is another ping.


Grüße
 Thomas


On 2021-11-22T11:27:49+0100, Thomas Schwinge wrote:
> Hi!
>
> Ping.
>
>
> Grüße
>  Thomas
>
>
> On 2021-11-15T15:50:58+0100, I wrote:
>> Hi!
>>
>> ..., and here is another ping.
>>
>>
>> Grüße
>>  Thomas
>>
>>
>> On 2021-11-08T11:45:12+0100, I wrote:
>>> Hi!
>>>
>>> Ping, once more.
>>>
>>>
>>> Grüße
>>>  Thomas
>>>
>>>
>>> On 2021-10-14T12:12:41+0200, I wrote:
>>>> Hi!
>>>>
>>>> Ping, again.
>>>>
>>>> Commit log updated for
>>>> "privatization-1-compute.c results in both XFAIL and PASS".
>>>>
>>>>
>>>> Grüße
>>>>  Thomas
>>>>
>>>>
>>>> On 2021-09-30T08:42:25+0200, I wrote:
>>>>> Hi!
>>>>>
>>>>> Ping.
>>>>>
>>>>> On 2021-09-22T13:03:46+0200, I wrote:
>>>>>> On 2021-09-19T11:35:00-0600, Jeff Law via Gcc-patches 
>>>>>>  wrote:
>>>>>>> A couple of goacc tests do not have unique names.
>>>>>>
>>>>>> Thanks for fixing this up, and sorry, largely my "fault", I suppose.  ;-|
>>>>>>
>>>>>>> This causes problems
>>>>>>> for the test comparison script when one of the test passes and the other
>>>>>>> fails -- in this scenario the test comparison script claims there is a
>>>>>>> regression.
>>>>>>
>>>>>> So I understand correctly that this is a problem not just for actual
>>>>>> mixed PASS vs. FAIL (which we'd like you to report anyway!) that appear
>>>>>> for the same line, but also for mixed PASS vs. XFAIL?  (Because, the
>>>>>> latter appears to be what you're addressing with your commit here.)
>>>>>>
>>>>>>> This slipped through for a while because I had turned off x86_64 testing
>>>>>>> (others test it regularly and I was revamping the tester's hardware
>>>>>>> requirements).  Now that I've acquired more x86_64 resources and turned
>>>>>>> on native x86 testing again, it's been flagged.
>>>>>>
>>>>>> (I don't follow that argument -- these test cases should be all generic?
>>>>>> Anyway, not important, I guess.)
>>>>>>
>>>>>>> This patch just adds a numeric suffix to the TODO string to disambiguate
>>>>>>> them.
>>>>>>
>>>>>> So, instead of doing this manually (always error-prone!), like you've...
>>>>>>
>>>>>>> Committed to the trunk,
>>>>>>
>>>>>>> commit f75b237254f32d5be32c9d9610983b777abea633
>>>>>>> Author: Jeff Law 
>>>>>>> Date:   Sun Sep 19 13:31:32 2021 -0400
>>>>>>>
>>>>>>> [committed] Make test names unique for a couple of goacc tests
>>>>>>
>>>>>>> --- a/gcc/testsuite/gfortran.dg/goacc/privatization-1-compute.f90
>>>>>>> +++ b/gcc/testsuite/gfortran.dg/goacc/privatization-1-compute.f90
>>>>>>> @@ -39,9 +39,9 @@ contains
>>>>>>>!$acc atomic write ! ... to force 'TREE_ADDRESSABLE'.
>>>>>>>y = a
>>>>>>>  !$acc end parallel
>>>>>>> -! { dg-note {variable 'i' in 'private' clause potentially has 
>>>>>>> improper OpenACC privatization level: 'parm_decl'} "TODO" { xfail *-*-* 
>>>>>>> } l_compute$c_compute }
>>>>>>> -! { dg-note {variable 'j' in 'private' clause potentially has 
>>>>>>> improper OpenACC privatization level: 'parm_decl'} "TODO" { xfail *-*-* 
>>>>>>> } l_compute$c_compute }
>>>>>>> -! { dg-note {variable 'a' in 'private' clause potentially has 
>>>>>>> improper OpenACC privatization level: 'parm_decl'} "TODO" { xfail *-*-* 
>>>>>>> } l_compute$c_compute }
>>>>>>> +! { dg-note {variable 'i' in 'private' clause potentially has 
>>>>>>> improper OpenACC privatization level: 'parm_decl'} "TODO2" { xfail 
>>>>>>> *-*-* } l_compute$c_compute }
>>>>>>> +! { dg-note {variable 'j' in 'private' clause potentially has 
>>>>>>> improper OpenACC privatization level: 'parm_decl'} "TODO3" { xfail 
>>>>>>> *-*-* } l_compute$c_compute }
>>>>>>> +! { dg-note {variable 'a' in 'private' clause potentially has 
>>>>>>> improper OpenACC privatization level: 'parm_decl'} "TODO4" { xfail 
>>>>>>> *-*-* } l_compute$c_compute }
>>>>>>
>>>>>> ... etc. (also similarly in a handful of earlier commits, if I remember
>>>>>> correctly), why don't we do that programmatically, like in the attached
>>>>>> "Make sure that we get unique test names if several DejaGnu directives
>>>>>> refer to the same line", once and for all?  OK to push after proper
>>>>>> testing?
>>>>>
>>>>> Attached again, for easy reference.
>>>>>
>>>>> I figure it may help if I showed an example of how this changes things;
>>>>> for the test case cited above (word-diff):
>>>>>
>>>>> PASS: gfortran.dg/goacc/privatization-1-compute.f90   -O   {+at line 
>>>>> 40+} (test for warnings, line 39)
>>>>> PASS: gfortran.dg/goacc/privatization-1-compute.f90   -O   {+at line 
>>>>> 41+} (test for warnings, line 22)
>>>>> PASS: gfortran.dg/goacc/privatization-1-compute.f90   -O   {+at line 
>>>>> 42+} (test for warnings, line 39)
>>>>> PASS: gfortran.dg/goacc/privatization-1-compute.f90   -O   {+at line 
>>>>> 43+} (test for warnings, line 22)
>>>>> PASS: gfortran.dg/goacc/privatization-1-compute.f90   -O   {+at line 
>>>>> 44+} (test for warnings, line 39)
>>>>> PASS: gfortr
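
To make the naming problem concrete, the following C sketch builds result
names in the style of the word-diff above.  It is only an illustration, not
the DejaGnu (Tcl) change itself: the function format_test_name, its
parameters, and the exact spacing of the name are assumptions.  It shows
that two directives which refer to the same target line produce identical
names unless the line of the directive itself is part of the name.

```c
#include <stdio.h>

/* Hypothetical helper: format a test result name into BUF.
   DIRECTIVE_LINE is the line the dg- directive is written on,
   TARGET_LINE is the line the diagnostic is expected on.  When
   WITH_DIRECTIVE_LINE is nonzero, an "at line N" part is added so that
   several directives referring to the same TARGET_LINE stay distinct.
   Returns what snprintf returns.  */
int
format_test_name (char *buf, size_t size, const char *file,
                  const char *flags, int directive_line, int target_line,
                  int with_directive_line)
{
  if (with_directive_line)
    return snprintf (buf, size,
                     "%s   %s   at line %d (test for warnings, line %d)",
                     file, flags, directive_line, target_line);
  return snprintf (buf, size, "%s   %s   (test for warnings, line %d)",
                   file, flags, target_line);
}
```

Called for directives on lines 40 and 42 that both refer to line 39, the
function yields the same string twice without the "at line" part and two
different strings with it, so a comparison script can no longer confuse a
PASS of one directive with an XFAIL of the other.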

[committed 19/19] libphobos: Update libphobos testsuite to pass on latest version

This adds new DejaGnu testing scripts for the suite of libphobos tests,
and updates the existing ones.

Bootstrapped, regression tested, and committed to mainline.

Regards,
Iain.

---
libphobos/ChangeLog:

* testsuite/lib/libphobos.exp (libphobos-dg-test): Handle assembly
compile types.
(dg-test): Override.
(additional_prunes): Define.
(libphobos-dg-prune): Filter any additional_prunes set by tests.
* testsuite/libphobos.druntime/druntime.exp (version_flags): Add
-fversion=CoreUnittest.
* testsuite/libphobos.druntime_shared/druntime_shared.exp
(version_flags): Add -fversion=CoreUnittest -fversion=Shared.
* testsuite/libphobos.phobos/phobos.exp (version_flags): Add
-fversion=StdUnittest.
* testsuite/libphobos.phobos_shared/phobos_shared.exp (version_flags):
Likewise.
* testsuite/testsuite_flags.in: Add -fpreview=dip1000 to --gdcflags.
* testsuite/libphobos.betterc/betterc.exp: New test.
* testsuite/libphobos.config/config.exp: New test.
* testsuite/libphobos.gc/gc.exp: New test.
* testsuite/libphobos.imports/imports.exp: New test.
* testsuite/libphobos.lifetime/lifetime.exp: New test.
* testsuite/libphobos.unittest/unittest.exp: New test.
---
 libphobos/testsuite/lib/libphobos.exp | 60 +++
 .../testsuite/libphobos.betterc/betterc.exp   | 27 +
 .../testsuite/libphobos.config/config.exp | 46 ++
 .../testsuite/libphobos.druntime/druntime.exp |  2 +-
 .../druntime_shared.exp   |  2 +-
 libphobos/testsuite/libphobos.gc/gc.exp   | 27 +
 .../testsuite/libphobos.imports/imports.exp   | 29 +
 .../testsuite/libphobos.lifetime/lifetime.exp | 27 +
 .../testsuite/libphobos.phobos/phobos.exp |  2 +-
 .../libphobos.phobos_shared/phobos_shared.exp |  2 +-
 .../testsuite/libphobos.unittest/unittest.exp | 53 
 libphobos/testsuite/testsuite_flags.in|  2 +-
 12 files changed, 274 insertions(+), 5 deletions(-)
 create mode 100644 libphobos/testsuite/libphobos.betterc/betterc.exp
 create mode 100644 libphobos/testsuite/libphobos.config/config.exp
 create mode 100644 libphobos/testsuite/libphobos.gc/gc.exp
 create mode 100644 libphobos/testsuite/libphobos.imports/imports.exp
 create mode 100644 libphobos/testsuite/libphobos.lifetime/lifetime.exp
 create mode 100644 libphobos/testsuite/libphobos.unittest/unittest.exp

diff --git a/libphobos/testsuite/lib/libphobos.exp b/libphobos/testsuite/lib/libphobos.exp
index 2af430a0e45..66e3e80105f 100644
--- a/libphobos/testsuite/lib/libphobos.exp
+++ b/libphobos/testsuite/lib/libphobos.exp
@@ -54,6 +54,10 @@ proc libphobos-dg-test { prog do_what extra_tool_flags } {
 
 # Set up the compiler flags, based on what we're going to do.
 switch $do_what {
+   "compile" {
+   set compile_type "assembly"
+   set output_file "[file rootname [file tail $prog]].s"
+   }
"run" {
set compile_type "executable"
# FIXME: "./" is to cope with "." not being in $PATH.
@@ -89,8 +93,52 @@ proc libphobos-dg-test { prog do_what extra_tool_flags } {
 return [list $comp_output $output_file]
 }
 
+# Override the DejaGnu dg-test in order to clear flags after a test, as
+# is done for compiler tests in gcc-dg.exp.
+
+if { [info procs saved-dg-test] == [list] } {
+rename dg-test saved-dg-test
+
+proc dg-test { args } {
+   global additional_prunes
+   global errorInfo
+   global testname_with_flags
+   global shouldfail
+
+   if { [ catch { eval saved-dg-test $args } errmsg ] } {
+   set saved_info $errorInfo
+   set additional_prunes ""
+   set shouldfail 0
+   if [info exists testname_with_flags] {
+   unset testname_with_flags
+   }
+   unset_timeout_vars
+   error $errmsg $saved_info
+   }
+   set additional_prunes ""
+   set shouldfail 0
+   unset_timeout_vars
+   if [info exists testname_with_flags] {
+   unset testname_with_flags
+   }
+}
+}
+
+# Prune messages from gdc that aren't useful.
+
+set additional_prunes ""
+
 proc libphobos-dg-prune { system text } {
 
+global additional_prunes
+
+foreach p $additional_prunes {
+   if { [string length $p] > 0 } {
+   # Following regexp matches a complete line containing $p.
+   regsub -all "(^|\n)\[^\n\]*$p\[^\n\]*" $text "" text
+   }
+}
+
 # Ignore harmless warnings from Xcode.
 regsub -all "(^|\n)\[^\n\]*ld: warning: could not create compact unwind for\[^\n\]*" $text "" text
 
@@ -281,6 +329,18 @@ proc libphobos_skipped_test_p { test } {
 return "skipped test"
 }
 
+# Prune any messages matching ARGS[1] (a regexp) from test output.
+proc dg-prune-output { args } {
+global additional_prunes
+
+if { [llength $args] != 2 } {
+   error "[lindex $arg
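
The pruning loop in libphobos-dg-prune above drops every complete output
line that matches one of the additional_prunes entries.  A rough C sketch of
the same idea follows.  It is a simplification and not a translation of the
Tcl code: the name prune_lines is hypothetical, it matches a plain substring
rather than a regexp, and it removes the trailing newline of a matching line
where the regsub above removes the preceding one.

```c
#include <string.h>

/* Hypothetical helper: remove, in place, every line of TEXT that
   contains the substring PATTERN.  An empty PATTERN removes nothing,
   like the string length check in the Tcl loop.  */
void
prune_lines (char *text, const char *pattern)
{
  char *out = text;
  const char *line = text;
  size_t plen = strlen (pattern);

  while (*line != '\0')
    {
      const char *eol = strchr (line, '\n');
      /* Length of this line, including its newline if it has one.  */
      size_t len = eol ? (size_t) (eol - line) + 1 : strlen (line);
      int match = 0;

      if (plen > 0)
        /* Look for PATTERN within the current line only.  */
        for (size_t i = 0; i + plen <= len; i++)
          if (memcmp (line + i, pattern, plen) == 0)
            {
              match = 1;
              break;
            }

      if (!match)
        {
          /* Keep the line; OUT never runs ahead of LINE.  */
          memmove (out, line, len);
          out += len;
        }
      line += len;
    }
  *out = '\0';
}
```

For example, pruning "ld: warning" from a buffer holding three lines, the
middle one being the Xcode unwind warning, leaves only the first and last
line.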

Re: [PATCH] libcpp: Implement -Wbidi-chars for CVE-2021-42574 [PR103026]

2021-11-30 Thread Marek Polacek via Gcc-patches

On Tue, Nov 30, 2021 at 04:00:01PM +0100, Stephan Bergmann wrote:
> On 30/11/2021 14:26, Marek Polacek wrote:
> > On Tue, Nov 30, 2021 at 09:38:57AM +0100, Stephan Bergmann wrote:
> > > On 15/11/2021 18:28, Marek Polacek via Gcc-patches wrote:
> > > > On Mon, Nov 08, 2021 at 04:33:43PM -0500, Marek Polacek wrote:
> > > > > Ping, can we conclude on the name?  IMHO, -Wbidirectional is just fine,
> > > > > but changing the name is a trivial operation.
> > > > 
> > > > > Here's a patch with a better name (suggested by Jonathan W.).  Otherwise
> > > > > no changes.
> > > > 
> > > > Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?
> > > > 
> > > > -- >8 --
> > > >   From a link below:
> > > > "An issue was discovered in the Bidirectional Algorithm in the Unicode
> > > > Specification through 14.0. It permits the visual reordering of
> > > > characters via control sequences, which can be used to craft source code
> > > > that renders different logic than the logical ordering of tokens
> > > > ingested by compilers and interpreters. Adversaries can leverage this to
> > > > encode source code for compilers accepting Unicode such that targeted
> > > > vulnerabilities are introduced invisibly to human reviewers."
> > > > 
> > > > More info:
> > > > https://nvd.nist.gov/vuln/detail/CVE-2021-42574
> > > > https://trojansource.codes/
> > > > 
> > > > This is not a compiler bug.  However, to mitigate the problem, this patch
> > > > implements -Wbidi-chars=[none|unpaired|any] to warn about possibly
> > > > misleading Unicode bidirectional characters the preprocessor may encounter.
> > > > 
> > > > The default is =unpaired, which warns about improperly terminated
> > > > bidirectional characters; e.g. a LRE without its appertaining PDF.  The
> > > > level =any warns about any use of bidirectional characters.
> > > > 
> > > > This patch handles both UCNs and UTF-8 characters.  UCNs designating
> > > > bidi characters in identifiers are accepted since r204886.  Then r217144
> > > > enabled -fextended-identifiers by default.  Extended characters in C/C++
> > > > identifiers have been accepted since r275979.  However, this patch still
> > > > warns about mixing UTF-8 and UCN bidi characters; there seems to be no
> > > > good reason to allow mixing them.
> > > 
> > > I wonder what the rationale is to warn about UCNs, like in
> > > 
> > > >  aText = u"\u202D" + aText;
> > > 
> > > (as found in the LibreOffice source code).
> > 
> > Is this line mixing a UCN and a UTF-8?  Or is it just that you're
> > prepending a LRO to aText?  We warn because the LRO is not "closed"
> > in the context of its string literal, which was part of the Trojan
> > source attack.  So "\u202D ... \u202C" would not warn.
> > 
> > I'm not sure what workaround I could offer.  Maybe provide an option not to
> > warn about UCNs at all, though even that is potentially dangerous -- while
> > you can see UCNs in the source code, if you print strings containing them,
> > they won't be visible anymore.
> 
> I'm not sure what you mean by "mixing a UCN and a UTF-8", but what the
> code apparently does is programmatically construct a larger piece of text
> by prepending LRO to an existing piece of text.
> 
> My understanding is that Trojan Source is concerned with presentation of
> program source code and not with properties of Unicode text constructed
> during the execution of such a program, and from the documentation quoted
> above I understand that -Wbidi-chars is meant to address Trojan Source, so I
> don't understand why you're concerned here with what happens "if you print
> strings containing [UCNs in the source code]".
> 
> Short of a source code viewer that interprets UCNs in C/C++ source code and
> renders them in the same way as their corresponding Unicode characters, I
> don't think that UCNs are relevant for Trojan Source, and don't understand
> why -Wbidi-chars would warn about them.

I guess we were concerned with programs that generate other programs.
Maybe UCNs should be ignored by default.  There's still time to adjust
the behavior.
 
> (Also, I noticed that it doesn't work to silence -Werror=bidi-chars= with a
> 
> > #pragma GCC diagnostic ignored "-Wbidi-chars"

Yeah, it doesn't work with C++, it's https://gcc.gnu.org/PR53431 :(

Marek
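
As an illustration only of what "unpaired" means in the exchange above, here
is a minimal sketch under simplifying assumptions.  It is not the libcpp
implementation: has_unpaired_bidi is a hypothetical helper, it only looks at
UTF-8 encoded control characters (not UCNs), and it assumes strict nesting,
which is stricter than the Unicode Bidirectional Algorithm.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper, not the libcpp code.  Returns true if the UTF-8 text
// opens a bidi embedding, override or isolate that it never closes, or
// closes one that was not opened.  Assumes strict nesting (a simplification).
enum class bidi_kind { embedding, isolate };

bool has_unpaired_bidi (const std::string &s)
{
  std::vector<bidi_kind> open;
  for (std::size_t i = 0; i + 2 < s.size (); ++i)
    {
      unsigned char b0 = s[i], b1 = s[i + 1], b2 = s[i + 2];
      if (b0 != 0xE2)
        continue;
      if (b1 == 0x80)
        {
          // U+202A LRE, U+202B RLE, U+202D LRO, U+202E RLO open a context.
          if (b2 == 0xAA || b2 == 0xAB || b2 == 0xAD || b2 == 0xAE)
            open.push_back (bidi_kind::embedding);
          // U+202C PDF closes the innermost embedding or override.
          else if (b2 == 0xAC)
            {
              if (open.empty () || open.back () != bidi_kind::embedding)
                return true;
              open.pop_back ();
            }
        }
      else if (b1 == 0x81)
        {
          // U+2066 LRI, U+2067 RLI, U+2068 FSI open an isolate.
          if (b2 >= 0xA6 && b2 <= 0xA8)
            open.push_back (bidi_kind::isolate);
          // U+2069 PDI closes the innermost isolate.
          else if (b2 == 0xA9)
            {
              if (open.empty () || open.back () != bidi_kind::isolate)
                return true;
              open.pop_back ();
            }
        }
    }
  // Anything still open at the end of the text is unterminated.
  return !open.empty ();
}
```

With this model, a string holding only LRO followed by text is reported as
unpaired, while the same text followed by PDF is not, which mirrors the
"\u202D ... \u202C" case described above.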

[PATCH] Allow loop crossing paths in back threader copier.

2021-11-30 Thread Aldy Hernandez via Gcc-patches

We are currently restricting loop crossing paths in the generic copier
used by the back threader, but we should be able to handle them after
loop_done has completed.

This fixes the PR at -O2, though the problem remains at -O1 because we
have no threaders smart enough to elide the undefined read.  DOM3 could
be a candidate when it is converted to either a hybrid threader or
replaced with the backward threader (when ranger can handle floats).

Tested on x86-64 Linux.

OK for trunk?

PR tree-optimization/80548

gcc/ChangeLog:

* attribs.c (sorted_attr_string): Add assert for -Wstringop-overread.
* tree-ssa-threadupdate.c
(back_jt_path_registry::duplicate_thread_path): Allow paths that
cross loops after loop_done.
(back_jt_path_registry::update_cfg): Diagnose dropped threads
after duplicate_thread_path.

gcc/testsuite/ChangeLog:

* gcc.dg/pr80548.c: New test.
---
 gcc/attribs.c  |  1 +
 gcc/testsuite/gcc.dg/pr80548.c | 23 +++
 gcc/tree-ssa-threadupdate.c| 19 +++
 3 files changed, 35 insertions(+), 8 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/pr80548.c

diff --git a/gcc/attribs.c b/gcc/attribs.c
index c252f5af07b..9a079b8405a 100644
--- a/gcc/attribs.c
+++ b/gcc/attribs.c
@@ -1035,6 +1035,7 @@ sorted_attr_string (tree arglist)
   attr_str[str_len_sum + len] = TREE_CHAIN (arg) ? ',' : '\0';
   str_len_sum += len + 1;
 }
+  gcc_assert (arglist);
 
   /* Replace "=,-" with "_".  */
   for (i = 0; i < strlen (attr_str); i++)
diff --git a/gcc/testsuite/gcc.dg/pr80548.c b/gcc/testsuite/gcc.dg/pr80548.c
new file mode 100644
index 000..232743e
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr80548.c
@@ -0,0 +1,23 @@
+// { dg-do compile }
+// { dg-options "-O2 -Wuninitialized" }
+
+int g (void);
+void h (int, int);
+
+void f (int b)
+{
+  int x, y;
+
+  if (b)
+    {
+      x = g ();
+      y = g ();
+    }
+
+  while (g ())
+    if (b)
+      {
+        h (x, y); // { dg-bogus "uninit" }
+        y = g ();
+      }
+}
diff --git a/gcc/tree-ssa-threadupdate.c b/gcc/tree-ssa-threadupdate.c
index 8aac733ac25..b194c11e23d 100644
--- a/gcc/tree-ssa-threadupdate.c
+++ b/gcc/tree-ssa-threadupdate.c
@@ -2410,13 +2410,14 @@ back_jt_path_registry::duplicate_thread_path (edge entry,
  missuses of the functions.  I.e. if you ask to copy something weird,
  it will work, but the state of structures probably will not be
  correct.  */
-  for (i = 0; i < n_region; i++)
-{
-  /* We do not handle subloops, i.e. all the blocks must belong to the
-same loop.  */
-  if (region[i]->loop_father != loop)
-   return false;
-}
+  if (!(cfun->curr_properties & PROP_loop_opts_done))
+for (i = 0; i < n_region; i++)
+  {
+   /* We do not handle subloops, i.e. all the blocks must belong to the
+  same loop.  */
+   if (region[i]->loop_father != loop)
+ return false;
+  }
 
   initialize_original_copy_tables ();
 
@@ -2651,9 +2652,11 @@ back_jt_path_registry::update_cfg (bool /*peel_loop_headers*/)
  visited_starting_edges.add (entry);
  retval = true;
  m_num_threaded_edges++;
+ path->release ();
}
+  else
+   cancel_thread (path, "Failure in duplicate_thread_path");
 
-  path->release ();
   m_paths.unordered_remove (0);
   free (region);
 }
-- 
2.31.1

Re: [PATCH] libcpp, v2: Fix up #__VA_OPT__ handling [PR103415]


On 11/30/21 09:13, Jakub Jelinek wrote:

On Mon, Nov 29, 2021 at 07:28:10PM -0500, Jason Merrill wrote:

Please add some of this explanation to the "paste any tokens" comment in the
code.


Ok.


+ while (rhs->flags & PASTE_LEFT);
+ if ((flags & PREV_WHITE)
+ && (token->flags & PREV_WHITE) == 0)
+   const_cast<cpp_token *>(token)->flags
+ |= PREV_WHITE;


Hmm, shouldn't paste_tokens handle copying PREV_WHITE?


Copying there PREV_FALLTHROUGH fixes the new Wimplicit-fallthrough-38.c
testcase, I couldn't find where doing the copying of PREV_WHITE would
make an observable difference outside of __VA_OPT__, e.g.
#define F(x) #x
#define G(x) F(x)
#define H G({a##b)
#define I G({ a##b)
const char *h = H;
const char *i = I;
results in "{ab" and "{ ab" before/after the patch.  But copying it
in paste_tokens looks cleaner...
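
For anyone who wants to try the four macros above in isolation, here is a
small self-contained sketch.  The macro definitions and the two variables are
taken from the message; the surrounding scaffolding (the include and the
#undef lines) is only there so the fragment compiles on its own.  The strings
quoted in the message are what GCC is reported to produce; other
preprocessors may treat the whitespace differently.

```cpp
#include <cstring>

#define F(x) #x
#define G(x) F(x)
#define H G({a##b)
#define I G({ a##b)

// H pastes a and b with no whitespace after the brace; I has a space between
// the brace and the pasted token, so stringification sees that whitespace.
const char *h = H;
const char *i = I;

// Keep the single-letter macros from leaking into later code.
#undef F
#undef G
#undef H
#undef I
```

Printing h and i (or comparing them with std::strcmp) shows whether the
whitespace before the pasted token survives stringification.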


OK, thanks.


2021-11-30  Jakub Jelinek  

PR preprocessor/103415
libcpp/
* macro.c (stringify_arg): Remove va_opt argument and va_opt handling.
(paste_tokens): On successful paste or in PREV_WHITE and
PREV_FALLTHROUGH flags from the *plhs token to the new token.
(replace_args): Adjust stringify_arg callers.  For #__VA_OPT__,
perform token pasting in a separate loop before stringify_arg call.
gcc/testsuite/
* c-c++-common/cpp/va-opt-8.c: New test.
* c-c++-common/Wimplicit-fallthrough-38.c: New test.

--- libcpp/macro.c.jj   2021-11-26 10:09:50.278020239 +0100
+++ libcpp/macro.c  2021-11-30 14:05:25.274132482 +0100
@@ -295,7 +295,7 @@ static cpp_context *next_context (cpp_re
  static const cpp_token *padding_token (cpp_reader *, const cpp_token *);
  static const cpp_token *new_string_token (cpp_reader *, uchar *, unsigned int);
  static const cpp_token *stringify_arg (cpp_reader *, const cpp_token **,
-  unsigned int, bool);
+  unsigned int);
  static void paste_all_tokens (cpp_reader *, const cpp_token *);
  static bool paste_tokens (cpp_reader *, location_t,
  const cpp_token **, const cpp_token *);
@@ -834,8 +834,7 @@ cpp_quote_string (uchar *dest, const uch
  /* Convert a token sequence FIRST to FIRST+COUNT-1 to a single string token
 according to the rules of the ISO C #-operator.  */
  static const cpp_token *
-stringify_arg (cpp_reader *pfile, const cpp_token **first, unsigned int count,
-  bool va_opt)
+stringify_arg (cpp_reader *pfile, const cpp_token **first, unsigned int count)
  {
unsigned char *dest;
unsigned int i, escape_it, backslash_count = 0;
@@ -852,24 +851,6 @@ stringify_arg (cpp_reader *pfile, const
  {
const cpp_token *token = first[i];
  
-  if (va_opt && (token->flags & PASTE_LEFT))
-   {
- location_t virt_loc = pfile->invocation_location;
- const cpp_token *rhs;
- do
-   {
- if (i == count)
-   abort ();
- rhs = first[++i];
- if (!paste_tokens (pfile, virt_loc, &token, rhs))
-   {
- --i;
- break;
-   }
-   }
- while (rhs->flags & PASTE_LEFT);
-   }
-
if (token->type == CPP_PADDING)
{
  if (source == NULL
@@ -1003,6 +984,7 @@ paste_tokens (cpp_reader *pfile, locatio
return false;
  }
  
+  lhs->flags |= (*plhs)->flags & (PREV_WHITE | PREV_FALLTHROUGH);
*plhs = lhs;
_cpp_pop_buffer (pfile);
return true;
@@ -1945,8 +1927,7 @@ replace_args (cpp_reader *pfile, cpp_has
if (src->flags & STRINGIFY_ARG)
  {
if (!arg->stringified)
- arg->stringified = stringify_arg (pfile, arg->first, arg->count,
-   false);
+ arg->stringified = stringify_arg (pfile, arg->first, arg->count);
  }
else if ((src->flags & PASTE_LEFT)
 || (src != macro->exp.tokens && (src[-1].flags & PASTE_LEFT)))
@@ -2066,11 +2047,46 @@ replace_args (cpp_reader *pfile, cpp_has
{
  unsigned int count
= start ? paste_flag - start : tokens_buff_count (buff);
- const cpp_token *t
-   = stringify_arg (pfile,
-start ? start + 1
-: (const cpp_token **) (buff->base),
-count, true);
+ const cpp_token **first
+   = start ? start + 1
+   : (const cpp_token **) (buff->base);
+ unsigned int i, j;
+
+ /* Paste any tokens that need to be pasted before calling
+stringify_arg, because stringify_arg uses pfile->u_buff
+which paste_tokens can use as well.  */
+

Re: [PATCH]AArch64 Optimize right shift rounding narrowing

2021-11-30 Thread Richard Sandiford via Gcc-patches

Tamar Christina  writes:
> Hi All,
>
> This optimizes right shift rounding narrow instructions to
> rounding add narrow high where one vector is 0 when the shift amount is half
> that of the original input type.
>
> i.e.
>
> uint32x4_t foo (uint64x2_t a, uint64x2_t b)
> {
>   return vrshrn_high_n_u64 (vrshrn_n_u64 (a, 32), b, 32);
> }
>
> now generates:
>
> foo:
> movi    v3.4s, 0
> raddhn  v0.2s, v2.2d, v3.2d
> raddhn2 v0.4s, v2.2d, v3.2d
>
> instead of:
>
> foo:
> rshrn   v0.2s, v0.2d, 32
> rshrn2  v0.4s, v1.2d, 32
> ret
>
> On Arm cores this is an improvement in both latency and throughput.
> Because a vector zero is needed I created a new method
> aarch64_gen_shareable_zero that creates zeros using V4SI and then takes a
> subreg of the zero to the desired type.  This allows CSE to share all the
> zero constants.
>
> Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.
>
> Ok for master?
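
To see why a rounding narrowing shift by half the element width can be
replaced by a rounding add-narrow-high with zero, here is a scalar sketch of
one 64-bit lane.  The function names model_rshrn and model_raddhn are
hypothetical, and the models are my reading of the instruction semantics
rather than anything taken from the patch.

```cpp
#include <cassert>
#include <cstdint>

// Scalar model of one lane of RSHRN on a 64-bit element: rounding shift
// right by n (1 <= n <= 32), then keep the low 32 bits.
std::uint32_t model_rshrn (std::uint64_t a, unsigned n)
{
  assert (n >= 1 && n <= 32);
  // (a + 2^(n-1)) >> n, written so that the addition cannot overflow.
  std::uint64_t rounded = (a >> n) + ((a >> (n - 1)) & 1);
  return static_cast<std::uint32_t> (rounded);
}

// Scalar model of one lane of RADDHN: add the operands plus the rounding
// constant 2^31, then keep the high 32 bits of the 64-bit sum.
std::uint32_t model_raddhn (std::uint64_t a, std::uint64_t b)
{
  std::uint64_t sum = a + b + (std::uint64_t {1} << 31); // wraps modulo 2^64
  return static_cast<std::uint32_t> (sum >> 32);
}
```

For n == 32 and b == 0 both models compute the high half of a + 2^31, so
model_rshrn (a, 32) and model_raddhn (a, 0) agree for every a, including
values where the rounding carry wraps the result to zero.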

LGTM.  Just a couple of nits:

>
> Thanks,
> Tamar
>
> gcc/ChangeLog:
>
>   * config/aarch64/aarch64-protos.h (aarch64_gen_shareable_zero): New.
>   * config/aarch64/aarch64-simd.md (aarch64_rshrn,
>   aarch64_rshrn2): 

Missing description.

>   * config/aarch64/aarch64.c (aarch64_gen_shareable_zero): New.
>
> gcc/testsuite/ChangeLog:
>
>   * gcc.target/aarch64/advsimd-intrinsics/shrn-1.c: New test.
>   * gcc.target/aarch64/advsimd-intrinsics/shrn-2.c: New test.
>   * gcc.target/aarch64/advsimd-intrinsics/shrn-3.c: New test.
>   * gcc.target/aarch64/advsimd-intrinsics/shrn-4.c: New test.
>
> --- inline copy of patch -- 
> diff --git a/gcc/config/aarch64/aarch64-protos.h b/gcc/config/aarch64/aarch64-protos.h
> index f7887d06139f01c1591c4e755538d94e5e608a52..f7f5cae82bc9198e54d0298f25f7c0f5902d5fb1 100644
> --- a/gcc/config/aarch64/aarch64-protos.h
> +++ b/gcc/config/aarch64/aarch64-protos.h
> @@ -846,6 +846,7 @@ const char *aarch64_output_move_struct (rtx *operands);
>  rtx aarch64_return_addr_rtx (void);
>  rtx aarch64_return_addr (int, rtx);
>  rtx aarch64_simd_gen_const_vector_dup (machine_mode, HOST_WIDE_INT);
> +rtx aarch64_gen_shareable_zero (machine_mode);
>  bool aarch64_simd_mem_operand_p (rtx);
>  bool aarch64_sve_ld1r_operand_p (rtx);
>  bool aarch64_sve_ld1rq_operand_p (rtx);
> diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md
> index c71658e2bf52b26bf9fc9fa702dd5446447f4d43..d7f8694add540e32628893a7b7471c08de6f760f 100644
> --- a/gcc/config/aarch64/aarch64-simd.md
> +++ b/gcc/config/aarch64/aarch64-simd.md
> @@ -1956,20 +1956,32 @@ (define_expand "aarch64_rshrn"
> (match_operand:SI 2 "aarch64_simd_shift_imm_offset_")]
>"TARGET_SIMD"
>{
> -operands[2] = aarch64_simd_gen_const_vector_dup (mode,
> -  INTVAL (operands[2]));
> -rtx tmp = gen_reg_rtx (mode);
> -if (BYTES_BIG_ENDIAN)
> -  emit_insn (gen_aarch64_rshrn_insn_be (tmp, operands[1],
> - operands[2], CONST0_RTX (mode)));
> +if (INTVAL (operands[2]) == GET_MODE_UNIT_BITSIZE (mode))
> +  {
> + rtx tmp0 = aarch64_gen_shareable_zero (mode);
> + emit_insn (gen_aarch64_raddhn (operands[0], operands[1], tmp0));
> +  }
>  else
> -  emit_insn (gen_aarch64_rshrn_insn_le (tmp, operands[1],
> - operands[2], CONST0_RTX (mode)));
> -
> -/* The intrinsic expects a narrow result, so emit a subreg that will get
> -   optimized away as appropriate.  */
> -emit_move_insn (operands[0], lowpart_subreg (mode, tmp,
> -  mode));
> +  {
> + rtx tmp = gen_reg_rtx (mode);
> + operands[2] = aarch64_simd_gen_const_vector_dup (mode,
> +  INTVAL (operands[2]));
> + if (BYTES_BIG_ENDIAN)
> +   emit_insn (
> + gen_aarch64_rshrn_insn_be (tmp, operands[1],
> +  operands[2],
> +  CONST0_RTX (mode)));
> + else
> +   emit_insn (
> + gen_aarch64_rshrn_insn_le (tmp, operands[1],
> +  operands[2],
> +  CONST0_RTX (mode)));
> +
> + /* The intrinsic expects a narrow result, so emit a subreg that will
> +get optimized away as appropriate.  */
> + emit_move_insn (operands[0], lowpart_subreg (mode, tmp,
> +  mode));
> +  }
>  DONE;
>}
>  )
> @@ -2049,14 +2061,27 @@ (define_expand "aarch64_rshrn2"
> (match_operand:SI 3 "aarch64_simd_shift_imm_offset_")]
>"TARGET_SIMD"
>{
> -operands[3] = aarch64_simd_gen_const_vector_dup (mode,
> -  INTVAL (operands[3]));
> -if (BYTES_BIG_ENDIAN)
> -  emit_insn (gen_aarch64_rshrn2_insn_be (operands

Re: [PATCH 2/5]AArch64 sve: combine nested if predicates

2021-11-30 Thread Richard Sandiford via Gcc-patches

Tamar Christina  writes:
> Bootstrapped and regtested on aarch64-none-linux-gnu and x86_64-linux-gnu
> with no issues.
>
> gcc/ChangeLog:
>
>   * tree-vect-stmts.c (prepare_load_store_mask): Rename to...
>   (prepare_vec_mask): ...This and record operations that have already been
>   masked.
>   (vectorizable_call): Use it.
>   (vectorizable_operation): Likewise.
>   (vectorizable_store): Likewise.
>   (vectorizable_load): Likewise.
>   * tree-vectorizer.c (vec_cond_masked_key::get_cond_ops_from_tree): New.
>   * tree-vectorizer.h (struct vec_cond_masked_key): New.
>   (class _loop_vec_info): Add vec_cond_masked_set.
>   (vec_cond_masked_set_type): New.
>   (struct default_hash_traits): New.
>
>
> gcc/testsuite/ChangeLog:
>
>   * gcc.target/aarch64/sve/pred-combine-and.c: New test.
>
> --- inline copy of patch ---
>
> diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pred-combine-and.c b/gcc/testsuite/gcc.target/aarch64/sve/pred-combine-and.c
> new file mode 100644
> index ..ee927346abe518caa3cba397b11dfd1ee7e93630
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/aarch64/sve/pred-combine-and.c
> @@ -0,0 +1,18 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O3" } */
> +
> +void f5(float * restrict z0, float * restrict z1, float *restrict x, float * restrict y, float c, int n)
> +{
> +for (int i = 0; i < n; i++) {
> +float a = x[i];
> +float b = y[i];
> +if (a > b) {
> +z0[i] = a + b;
> +if (a > c) {
> +z1[i] = a - b;
> +}
> +}
> +}
> +}
> +
> +/* { dg-final { scan-assembler-times {\tfcmgt\tp[0-9]+\.s, p[0-9]+/z, z[0-9]+\.s, z[0-9]+\.s} 2 } } */
> diff --git a/gcc/tree-vect-stmts.c b/gcc/tree-vect-stmts.c
> index 2284ad069e4d521f4e0bd43d34181a258cd636ef..b1946b589043312a9b29d832f9b8398e24787a5f 100644
> --- a/gcc/tree-vect-stmts.c
> +++ b/gcc/tree-vect-stmts.c
> @@ -1796,23 +1796,30 @@ check_load_store_for_partial_vectors (loop_vec_info loop_vinfo, tree vectype,
>  /* Return the mask input to a masked load or store.  VEC_MASK is the vectorized
> form of the scalar mask condition and LOOP_MASK, if nonnull, is the mask
> that needs to be applied to all loads and stores in a vectorized loop.
> -   Return VEC_MASK if LOOP_MASK is null, otherwise return VEC_MASK & LOOP_MASK.
> +   Return VEC_MASK if LOOP_MASK is null or if VEC_MASK is already masked,
> +   otherwise return VEC_MASK & LOOP_MASK.
>  
> MASK_TYPE is the type of both masks.  If new statements are needed,
> insert them before GSI.  */
>  
>  static tree
> -prepare_load_store_mask (tree mask_type, tree loop_mask, tree vec_mask,
> -  gimple_stmt_iterator *gsi)
> +prepare_vec_mask (tree mask_type, loop_vec_info loop_vinfo, tree loop_mask,
> +   tree vec_mask, gimple_stmt_iterator *gsi)

Minor, but: loop_vinfo normally comes first when present.

>  {
>gcc_assert (useless_type_conversion_p (mask_type, TREE_TYPE (vec_mask)));
>if (!loop_mask)
>  return vec_mask;
>  
>gcc_assert (TREE_TYPE (loop_mask) == mask_type);
> +
> +  vec_cond_masked_key cond (vec_mask, loop_mask);
> +  if (loop_vinfo->vec_cond_masked_set.contains (cond))
> +return vec_mask;
> +
>tree and_res = make_temp_ssa_name (mask_type, NULL, "vec_mask_and");
>gimple *and_stmt = gimple_build_assign (and_res, BIT_AND_EXPR,
> vec_mask, loop_mask);
> +
>gsi_insert_before (gsi, and_stmt, GSI_SAME_STMT);
>return and_res;
>  }
> @@ -3526,8 +3533,9 @@ vectorizable_call (vec_info *vinfo,
> gcc_assert (ncopies == 1);
> tree mask = vect_get_loop_mask (gsi, masks, vec_num,
> vectype_out, i);
> -   vargs[mask_opno] = prepare_load_store_mask
> - (TREE_TYPE (mask), mask, vargs[mask_opno], gsi);
> +   vargs[mask_opno] = prepare_vec_mask
> + (TREE_TYPE (mask), loop_vinfo, mask,
> +  vargs[mask_opno], gsi);
>   }
>  
> gcall *call;
> @@ -3564,8 +3572,8 @@ vectorizable_call (vec_info *vinfo,
> tree mask = vect_get_loop_mask (gsi, masks, ncopies,
> vectype_out, j);
> vargs[mask_opno]
> - = prepare_load_store_mask (TREE_TYPE (mask), mask,
> -vargs[mask_opno], gsi);
> + = prepare_vec_mask (TREE_TYPE (mask), loop_vinfo, mask,
> + vargs[mask_opno], gsi);
>   }
>  
> gimple *new_stmt;
> @@ -6302,10 +6310,46 @@ vectorizable_operation (vec_info *vinfo,
>   }
>else
>   {
> +   tree mask = NULL_TREE;
> +   /* When combining two masks check is eithe

Re: [PATCH] OpenMP: Ensure that offloaded variables are public

On Tue, Nov 16, 2021 at 11:49:18AM +0000, Andrew Stubbs wrote:
> This patch is needed for AMD GCN offloading when we use the assembler from
> LLVM 13+.
> 
> The GCN runtime (libgomp+ROCm) requires that the location of all variables
> in the offloaded variables table are discoverable at runtime (using the
> "hsa_executable_symbol_get_info" API), and this only works when the symbols
> are exported from the binary. Previously we solved this by having mkoffload
> insert ".global" directives into the assembler text, but newer LLVM
> assemblers emit an error if we do this when the variable was previously
> declared ".local" (which happens when a variable is zero-initialized and
> placed in the BSS).
> 
> Since we can no longer easily fix them up after the fact, this patch fixes
> them up during OMP lowering.

I'm confused, how can that ever work reliably?
The !TREE_PUBLIC offload_vars can be static locals or static globals
or static anon namespace vars, but their names can very easily clash with
either static or non-static variables from other TUs.
Consider in one TU

static int a = 5;
static int baz (void) { static int b;
#pragma omp declare target to (b)
return ++b; }
int foo (void) { return ++a + baz (); }
#pragma omp declare target to (a, foo)

and

static int a = 5;
static int baz (void) { static int b;
#pragma omp declare target to (b)
return ++b; }
int bar (void) { return ++a + baz (); }
#pragma omp declare target to (a, bar)

int
main ()
{
  int v;
  #pragma omp target (from: v)
  v = foo () + bar ();
}

in another one.  This has
.quad   a
.quad   4
.quad   b.0
.quad   4
in .offload_var_table.  I'd guess this must fail to link or load
with GCN if it makes them forcibly TREE_PUBLIC.

Why does the GCN plugin or runtime need to know those vars?
It needs to know the single array that contains their addresses of course...

Jakub

[PATCH] tree-optimization/98956 Optimizing out boolean left shift

2021-11-30 Thread Navid Rahimi via Gcc-patches

Hi GCC community,

This patch adds the missed pattern described in bug 98956 [1] to match.pd.
The codegen and correctness proof for this pattern are here [2,3] in case
anyone is curious. Tested on x86_64 Linux.

Tree-optimization/98956:

Adding new optimization to match.pd:
* match.pd ((B0 << x) cmp 0) -> B0 cmp 0 : New optimization.
* gcc.dg/tree-ssa/pr98956.c: testcase for this optimization.
* gcc.dg/tree-ssa/pr98956-2.c: testcase for node with side-effect.

1) https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98956
2) https://compiler-explorer.com/z/nj4PTrecW
3) https://alive2.llvm.org/ce/z/jyJAoS
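
As a source-level illustration of the pattern (a sketch only, not the
match.pd rule itself), the two hypothetical functions below show the
comparison before and after the simplification for the "!= 0" case.  The
argument is that a boolean is either 0 or 1, and shifting 1 left by any
in-range amount cannot produce 0, so the shift can be dropped.

```cpp
// What the source computes: shift a boolean left, then compare with zero.
bool shifted_is_nonzero (bool b, int x)
{
  return (b << x) != 0;
}

// What the simplification reduces it to: the shift amount no longer matters.
bool folded_is_nonzero (bool b, int /* x */)
{
  return b;
}

// Check that both forms agree for every shift amount that is clearly in
// range for a 32-bit int (0..30 avoids shifting into the sign bit).
bool transforms_agree ()
{
  for (int x = 0; x <= 30; ++x)
    for (int v = 0; v <= 1; ++v)
      {
        bool b = v != 0;
        if (shifted_is_nonzero (b, x) != folded_is_nonzero (b, x))
          return false;
      }
  return true;
}
```

Out-of-range shift amounts are not considered here, since the shift is
undefined for them and the simplification is free to ignore those cases.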

Best wishes,
Navid.

0001-Tree-optimization-98956.patch
Description: 0001-Tree-optimization-98956.patch

Re: [PATCH 1/7] ifcvt: Check if cmovs are needed.

2021-11-30 Thread Richard Sandiford via Gcc-patches

BTW, in response to your earlier concern about stage 3: you posted the
series well in time for end of stage 1, so I think it can still go in
during stage 3.

Robin Dapp  writes:
> Hi Richard,
>
>> It's hard to judge this in isolation because it's not clear when
>> and how the new arguments are going to be used, but it seems OK
>> in principle.  Do you still want:
>> 
>>   /* If earliest == jump, try to build the cmove insn directly.
>>  This is helpful when combine has created some complex condition
>>  (like for alpha's cmovlbs) that we can't hope to regenerate
>>  through the normal interface.  */
>> 
>>   if (if_info->cond_earliest == if_info->jump)
>> {
>> 
>> to be used when cc_cmp and rev_cc_cmp are nonnull?
>
> My initial hunch was to just leave it in place as I did not manage to
> trigger it.  As it is going to be called and costed both ways (with
> cc_cmp, rev_cc_cmp and without) it is probably better to move it into
> the else branch.
>
> The single usage of this is in patch 5/7.  We are passing the already
> existing condition from the jump and its reverse to see if the backend
> can come up with something better than when creating a new comparison.
>
>>> +static rtx emit_conditional_move (rtx, rtx, rtx, rtx, machine_mode);
>>> +rtx emit_conditional_move (rtx, rtx, rtx, rtx, rtx, machine_mode);
>> 
>> This is redundant with the header file declaration.
>> 
>
> Removed it.
>
>> I think it'd be better to call one of these functions something else,
>> rather than make the interpretation of the third parameter depend on
>> the total number of parameters.  In the second overload, the comparison
>> rtx effectively replaces four parameters of the existing
>> emit_conditional_move, so perhaps that's the one that should remain
>> emit_conditional_move.  Maybe the first one should be called
>> emit_conditional_move_with_rev or something.
>
> Not entirely fond of calling the first one _with_rev because essentially
> both try normal and reversed variants but I agree that the naming is not
> ideal.  I don't have any great ideas how to properly untangle it so I
> would go with your suggestions in order to move forward.  As there is
> only one caller of the second function, we could also let the caller
> handle the reversing.  Then, the third function would need to be
> non-static, though.
>
> The third, static emit_conditional_move I already renamed locally to
> emit_conditional_move_1.

Thanks, renaming the third function helps.

>> Part of me wonders if this would be simpler if we created a structure
>> to describe a comparison and passed that around instead of individual
>> fields, but I guess it could become a rat hole.
>
> I also thought about this as it would allow us to use either
> representation as required by the usage site.  Even tried it in a branch
> locally but indeed it became ugly quickly so I postponed it for now.

Still, perhaps we could at least add (in rtl.h):

struct rtx_comparison {
  rtx_code code;
  machine_mode op_mode;
  rtx op0, op1;
};

and make the existing emit_conditional_moves use it instead of four
separate parameters.  These rtx arguments would then be replacing those
rtx_comparison arguments, which would avoid the ambiguity in the overloads.

With C++ it should be possible to rewrite the calls using { … }, e.g.:

  if (!emit_conditional_move (into_target, { cmp_code, op1_mode, cmp1, cmp2 },
  into_target, into_superword, word_mode, false))

so the new type wouldn't need to spread too far.
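
To make the braced-call idea concrete, here is a standalone sketch.  All the
demo_ names are hypothetical stand-ins so that it compiles outside GCC; only
the shape of the aggregate (code, mode, two operands) follows the struct
proposed above, and the callee body is invented purely to give the call
something to return.

```cpp
// Stand-ins for GCC's types, only so the sketch compiles on its own.
enum demo_rtx_code { DEMO_EQ, DEMO_LT };
enum demo_machine_mode { DEMO_SImode, DEMO_DImode };
typedef int demo_rtx;

// Same shape as the proposed rtx_comparison, with stand-in member types.
struct demo_rtx_comparison
{
  demo_rtx_code code;
  demo_machine_mode op_mode;
  demo_rtx op0, op1;
};

// Hypothetical callee: takes the comparison as one aggregate instead of four
// separate parameters, and picks one of two values based on it.
demo_rtx demo_emit_conditional_move (demo_rtx target, demo_rtx_comparison cmp,
                                     demo_rtx if_true, demo_rtx if_false)
{
  (void) target;
  bool taken = cmp.code == DEMO_EQ ? cmp.op0 == cmp.op1 : cmp.op0 < cmp.op1;
  return taken ? if_true : if_false;
}

demo_rtx demo_caller ()
{
  // The braced list builds the demo_rtx_comparison in place, so the caller
  // never has to name the new type.
  return demo_emit_conditional_move (0, { DEMO_LT, DEMO_SImode, 1, 2 }, 10, 20);
}
```

Because the aggregate occupies a single parameter slot, an overload that
takes plain rtx arguments in that position stays distinguishable from it.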

Does that sound OK?  If so, could you post the current version of the full
patch series and say which bits still need review?

Thanks,
Richard

[PATCH] Fix alignment of stack slots for overaligned types [PR103500]

2021-11-30 Thread Alex Coplan via Gcc-patches

Hi,

This fixes PR103500 i.e. ensuring that stack slots for
passed-by-reference overaligned types are appropriately aligned. For the
testcase:

typedef struct __attribute__((aligned(32))) {
  long x,y;
} S;
S x;
void f(S);
void g(void) { f(x); }

on AArch64, we currently generate (at -O2):

g:
adrp    x1, .LANCHOR0
add x1, x1, :lo12:.LANCHOR0
stp x29, x30, [sp, -48]!
mov x29, sp
ldp q0, q1, [x1]
add x0, sp, 16
stp q0, q1, [sp, 16]
bl  f
ldp x29, x30, [sp], 48
ret

so the stack slot for the passed-by-reference copy of the structure is
at sp + 16, and the sp is only guaranteed to be 16-byte aligned, so the
structure is only 16-byte aligned. The PCS requires the structure to be
32-byte aligned. After this patch, we generate:

g:
adrp    x1, .LANCHOR0
add x1, x1, :lo12:.LANCHOR0
stp x29, x30, [sp, -64]!
mov x29, sp
add x0, sp, 47
ldp q0, q1, [x1]
and x0, x0, -32
stp q0, q1, [x0]
bl  f
ldp x29, x30, [sp], 64
ret

i.e. we ensure 32-byte alignment for the struct.

The approach taken here is similar to that in
function.c:assign_parm_setup_block where it handles the case for
DECL_ALIGN (parm) > MAX_SUPPORTED_STACK_ALIGNMENT. This in turn is
similar to the approach taken in cfgexpand.c:expand_stack_vars (where
the function calls get_dynamic_stack_size) which is the code that
handles the alignment for overaligned structures as addressable local
variables (see the related case discussed in the PR).
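
The over-allocate-then-align idea can be sketched in isolation as follows.
This is only an illustration of the arithmetic, not the code in function.c:
align_up and padded_size are hypothetical helpers, and the padding formula is
an assumption chosen to be consistent with the frame sizes in the assembly
above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Round ADDR up to the next multiple of ALIGN, which must be a power of two.
// This is the "add 31, then and with -32" step, generalised.
std::uintptr_t align_up (std::uintptr_t addr, std::uintptr_t align)
{
  assert (align != 0 && (align & (align - 1)) == 0);
  return (addr + align - 1) & ~(align - 1);
}

// Bytes to reserve so that SIZE bytes aligned to ALIGN always fit inside a
// buffer whose start is only guaranteed to be aligned to BASE_ALIGN.
std::size_t padded_size (std::size_t size, std::size_t align,
                         std::size_t base_align)
{
  return align > base_align ? size + (align - base_align) : size;
}
```

For the testcase above (a 32-byte struct, 32-byte alignment, 16-byte aligned
sp), padded_size (32, 32, 16) is 48, which matches the frame growing from 48
to 64 bytes, and align_up applied to sp + 16 gives the address passed in x0.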

This patch also updates the aapcs64 test mentioned in the PR to avoid
the frontend folding away the alignment check. I've confirmed that the
execution test actually fails on aarch64-linux-gnu prior to the patch
being applied and passes afterwards.

Bootstrapped and regtested on aarch64-linux-gnu, x86_64-linux-gnu, and
arm-linux-gnueabihf: no regressions.

I'd appreciate any feedback. Is it OK for trunk?

Thanks,
Alex

gcc/ChangeLog:

PR middle-end/103500
* function.c (get_stack_local_alignment): Align BLKmode overaligned
types to the alignment required by the type.
(assign_stack_temp_for_type): Handle BLKmode overaligned stack
slots by allocating a larger-than-necessary buffer and aligning
the address within appropriately.

gcc/testsuite/ChangeLog:

PR middle-end/103500
* gcc.target/aarch64/aapcs64/rec_align-8.c (test_pass_by_ref):
Prevent the frontend from folding our alignment check away by
using snprintf to store the pointer into a string and recovering
it with sscanf.
diff --git a/gcc/function.c b/gcc/function.c
index 61b3bd036b8..5ed722ab959 100644
--- a/gcc/function.c
+++ b/gcc/function.c
@@ -278,7 +278,9 @@ get_stack_local_alignment (tree type, machine_mode mode)
   unsigned int alignment;
 
   if (mode == BLKmode)
-alignment = BIGGEST_ALIGNMENT;
+alignment = (type && TYPE_ALIGN (type) > MAX_SUPPORTED_STACK_ALIGNMENT)
+  ? TYPE_ALIGN (type)
+  : BIGGEST_ALIGNMENT;
   else
 alignment = GET_MODE_ALIGNMENT (mode);
 
@@ -872,21 +874,35 @@ assign_stack_temp_for_type (machine_mode mode, poly_int64 size, tree type)
 
   p = ggc_alloc ();
 
-  /* We are passing an explicit alignment request to assign_stack_local.
-One side effect of that is assign_stack_local will not round SIZE
-to ensure the frame offset remains suitably aligned.
-
-So for requests which depended on the rounding of SIZE, we go ahead
-and round it now.  We also make sure ALIGNMENT is at least
-BIGGEST_ALIGNMENT.  */
-  gcc_assert (mode != BLKmode || align == BIGGEST_ALIGNMENT);
-  p->slot = assign_stack_local_1 (mode,
- (mode == BLKmode
-  ? aligned_upper_bound (size,
- (int) align
- / BITS_PER_UNIT)
-  : size),
- align, 0);
+  if (mode == BLKmode && align > MAX_SUPPORTED_STACK_ALIGNMENT)
+   {
+ rtx allocsize = gen_int_mode (size, Pmode);
+ get_dynamic_stack_size (&allocsize, 0, align, NULL);
+ gcc_assert (CONST_INT_P (allocsize));
+ size = UINTVAL (allocsize);
+ p->slot = assign_stack_local_1 (mode,
+ size,
+ BIGGEST_ALIGNMENT, 0);
+ rtx addr = align_dynamic_address (XEXP (p->slot, 0), align);
+ mark_reg_pointer (addr, align);
+ p->slot = gen_rtx_MEM (GET_MODE (p->slot), addr);
+ MEM_NOTRAP_P (p->slot) = 1;
+   }
+  else
+   /* We are passing an explicit alignment request to assign_stack_local.
+  One side effect of that is a

Re: [PATCH] OpenMP: Ensure that offloaded variables are public

On Tue, Nov 30, 2021 at 05:24:49PM +0100, Jakub Jelinek via Gcc-patches wrote:
> Consider in one TU
> 
> static int a = 5;
> static int baz (void) { static int b;
> #pragma omp declare target to (b)
> return ++b; }
> int foo (void) { return ++a + baz (); }
> #pragma omp declare target to (a, foo)
> 
> and
> 
> static int a = 5;
> static int baz (void) { static int b;
> #pragma omp declare target to (b)
> return ++b; }
> int bar (void) { return ++a + baz (); }
> #pragma omp declare target to (a, bar)
> 
> int
> main ()
> {
>   int v;
>   #pragma omp target (from: v)
>   v = foo () + bar ();
> }
> 
> in another one.  This has
>   .quad   a
>   .quad   4
>   .quad   b.0
>   .quad   4
> in .offload_var_table.  I'd guess this must fail to link or load
> with GCN if it makes them forcibly TREE_PUBLIC.
> 
> Why does the GCN plugin or runtime need to know those vars?
> It needs to know the single array that contains their addresses of course...

Actually, you've done it in ACCEL_COMPILER only, so
I assume linking the above two sources with -fopenmp into a single
binary or shared library will still work because LTO when reading
the byte-code in will remangle the names of those variables to something
where they are unique in that single *.s (or *.ptx) it emits.
But, if you put one of those TUs into a shared library and the other
into another shared library, I don't see how it can work anymore,
because both those ELF objects which will be in data sections of those
libraries might have clashing names.

If GCN can't support static variables (but isn't it ELF?) and there is no
other way than sacrifice offloading from multiple shared libraries or binary
in the same process, it at least shouldn't be done for targets which don't
need it (e.g. PTX) and shouldn't be done in the pass you've done it in
(because that means it will walk all the vars for each function it
processes, rather than just once).  So, better place would be e.g.
offload_handle_link_vars in lto/*.c or so.

Jakub

[PING, PATCH] doc, d: Add note that D front end now requires GDC installed in order to bootstrap.

Ping.

Excerpts from Iain Buclaw's message of November 18, 2021 2:06 am:
> Hi,
> 
> As asked for, this adds the documentation note in install.texi about the
> upcoming bootstrap requirements.
> 
> Obviously this will be applied alongside the patch posted previously:
> 
> https://gcc.gnu.org/pipermail/gcc-patches/2021-October/582917.html
> 
> Final batch of testing before proceeding has taken a bit longer than I
> expected.  Currently bootstrapping on sparcv9-sun-solaris2.11, and will
> push forward once I have confirmed that it works as well as the current
> C++ implementation of the D front end.
> 
> OK for mainline?  Any improvements on wording?
> 
> Thanks,
> Iain.
> 
> ---
> gcc/ChangeLog:
> 
>   * doc/install.texi (Prerequisites): Add note that D front end now
>   requires GDC installed in order to bootstrap.
>   (Building): Add D compiler section, referencing prerequisites.
> ---
>  gcc/doc/install.texi | 28 
>  1 file changed, 28 insertions(+)
> 
> diff --git a/gcc/doc/install.texi b/gcc/doc/install.texi
> index 094469b9a4e..6f999a2fd5a 100644
> --- a/gcc/doc/install.texi
> +++ b/gcc/doc/install.texi
> @@ -289,6 +289,25 @@ Ada runtime libraries. You can check that your build environment is clean
>  by verifying that @samp{gnatls -v} lists only one explicit path in each
>  section.
>  
> +@item @anchor{GDC-prerequisite}GDC
> +
> +In order to build GDC, the D compiler, you need a working GDC
> +compiler (GCC version 9.1 or later), as the D front end is written in D.
> +
> +Versions of GDC prior to 12 can be built with an ISO C++11 compiler, which can
> +then be installed and used to bootstrap newer versions of the D front end.
> +
> +It is strongly recommended to use an older version of GDC to build GDC. More
> +recent versions of GDC than the version built are not guaranteed to work and
> +will often fail during the build with compilation errors relating to
> +deprecations or removed features.
> +
> +Note that @command{configure} does not test whether the GDC installation works
> +and has a sufficiently recent version.  Though the implementation of the D
> +front end does not make use of any GDC-specific extensions, or novel features
> +of the D language, if too old a GDC version is installed and
> +@option{--enable-languages=d} is used, the build will fail.
> +
>  @item A ``working'' POSIX compatible shell, or GNU bash
>  
>  Necessary when running @command{configure} because some
> @@ -2977,6 +2996,15 @@ and network filesystems.
>  @uref{prerequisites.html#GNAT-prerequisite,,GNAT prerequisites}.
>  @end ifhtml
>  
> +@section Building the D compiler
> +
> +@ifnothtml
> +@ref{GDC-prerequisite}.
> +@end ifnothtml
> +@ifhtml
> +@uref{prerequisites.html#GDC-prerequisite,,GDC prerequisites}.
> +@end ifhtml
> +
>  @section Building with profile feedback
>  
>  It is possible to use profile feedback to optimize the compiler itself.  This
> -- 
> 2.30.2
> 
>

[PING, PATCH] darwin, d: Support outfile substitution for libphobos

Ping.

Are the common gcc parts OK (also for backporting)?

Iain.

Excerpts from Iain Buclaw's message of November 26, 2021 1:51 pm:
> Excerpts from Iain Sandoe's message of November 19, 2021 10:21 am:
>> Hi Iain
>> 
>>> On 19 Nov 2021, at 08:32, Iain Buclaw wrote:
>> 
>>> This patch fixes a stage2 bootstrap failure in the D front-end on
>>> darwin due to libgphobos being dynamically linked despite
>>> -static-libphobos being on the command line.
>>> 
>>> In the gdc driver, this takes the previous fix for the Darwin D
>>> bootstrap, and extends it to the -static-libphobos option as well.
>>> Rather than pushing the -static-libphobos option back onto the command
>>> line, the setting of SKIPOPT is instead conditionally removed.  The same
>>> change has been repeated for -static-libstdc++ so there is now no need
>>> to call generate_option to re-add it.
>>> 
>>> In the gcc driver, -static-libphobos has been added as a common option,
>>> validated, and a new outfile substitution added to config/darwin.h to
>>> correctly replace -lgphobos with libgphobos.a.
>>> 
>>> Bootstrapped and regression tested on x86_64-linux-gnu and
>>> x86_64-apple-darwin20.
>>> 
>>> OK for mainline?  This would also be fine for gcc-11 release branch too,
>>> as well as earlier releases with D support.
>> 
>> the Darwin parts are fine, thanks 
>> 
>> The SKIPOPT in d-spec, presumably means “skip removing this opt”?
>> otherwise the #ifndef looks odd (because of the static-libgcc|static-libphobos,
>> darwin.h would do the substitution for -static-libgcc as well, so it’s not a 100%
>> test).
>> 
> 
> I've only just realised what you meant.  Yes you are of course right,
> and it should have been #ifdef, attaching a fixed-up patch.
> 
> Iain.
> 
> ---
> gcc/ChangeLog:
> 
> * common.opt (static-libphobos): Add option.
> * config/darwin.h (LINK_SPEC): Substitute -lgphobos with libgphobos.a
> when linking statically.
> * gcc.c (driver_handle_option): Set -static-libphobos as always valid.
> 
> gcc/d/ChangeLog:
> 
> * d-spec.cc (lang_specific_driver): Set SKIPOPT on -static-libstdc++
> and -static-libphobos only when target supports LD_STATIC_DYNAMIC.
> Remove generate_option to re-add -static-libstdc++.
> 
> libphobos/ChangeLog:
> 
> * testsuite/testsuite_flags.in: Add libphobos library directory as
> search path to --gdcldflags.
> 
> diff --git a/gcc/common.opt b/gcc/common.opt
> index db6010e4e20..73c12d933f3 100644
> --- a/gcc/common.opt
> +++ b/gcc/common.opt
> @@ -3527,6 +3527,10 @@ static-libgfortran
>  Driver
>  ; Documented for Fortran, but always accepted by driver.
>  
> +static-libphobos
> +Driver
> +; Documented for D, but always accepted by driver.
> +
>  static-libstdc++
>  Driver
>  
> diff --git a/gcc/config/darwin.h b/gcc/config/darwin.h
> index 7ed01efa694..c4ddd623e8b 100644
> --- a/gcc/config/darwin.h
> +++ b/gcc/config/darwin.h
> @@ -443,6 +443,7 @@ extern GTY(()) int darwin_ms_struct;
>   %:replace-outfile(-lobjc libobjc-gnu.a%s); \
>  :%:replace-outfile(-lobjc -lobjc-gnu )}}\
> %{static|static-libgcc|static-libgfortran:%:replace-outfile(-lgfortran libgfortran.a%s)}\
> +   %{static|static-libgcc|static-libphobos:%:replace-outfile(-lgphobos libgphobos.a%s)}\
> %{static|static-libgcc|static-libstdc++|static-libgfortran:%:replace-outfile(-lgomp libgomp.a%s)}\
> %{static|static-libgcc|static-libstdc++:%:replace-outfile(-lstdc++ libstdc++.a%s)}\
> %{force_cpusubtype_ALL:-arch %(darwin_arch)} \
> diff --git a/gcc/d/d-spec.cc b/gcc/d/d-spec.cc
> index b12d28f1047..1304126a675 100644
> --- a/gcc/d/d-spec.cc
> +++ b/gcc/d/d-spec.cc
> @@ -253,13 +253,23 @@ lang_specific_driver (cl_decoded_option **in_decoded_options,
>  
>   case OPT_static_libstdc__:
> saw_static_libcxx = true;
> +#ifdef HAVE_LD_STATIC_DYNAMIC
> +   /* Remove -static-libstdc++ from the command only if target supports
> +  LD_STATIC_DYNAMIC.  When not supported, it is left in so that a
> +  back-end target can use outfile substitution.  */
> args[i] |= SKIPOPT;
> +#endif
> break;
>  
>   case OPT_static_libphobos:
> if (phobos_library != PHOBOS_NOLINK)
>   phobos_library = PHOBOS_STATIC;
> +#ifdef HAVE_LD_STATIC_DYNAMIC
> +   /* Remove -static-libphobos from the command only if target supports
> +  LD_STATIC_DYNAMIC.  When not supported, it is left in so that a
> +  back-end target can use outfile substitution.  */
> args[i] |= SKIPOPT;
> +#endif
> break;
>  
>   case OPT_shared_libphobos:
> @@ -460,7 +470,7 @@ lang_specific_driver (cl_decoded_option **in_decoded_options,
>  #endif
>  }
>  
> -  if (saw_libcxx || need_stdcxx)
> +  if (saw_libcxx || saw_static_libcxx || need_stdcxx)
>  {
>  #ifdef HAVE_LD_STATIC_DYNAMIC
>

Re: [PATCH, Fortran] Fix setting of array lower bound for named arrays

2021-11-30 Thread Tobias Burnus


On 29.11.21 22:11, Harald Anlauf wrote:


"A whole array is a named array or a structure component whose final
part-ref is an array component name; no subscript list is appended."

I think in "h(3)" there is not really a named array – thus I read it as
if the "Otherwise ... result value is 1" applies.


If you read on in the standard:

"The appearance of a whole array variable in an executable construct
specifies all the elements of the array ..."

which might make you/makes me think that the sentence before that one
could need an official interpretation...


I am not sure whether I understand what part of the spec you wonder
about. (I mean besides that 'variable' can also mean referencing a
data-pointer-returning function.)

Question: What do NAG/flang/... report for lbound(h(3)) - also [3] – or
[1] as gfortran?


I've submitted a reduced example to the Intel Fortran Forum:
https://community.intel.com/t5/Intel-Fortran-Compiler/Allocate-with-SOURCE-and-bounds/m-p/1339992#M158535


There are good chances that Steve Lionel reads and comments on it.


So far only "FortranFan" has replied – and he comes to the same
conclusion as my reading, albeit without referring to the standard.

Tobias


[committed] vect: Fix ncopies calculation for emulated gather/scatter [PR103494]

2021-11-30 Thread Richard Sandiford via Gcc-patches

I was too eager about removing ncopies calculations in g:10833849b55.
When emulating gather/scatter, the offset ncopies can be different from
the data ncopies.  This patch restores the original calculation.
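
As a rough illustration of why the two counts can differ: the number of vector copies is the vectorization factor divided by the number of lanes in the vector type in question, so a data vector type and an offset vector type with different lane counts give different results. The sketch below is plain C, not GCC code; `num_copies` is a hypothetical stand-in for that calculation, and the numbers used in the note after it are made up rather than taken from the PR.

```c
#include <assert.h>

/* Hypothetical helper (not a GCC function): the number of vector
   statements needed to cover VF scalar iterations when each vector
   holds NUNITS lanes.  VF is assumed to be a multiple of NUNITS.  */
static unsigned
num_copies (unsigned vf, unsigned nunits)
{
  assert (nunits != 0 && vf % nunits == 0);
  return vf / nunits;
}
```

With an assumed vectorization factor of 16, a 4-lane data vector needs `num_copies (16, 4)` copies while an 8-lane offset vector needs `num_copies (16, 8)`, so reusing the data count for the offsets would fetch the wrong number of offset vectors.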

Tested on aarch64-linux-gnu and x86_64-linux-gnu.  Pushed as obvious,
since it's essentially reverting part of my earlier patch (except for
obvious adjustments to keep slp_node).

Richard


gcc/
PR tree-optimization/103494
* tree-vect-stmts.c (vect_get_gather_scatter_ops): Remove ncopies
argument and calculate ncopies from gs_info->offset_vectype
where necessary.
(vectorizable_store, vectorizable_load): Update accordingly.

gcc/testsuite/
PR tree-optimization/103494
* gcc.dg/vect/pr103494.c: New test.
* g++.dg/vect/pr103494.cc: Likewise.
---
 gcc/testsuite/g++.dg/vect/pr103494.cc | 26 ++
 gcc/testsuite/gcc.dg/vect/pr103494.c  | 14 ++
 gcc/tree-vect-stmts.c | 21 -
 3 files changed, 52 insertions(+), 9 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/vect/pr103494.cc
 create mode 100644 gcc/testsuite/gcc.dg/vect/pr103494.c

diff --git a/gcc/testsuite/g++.dg/vect/pr103494.cc b/gcc/testsuite/g++.dg/vect/pr103494.cc
new file mode 100644
index 000..c0b078105c2
--- /dev/null
+++ b/gcc/testsuite/g++.dg/vect/pr103494.cc
@@ -0,0 +1,26 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-O3" } */
+
+void glFinish();
+struct _Vector_base {
+  struct {
+    unsigned _M_start;
+  } _M_impl;
+};
+class vector : _Vector_base {
+public:
+  vector(long) {}
+  unsigned *data() { return &_M_impl._M_start; }
+};
+void *PutBitsIndexedImpl_color_table;
+int PutBitsIndexedImpl_dstRectHeight;
+char *PutBitsIndexedImpl_src_ptr;
+void PutBitsIndexedImpl() {
+  vector unpacked_buf(PutBitsIndexedImpl_dstRectHeight);
+  unsigned *dst_ptr = unpacked_buf.data();
+  for (int x; x; x++) {
+    char i = *PutBitsIndexedImpl_src_ptr++;
+    dst_ptr[x] = static_cast<unsigned *>(PutBitsIndexedImpl_color_table)[i];
+  }
+  glFinish();
+}
diff --git a/gcc/testsuite/gcc.dg/vect/pr103494.c b/gcc/testsuite/gcc.dg/vect/pr103494.c
new file mode 100644
index 000..b544bf2379c
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/pr103494.c
@@ -0,0 +1,14 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-O3" } */
+
+typedef int T1;
+typedef signed char T2;
+
+T1
+f (T1 *d, T2 *x, int n)
+{
+  unsigned char res = 0;
+  for (int i = 0; i < n; ++i)
+    res += d[x[i]];
+  return res;
+}
diff --git a/gcc/tree-vect-stmts.c b/gcc/tree-vect-stmts.c
index 8642acbc0b4..9726450ab2d 100644
--- a/gcc/tree-vect-stmts.c
+++ b/gcc/tree-vect-stmts.c
@@ -2962,8 +2962,7 @@ vect_build_gather_load_calls (vec_info *vinfo, stmt_vec_info stmt_info,
 static void
 vect_get_gather_scatter_ops (loop_vec_info loop_vinfo,
 class loop *loop, stmt_vec_info stmt_info,
-slp_tree slp_node, unsigned int ncopies,
-gather_scatter_info *gs_info,
+slp_tree slp_node, gather_scatter_info *gs_info,
 tree *dataref_ptr, vec<tree> *vec_offset)
 {
   gimple_seq stmts = NULL;
@@ -2978,9 +2977,13 @@ vect_get_gather_scatter_ops (loop_vec_info loop_vinfo,
   if (slp_node)
 vect_get_slp_defs (SLP_TREE_CHILDREN (slp_node)[0], vec_offset);
   else
-vect_get_vec_defs_for_operand (loop_vinfo, stmt_info, ncopies,
-  gs_info->offset, vec_offset,
-  gs_info->offset_vectype);
+{
+  unsigned ncopies
+   = vect_get_num_copies (loop_vinfo, gs_info->offset_vectype);
+  vect_get_vec_defs_for_operand (loop_vinfo, stmt_info, ncopies,
+gs_info->offset, vec_offset,
+gs_info->offset_vectype);
+}
 }
 
 /* Prepare to implement a grouped or strided load or store using
@@ -8149,8 +8152,8 @@ vectorizable_store (vec_info *vinfo,
  else if (STMT_VINFO_GATHER_SCATTER_P (stmt_info))
{
  vect_get_gather_scatter_ops (loop_vinfo, loop, stmt_info,
-  slp_node, ncopies, &gs_info,
-  &dataref_ptr, &vec_offsets);
+  slp_node, &gs_info, &dataref_ptr,
+  &vec_offsets);
  vec_offset = vec_offsets[0];
}
  else
@@ -9454,8 +9457,8 @@ vectorizable_load (vec_info *vinfo,
  else if (STMT_VINFO_GATHER_SCATTER_P (stmt_info))
{
  vect_get_gather_scatter_ops (loop_vinfo, loop, stmt_info,
-  slp_node, ncopies, &gs_info,
-  &dataref_ptr, &vec_offsets);
+  slp_node, &gs_info, &dataref_ptr,
+  &vec_of

Re: [PATCH] middle-end: Skip initialization of opaque type register variables [PR103127]




> On Nov 30, 2021, at 9:14 AM, Peter Bergner  wrote:
> 
> On 11/30/21 2:37 AM, Richard Biener wrote:
>> On Mon, Nov 29, 2021 at 11:56 PM Qing Zhao  wrote:
>> I think that's inconsistent indeed.  Peter, what are "opaque"
>> registers?  rs6000-modes.def suggests
>> that there's __vector_pair and __vector_quad, what's the GIMPLE types
>> for those?  It seems they
>> are either SSA names or expanded to pseudo registers but there's no
>> constants for them.
> 
> The __vector_pair and __vector_quad types are target specific types
> for use with our Matrix-Math-Assist (MMA) unit and they are only
> usable with our associated MMA built-in functions.  What they hold
> is really dependent on which MMA built-ins you use on them.
> You can think of them as a generic (and large) vector type where the
> subtype is undefined...or defined by which built-in function you
> happen to be using.
> 
> We do not have any constants defined for them.  How we initialize them
> is either by loading values from memory into them or by zeroing them
> out using the xxsetaccz instruction (only for __vector_quads).

So it looks like a variable with OPAQUE_TYPE cannot be explicitly
initialized at all, even at the source-code level.

The only way to initialize such a variable at the source-code level (and
only for __vector_quad, not for __vector_pair) is through a call to
__builtin_mma_xxsetaccz, as in:

void
foo (__vector_quad *dst)
{
  __vector_quad acc;
  __builtin_mma_xxsetaccz(&acc);
  *dst = acc;
}

Is this the correct understanding?

Is there a way to initialize such a variable to values other than zero at the
source-code level?

Qing

> 
> 
> 
> 
>> Can they be initialized?  I see they can be copied at least.
> 
> __vector_quads can be zero initialized using the __builtin_mma_xxsetaccz()
> built-in function.  We don't have a method (or use case) for zero initializing
> __vector_pairs.


> 
> 
> 
>> If such "things" cannot be initialized they should indeed be exempt
>> from auto-init.  The
>> documentation suggests that they act as bit-buckets but even bit-buckets
>> should be initializable, thus why exactly does CONST0_RTX not exist for them?
> 
> We used to have CONST0_RTX defined (but nothing else), but we had problems
> with the compiler CSEing the initialization for multiple __vector_quads and
> then copying the values around.  We'd end up with one xxsetaccz instruction
> and copies out of that accumulator register into the other accumulator
> registers.  Copies are VERY expensive, while xxsetaccz's are cheap, so we
> don't want that.  That said, I think a fix I put in to disable fwprop on
> these types may have been the culprit for that problem, so maybe we could
> add the CONST0_RTX back?  I'd have to verify that.  If so, then we'd at least
> be able to support -ftrivial-auto-var-init=zero.  The =pattern version
> would be more problematical...unless the value for pattern was loaded from
> memory.
> 
> Peter
> 
>

Re: [PATCH] middle-end: Skip initialization of opaque type register variables [PR103127]



> On Nov 30, 2021, at 2:37 AM, Richard Biener wrote:
> 
> On Mon, Nov 29, 2021 at 11:56 PM Qing Zhao  wrote:
>> 
>> Peter,
>> 
>> Thanks a lot for the patch.
>> 
>> Richard, what do you think of the patch?
>> 
>> (The major concern for me is:
>> 
>>    With the current patch proposed by Peter, we will generate the call
>> to .DEFERRED_INIT for a variable with OPAQUE_TYPE during the gimplification
>> phase.  However, if this variable is in a register, then the call to
>> .DEFERRED_INIT will NOT be expanded during the RTL expansion phase.  Might
>> this unexpanded call to .DEFERRED_INIT cause some IR issue later?
> 
> I think that's inconsistent indeed.

Can we treat the call to .DEFERRED_INIT as a NOP during the expansion phase if
we cannot expand it to valid RTL for the OPAQUE_TYPE?  Would doing this resolve
the issues?

>  Peter, what are "opaque"
> registers?  rs6000-modes.def suggests
> that there's __vector_pair and __vector_quad, what's the GIMPLE types
> for those?  It seems they
> are either SSA names or expanded to pseudo registers but there's no
> constants for them.
> 
>> 
>> If the above is a real issue, should we skip initialization for all
>> OPAQUE_TYPE variables even when they are in memory and can be initialized
>> with memset?
>>    Then we should update “is_var_need_auto_init” in gimplify.c
>> instead.  However, the issue with this approach is that we might miss the
>> opportunity to initialize an OPAQUE_TYPE variable if it will be in memory.
>> ).
> 
> I think we need to bite the bullet at some point to do register initialization
> not via expand_assignment but directly based on what the LHS expands to.

OPAQUE_TYPE is so special that, in my understanding, it should not be the
reason to rewrite the register initialization.
If more issues are exposed later, it might become necessary to rewrite this part.

Qing
> 
> Can they be initialized?  I see they can be copied at least.
> 
> If such "things" cannot be initialized they should indeed be exempt
> from auto-init.  The
> documentation suggests that they act as bit-buckets but even bit-buckets should
> be initializable, thus why exactly does CONST0_RTX not exist for them?
> 
> Richard.
> 
> 
>> 
>> Thanks.
>> 
>> Qing
>> 
>> 
>>> On Nov 29, 2021, at 3:56 PM, Peter Bergner  wrote:
>>> 
>>> Sorry for dropping the ball on testing the patch from the bugzilla!
>>> 
>>> The following patch fixes the ICE reported in the bugzilla on the pre-existing
>>> gcc testsuite test case, bootstraps and shows no testsuite regressions
>>> on powerpc64le-linux.  Ok for trunk?
>>> 
>>> Peter
>>> 
>>> 
>>> For -ftrivial-auto-var-init=*, skip initializing the register variable if it
>>> is an opaque type, because CONST0_RTX(mode) is not defined for opaque modes.
>>> 
>>> gcc/
>>>  PR middle-end/103127
>>>  * internal-fn.c (expand_DEFERRED_INIT): Skip if VAR_TYPE is opaque.
>>> 
>>> diff --git a/gcc/internal-fn.c b/gcc/internal-fn.c
>>> index 0cba95411a6..7cc0e9d5293 100644
>>> --- a/gcc/internal-fn.c
>>> +++ b/gcc/internal-fn.c
>>> @@ -3070,6 +3070,10 @@ expand_DEFERRED_INIT (internal_fn, gcall *stmt)
>>>}
>>>  else
>>>{
>>> +  /* Skip variables of opaque types that are in a register.  */
>>> +  if (OPAQUE_TYPE_P (var_type))
>>> + return;
>>> +
>>>  /* If this variable is in a register use expand_assignment.
>>>   For boolean scalars force zero-init.  */
>>>  tree init;
>>

Re: [PATCH] middle-end: Skip initialization of opaque type register variables [PR103127]

On 11/30/21 11:51 AM, Qing Zhao wrote:
> So it looks like a variable with OPAQUE_TYPE cannot be explicitly
> initialized at all, even at the source-code level.
> 
> The only way to initialize such a variable at the source-code level (and
> only for __vector_quad, not for __vector_pair) is through a call to
> __builtin_mma_xxsetaccz, as in:
> 
> void
> foo (__vector_quad *dst)
> {
>   __vector_quad acc;
>   __builtin_mma_xxsetaccz(&acc);
>   *dst = acc;
> }
> 
> Is this the correct understanding?

Correct.  Or via...

> Is there a way to initialize such a variable to values other than zero at the
> source-code level?

Not for any constant values.  You can load it from memory though like below,
which is also allowed for __vector_pair:

void
foo (__vector_quad *dst, __vector_quad *src)
{
  __vector_quad acc;
  acc = *src;
  ...
}
void
bar (__vector_pair *dst, __vector_pair *src)
{
  __vector_pair pair;
  pair = *src;
  ...
}

We do not accept things like:

  acc = 0;
  acc = {0, 0, ... };
  etc.

Peter

Re: [PATCH] Modify combine pattern by anding a pseudo with its nonzero bits

2021-11-30 Thread Segher Boessenkool

Hi!

On Tue, Nov 30, 2021 at 04:46:34PM +0800, HAO CHEN GUI wrote:
>     This patch modifies the combine pattern with a helper - 
> change_pseudo_and_mask when recog fails. The helper converts a single pseudo 
> to the pseudo and with a mask if the outer operator is IOR/XOR/PLUS and the 
> inner operator is ASHIFT/LSHIFTRT/AND. The conversion helps match shift + ior 
> pattern.
> 
>     Bootstrapped and tested on powerpc64-linux BE and LE with no regressions. 
> Is this okay for trunk? Any recommendations? Thanks a lot.

(Please make shorter lines in email.  70 chars is usual).

> gcc/
>     * combine.c (change_pseudo_and_mask): New.
>     (recog_for_combine): If recog fails, try again with the pattern
>     modified by change_pseudo_and_mask.
> 
> gcc/testsuite/
>     * gcc.target/powerpc/20050603-3.c: Modify the dump check conditions.
>     * gcc.target/powerpc/rlwimi-2.c: Likewise.

> +/* When the outer code of set_src is IOR/XOR/PLUS and the inner code is
> +   ASHIFT/LSHIFTRT/AND, convert a pseudo to pseudo AND with a mask if its
> +   nonzero_bits is less than its mode mask.  */

Please add some words *why* we do this (namely, because you cannot use
nonzero_bits in combine as well as after combine and expect the same
answer).
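
The arithmetic behind the rewrite can be sketched in plain C. This is only an illustration of the values involved, not combine code: the function names and the 8-bit mask are assumptions made for the example. When every possibly-nonzero bit of the pseudo lies inside the mask, ANDing with the mask does not change the result, but it makes the mask explicit in the pattern instead of depending on nonzero_bits returning the same answer later.

```c
#include <assert.h>
#include <stdint.h>

/* SHIFTED | X, with nothing said about which bits of X may be set.  */
static uint32_t
ior_plain (uint32_t shifted, uint32_t x)
{
  return shifted | x;
}

/* SHIFTED | (X & MASK): the same value as ior_plain whenever all
   nonzero bits of X are inside MASK, i.e. (X & ~MASK) == 0.  */
static uint32_t
ior_masked (uint32_t shifted, uint32_t x, uint32_t mask)
{
  return shifted | (x & mask);
}
```

For example, with `x = 0xab` and a mask of `0xff` both functions give the same result for `shifted = 0x12 << 8`; once `x` has a bit outside the mask (say `0x1ab`) the two forms diverge, which is why the conversion is only valid under the nonzero_bits condition.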

> +static bool
> +change_pseudo_and_mask (rtx pat)
> +{
> +  bool changed = false;
> +
> +  rtx src = SET_SRC (pat);
> +  if ((GET_CODE (src) == IOR
> +   || GET_CODE (src) == XOR
> +   || GET_CODE (src) == PLUS)
> +  && (((GET_CODE (XEXP (src, 0)) == ASHIFT
> +   || GET_CODE (XEXP (src, 0)) == LSHIFTRT
> +   || GET_CODE (XEXP (src, 0)) == AND)
> +  && REG_P (XEXP (src, 1)))
> + || ((GET_CODE (XEXP (src, 1)) == ASHIFT
> +  || GET_CODE (XEXP (src, 1)) == LSHIFTRT
> +  || GET_CODE (XEXP (src, 1)) == AND)
> + && REG_P (XEXP (src, 0)

If one arm is a pseudo and the other is compound, the compound one is
first always.  This is one of those canonicalisations that simplifies a
lot of code -- including this new code :-)

> +    {
> +  rtx *reg = REG_P (XEXP (src, 0))
> +    ? &XEXP (SET_SRC (pat), 0)
> +    : &XEXP (SET_SRC (pat), 1);

This is indented wrong.  But, in fact, all tabs are changed to spaces in
your patch?

> @@ -11586,7 +11622,14 @@ recog_for_combine (rtx *pnewpat, rtx_insn *insn, rtx *pnotes)
>     }
>     }
>    else
> -   changed = change_zero_ext (pat);
> +   {
> + if (change_pseudo_and_mask (pat))
> +   {
> + maybe_swap_commutative_operands (SET_SRC (pat));
> + changed = true;
> +   }
> + changed |= change_zero_ext (pat);
> +   }
>  }
>    else if (GET_CODE (pat) == PARALLEL)
>  {


  changed = change_zero_ext (pat);
  if (!changed)
    changed = change_pseudo_and_mask (pat);

  if (changed)
    maybe_swap_commutative_operands (SET_SRC (pat));


> --- a/gcc/testsuite/gcc.target/powerpc/20050603-3.c
> +++ b/gcc/testsuite/gcc.target/powerpc/20050603-3.c
> @@ -12,7 +12,7 @@ void rotins (unsigned int x)
>    b.y = (x<<12) | (x>>20);
>  }
> 
> -/* { dg-final { scan-assembler-not {\mrlwinm} } } */
> +/* { dg-final { scan-assembler-not {\mrlwinm} { target ilp32 } } } */
>  /* { dg-final { scan-assembler-not {\mrldic} } } */
>  /* { dg-final { scan-assembler-not {\mrot[lr]} } } */
>  /* { dg-final { scan-assembler-not {\ms[lr][wd]} } } */

Please show the -m32 code before and after the change?  Why is it okay
to get an rlwinm there?

> diff --git a/gcc/testsuite/gcc.target/powerpc/rlwimi-2.c b/gcc/testsuite/gcc.target/powerpc/rlwimi-2.c
> index bafa371db73..ffb5f9e450f 100644
> --- a/gcc/testsuite/gcc.target/powerpc/rlwimi-2.c
> +++ b/gcc/testsuite/gcc.target/powerpc/rlwimi-2.c
> @@ -2,14 +2,14 @@
>  /* { dg-options "-O2" } */
> 
>  /* { dg-final { scan-assembler-times {(?n)^\s+[a-z]} 14121 { target ilp32 } } } */
> -/* { dg-final { scan-assembler-times {(?n)^\s+[a-z]} 20217 { target lp64 } } } */
> +/* { dg-final { scan-assembler-times {(?n)^\s+[a-z]} 21279 { target lp64 } } } */

No, it is not okay to generate worse code.  In what cases do you see
more insns now, and why?

>  /* { dg-final { scan-assembler-times {(?n)^\s+blr} 6750 } } */
>  /* { dg-final { scan-assembler-times {(?n)^\s+mr} 643 { target ilp32 } } } */
>  /* { dg-final { scan-assembler-times {(?n)^\s+mr} 11 { target lp64 } } } */
>  /* { dg-final { scan-assembler-times {(?n)^\s+rldicl} 7790 { target lp64 } } } */
> 
>  /* { dg-final { scan-assembler-times {(?n)^\s+rlwimi} 1692 { target ilp32 } } } */
> -/* { dg-final { scan-assembler-times {(?n)^\s+rlwimi} 1666 { target lp64 } } } */
> +/* { dg-final { scan-assembler-times {(?n)^\s+rlwimi} 1692 { target lp64 } } } */
> 
>  /* { dg-final { scan-assembler-times {(?n)^\s+mulli} 5036 } } */

Are the new rlwimi's good to have, or can we do those with simpler or
fewer insns?


Segher

[PATCH] libcpp: Enable P1949R7 for C++98 too [PR100977]

On Mon, Nov 29, 2021 at 05:53:58PM -0500, Jason Merrill wrote:
> I'm inclined to go ahead and change C++98 as well; I doubt anyone is relying
> on the particular C++98 extended character set rules, and we already accept
> the union of the different sets when not pedantic.

Ok, here is an incremental patch to do that also for -std={c,gnu}++98.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2021-11-30  Jakub Jelinek  

* init.c (struct lang_flags): Remove cxx23_identifiers.
(lang_defaults): Remove cxx23_identifiers initializers.
(cpp_set_lang): Don't copy cxx23_identifiers.
* include/cpplib.h (struct cpp_options): Adjust comment about
c11_identifiers.  Remove cxx23_identifiers field.
* lex.c (warn_about_normalization): Use cplusplus instead of
cxx23_identifiers.
* charset.c (ucn_valid_in_identifier): Likewise.

* g++.dg/cpp/ucnid-1.C: Adjust expected diagnostics.
* g++.dg/cpp/ucnid-1-utf8.C: Likewise.

--- gcc/init.c.jj   2021-11-29 22:54:46.503750631 +0100
+++ gcc/init.c  2021-11-30 01:06:31.704473882 +0100
@@ -82,7 +82,6 @@ struct lang_flags
   char extended_numbers;
   char extended_identifiers;
   char c11_identifiers;
-  char cxx23_identifiers;
   char std;
   char digraphs;
   char uliterals;
@@ -100,31 +99,31 @@ struct lang_flags
 };
 
 static const struct lang_flags lang_defaults[] =
-{ /*  c99 c++ xnum xid c11 c++23 std digr ulit rlit udlit bincst digsep trig u8chlit vaopt scope dfp szlit elifdef */
-  /* GNUC89   */  { 0,  0,  1,  0,  0,  0,0,  1,   0,   0,   0,0, 0, 0,   0,  1,   1, 0,   0,   0 },
-  /* GNUC99   */  { 1,  0,  1,  1,  0,  0,0,  1,   1,   1,   0,0, 0, 0,   0,  1,   1, 0,   0,   0 },
-  /* GNUC11   */  { 1,  0,  1,  1,  1,  0,0,  1,   1,   1,   0,0, 0, 0,   0,  1,   1, 0,   0,   0 },
-  /* GNUC17   */  { 1,  0,  1,  1,  1,  0,0,  1,   1,   1,   0,0, 0, 0,   0,  1,   1, 0,   0,   0 },
-  /* GNUC2X   */  { 1,  0,  1,  1,  1,  0,0,  1,   1,   1,   0,1, 1, 0,   1,  1,   1, 1,   0,   1 },
-  /* STDC89   */  { 0,  0,  0,  0,  0,  0,1,  0,   0,   0,   0,0, 0, 1,   0,  0,   0, 0,   0,   0 },
-  /* STDC94   */  { 0,  0,  0,  0,  0,  0,1,  1,   0,   0,   0,0, 0, 1,   0,  0,   0, 0,   0,   0 },
-  /* STDC99   */  { 1,  0,  1,  1,  0,  0,1,  1,   0,   0,   0,0, 0, 1,   0,  0,   0, 0,   0,   0 },
-  /* STDC11   */  { 1,  0,  1,  1,  1,  0,1,  1,   1,   0,   0,0, 0, 1,   0,  0,   0, 0,   0,   0 },
-  /* STDC17   */  { 1,  0,  1,  1,  1,  0,1,  1,   1,   0,   0,0, 0, 1,   0,  0,   0, 0,   0,   0 },
-  /* STDC2X   */  { 1,  0,  1,  1,  1,  0,1,  1,   1,   0,   0,1, 1, 1,   1,  0,   1, 1,   0,   1 },
-  /* GNUCXX   */  { 0,  1,  1,  1,  0,  0,0,  1,   0,   0,   0,0, 0, 0,   0,  1,   1, 0,   0,   0 },
-  /* CXX98    */  { 0,  1,  0,  1,  0,  0,1,  1,   0,   0,   0,0, 0, 1,   0,  0,   1, 0,   0,   0 },
-  /* GNUCXX11 */  { 1,  1,  1,  1,  1,  1,0,  1,   1,   1,   1,0, 0, 0,   0,  1,   1, 0,   0,   0 },
-  /* CXX11    */  { 1,  1,  0,  1,  1,  1,1,  1,   1,   1,   1,0, 0, 1,   0,  0,   1, 0,   0,   0 },
-  /* GNUCXX14 */  { 1,  1,  1,  1,  1,  1,0,  1,   1,   1,   1,1, 1, 0,   0,  1,   1, 0,   0,   0 },
-  /* CXX14    */  { 1,  1,  0,  1,  1,  1,1,  1,   1,   1,   1,1, 1, 1,   0,  0,   1, 0,   0,   0 },
-  /* GNUCXX17 */  { 1,  1,  1,  1,  1,  1,0,  1,   1,   1,   1,1, 1, 0,   1,  1,   1, 0,   0,   0 },
-  /* CXX17    */  { 1,  1,  1,  1,  1,  1,1,  1,   1,   1,   1,1, 1, 0,   1,  0,   1, 0,   0,   0 },
-  /* GNUCXX20 */  { 1,  1,  1,  1,  1,  1,0,  1,   1,   1,   1,1, 1, 0,   1,  1,   1, 0,   0,   0 },
-  /* CXX20    */  { 1,  1,  1,  1,  1,  1,1,  1,   1,   1,   1,1, 1, 0,   1,  1,   1, 0,   0,   0 },
-  /* GNUCXX23 */  { 1,  1,  1,  1,  1,  1,0,  1,   1,   1,   1,1, 1, 0,   1,  1,   1, 0,   1,   1 },
-  /* CXX23    */  { 1,  1,  1,  1,  1,  1,1,  1,   1,   1,   1,1, 1, 0,   1,  1,   1, 0,   1,   1 },
-  /* ASM      */  { 0,  0,  1,  0,  0,  0,0,  0,   0,   0,   0,0, 0, 0,   0,  0,   0, 0,   0,   0 }
+{ /*  c99 c++ xnum xid c11 std digr ulit rlit udlit bincst digsep trig u8chlit vaopt scope dfp szlit elifdef */
+  /* GNUC89   */  { 0,  0,  1,  0,  0,  0,  1,   0,   0,   0,0, 0, 0,   0,  1,   1, 0,   0,   0 },
+  /* GNUC99   */  { 1,  0,  1,  1,  0,  0,  1,   1,   1,   0,0, 0, 0,   0,  1,   1, 0,   0,   0 },
+  /* GNUC11   */  { 1,  0,  1,

Re: [PATCH] libcpp: Enable P1949R7 for C++98 too [PR100977]


On 11/30/21 13:19, Jakub Jelinek wrote:

On Mon, Nov 29, 2021 at 05:53:58PM -0500, Jason Merrill wrote:

I'm inclined to go ahead and change C++98 as well; I doubt anyone is relying
on the particular C++98 extended character set rules, and we already accept
the union of the different sets when not pedantic.


Ok, here is an incremental patch to do that also for -std={c,gnu}++98.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?


OK.


2021-11-30  Jakub Jelinek  

* init.c (struct lang_flags): Remove cxx23_identifiers.
(lang_defaults): Remove cxx23_identifiers initializers.
(cpp_set_lang): Don't copy cxx23_identifiers.
* include/cpplib.h (struct cpp_options): Adjust comment about
c11_identifiers.  Remove cxx23_identifiers field.
* lex.c (warn_about_normalization): Use cplusplus instead of
cxx23_identifiers.
* charset.c (ucn_valid_in_identifier): Likewise.

* g++.dg/cpp/ucnid-1.C: Adjust expected diagnostics.
* g++.dg/cpp/ucnid-1-utf8.C: Likewise.

--- gcc/init.c.jj   2021-11-29 22:54:46.503750631 +0100
+++ gcc/init.c  2021-11-30 01:06:31.704473882 +0100
@@ -82,7 +82,6 @@ struct lang_flags
char extended_numbers;
char extended_identifiers;
char c11_identifiers;
-  char cxx23_identifiers;
char std;
char digraphs;
char uliterals;
@@ -100,31 +99,31 @@ struct lang_flags
  };
  
  static const struct lang_flags lang_defaults[] =

-{ /*  c99 c++ xnum xid c11 c++23 std digr ulit rlit udlit bincst 
digsep trig u8chlit vaopt scope dfp szlit elifdef */
-  /* GNUC89   */  { 0,  0,  1,  0,  0,  0,0,  1,   0,   0,   0,0, 
0, 0,   0,  1,   1, 0,   0,   0 },
-  /* GNUC99   */  { 1,  0,  1,  1,  0,  0,0,  1,   1,   1,   0,0, 
0, 0,   0,  1,   1, 0,   0,   0 },
-  /* GNUC11   */  { 1,  0,  1,  1,  1,  0,0,  1,   1,   1,   0,0, 
0, 0,   0,  1,   1, 0,   0,   0 },
-  /* GNUC17   */  { 1,  0,  1,  1,  1,  0,0,  1,   1,   1,   0,0, 
0, 0,   0,  1,   1, 0,   0,   0 },
-  /* GNUC2X   */  { 1,  0,  1,  1,  1,  0,0,  1,   1,   1,   0,1, 
1, 0,   1,  1,   1, 1,   0,   1 },
-  /* STDC89   */  { 0,  0,  0,  0,  0,  0,1,  0,   0,   0,   0,0, 
0, 1,   0,  0,   0, 0,   0,   0 },
-  /* STDC94   */  { 0,  0,  0,  0,  0,  0,1,  1,   0,   0,   0,0, 
0, 1,   0,  0,   0, 0,   0,   0 },
-  /* STDC99   */  { 1,  0,  1,  1,  0,  0,1,  1,   0,   0,   0,0, 
0, 1,   0,  0,   0, 0,   0,   0 },
-  /* STDC11   */  { 1,  0,  1,  1,  1,  0,1,  1,   1,   0,   0,0, 
0, 1,   0,  0,   0, 0,   0,   0 },
-  /* STDC17   */  { 1,  0,  1,  1,  1,  0,1,  1,   1,   0,   0,0, 
0, 1,   0,  0,   0, 0,   0,   0 },
-  /* STDC2X   */  { 1,  0,  1,  1,  1,  0,1,  1,   1,   0,   0,1, 
1, 1,   1,  0,   1, 1,   0,   1 },
-  /* GNUCXX   */  { 0,  1,  1,  1,  0,  0,0,  1,   0,   0,   0,0, 
0, 0,   0,  1,   1, 0,   0,   0 },
-  /* CXX98*/  { 0,  1,  0,  1,  0,  0,1,  1,   0,   0,   0,0, 
0, 1,   0,  0,   1, 0,   0,   0 },
-  /* GNUCXX11 */  { 1,  1,  1,  1,  1,  1,0,  1,   1,   1,   1,0, 
0, 0,   0,  1,   1, 0,   0,   0 },
-  /* CXX11*/  { 1,  1,  0,  1,  1,  1,1,  1,   1,   1,   1,0, 
0, 1,   0,  0,   1, 0,   0,   0 },
-  /* GNUCXX14 */  { 1,  1,  1,  1,  1,  1,0,  1,   1,   1,   1,1, 
1, 0,   0,  1,   1, 0,   0,   0 },
-  /* CXX14*/  { 1,  1,  0,  1,  1,  1,1,  1,   1,   1,   1,1, 
1, 1,   0,  0,   1, 0,   0,   0 },
-  /* GNUCXX17 */  { 1,  1,  1,  1,  1,  1,0,  1,   1,   1,   1,1, 
1, 0,   1,  1,   1, 0,   0,   0 },
-  /* CXX17*/  { 1,  1,  1,  1,  1,  1,1,  1,   1,   1,   1,1, 
1, 0,   1,  0,   1, 0,   0,   0 },
-  /* GNUCXX20 */  { 1,  1,  1,  1,  1,  1,0,  1,   1,   1,   1,1, 
1, 0,   1,  1,   1, 0,   0,   0 },
-  /* CXX20*/  { 1,  1,  1,  1,  1,  1,1,  1,   1,   1,   1,1, 
1, 0,   1,  1,   1, 0,   0,   0 },
-  /* GNUCXX23 */  { 1,  1,  1,  1,  1,  1,0,  1,   1,   1,   1,1, 
1, 0,   1,  1,   1, 0,   1,   1 },
-  /* CXX23*/  { 1,  1,  1,  1,  1,  1,1,  1,   1,   1,   1,1, 
1, 0,   1,  1,   1, 0,   1,   1 },
-  /* ASM  */  { 0,  0,  1,  0,  0,  0,0,  0,   0,   0,   0,0, 
0, 0,   0,  0,   0, 0,   0,   0 }
+{ /*  c99 c++ xnum xid c11 std digr ulit rlit udlit bincst digsep 
trig u8chlit vaopt scope dfp szlit elifdef */
+  /* GNUC89   */  { 0,  0,  1,  0,  0,  0,  1,   0,   0,   0,0, 0, 
0,   0,  1,   1, 0,   0,   0 },
+  /* GNUC99   */  { 1,  0,  1,  1,  0,  0,  1,   1,   1,   0,0, 0, 
0,   0,  1,

Re: [PATCH] Fix alignment of stack slots for overaligned types [PR103500]

2021-11-30 Thread Florian Weimer via Gcc-patches

* Alex Coplan via Gcc-patches:

> Bootstrapped and regtested on aarch64-linux-gnu, x86_64-linux-gnu, and
> arm-linux-gnueabihf: no regressions.
>
> I'd appreciate any feedback. Is it OK for trunk?

Does this need an ABI warning?

Thanks,
Florian

Re: [PATCH] PR fortran/101565 - ICE in gfc_simplify_image_index, at fortran/simplify.c:8234

2021-11-30 Thread Harald Anlauf via Gcc-patches


Hi Mikael,

On 30.11.21 at 12:25, Mikael Morin wrote:

Hello,

On 29/11/2021 at 22:31, Harald Anlauf via Fortran wrote:

Dear all,

a trivial one: we need to check the type of the SUB argument
to the coarray IMAGE_INDEX intrinsic.  It has to be an array
of type integer.

Patch by Steve Kargl.


I hope at some point he’ll finally come to a working git workflow.


Initially I had to rethink my workflow habits when switching from
svn to git.  But after a steep learning curve I wouldn't want to
go back.  One day Steve might see it the same way.


Regtested on x86_64-pc-linux-gnu.  OK for mainline?


Sure.



Thanks,
Harald

Re: [PATCH] middle-end: Skip initialization of opaque type register variables [PR103127]



> On Nov 30, 2021, at 12:08 PM, Peter Bergner  wrote:
> 
> On 11/30/21 11:51 AM, Qing Zhao wrote:
>> So, looks like that the variable with OPAQUE_TYPE cannot be all explicitly 
>> initialized even at source code level. 
>> 
>> The only way to initialize such variable (only __vector_quad, not for 
>> __vector_pairs) at source code level is through call to 
>> __builtin_mma_xxsetaccz as:
>> 
>> void
>> foo (__vector_quad *dst)
>> {
>>  __vector_quad acc;
>>  __builtin_mma_xxsetaccz(&acc);
>>  *dst = acc;
>> }
>> 
>> Is this the correct understanding?
> 
> Correct.  Or via...
> 
> 
>> Is there way to initialize such variable to other values than zero at source 
>> code level?
> 
> Not for any constant values.  You can load it from memory though like below,
> which is also allowed for __vector_pair:
> 
> void
> foo (__vector_quad *dst, __vector_quad *src)
> {
>  __vector_quad acc;
>  acc = *src;
>  ...
> }
> void
> bar (__vector_pair *dst, __vector_pair *src)
> {
>  __vector_pair pair;
>  pair = *src;
>  ...
> }

However, even with the above, the memory pointed to by “src” still needs to be 
initialized somewhere. How do we provide the initial value for a variable of 
__vector_pair type in the first place?

Qing
> 
> We do not accept things like:
> 
>  acc = 0;
>  acc = {0, 0, ... };
>  etc.
> 
> Peter

[r12-5612 Regression] FAIL: gcc.target/i386/pr88531-1a.c (test for excess errors) on Linux/x86_64

2021-11-30 Thread sunil.k.pandey via Gcc-patches

On Linux/x86_64,

10833849b55401a52f2334eb032a70beb688e9fc is the first bad commit
commit 10833849b55401a52f2334eb032a70beb688e9fc
Author: Richard Sandiford 
Date:   Tue Nov 30 09:52:29 2021 +

vect: Support gather loads with SLP

caused

FAIL: gcc.target/i386/pr88531-1a.c (internal compiler error)
FAIL: gcc.target/i386/pr88531-1a.c (test for excess errors)

with GCC configured with

../../gcc/configure 
--prefix=/local/skpandey/gccwork/toolwork/gcc-bisect-master/master/r12-5612/usr 
--enable-clocale=gnu --with-system-zlib --with-demangler-in-ld 
--with-fpmath=sse --enable-languages=c,c++,fortran --enable-cet --without-isl 
--enable-libmpx x86_64-linux --disable-bootstrap

To reproduce:

$ cd {build_dir}/gcc && make check 
RUNTESTFLAGS="i386.exp=gcc.target/i386/pr88531-1a.c --target_board='unix{-m32}'"
$ cd {build_dir}/gcc && make check 
RUNTESTFLAGS="i386.exp=gcc.target/i386/pr88531-1a.c --target_board='unix{-m32\ 
-march=cascadelake}'"

(Please do not reply to this email; for questions about this report, contact me 
at skpgkp2 at gmail dot com)

Re: [PATCH, Fortran] Fix setting of array lower bound for named arrays

2021-11-30 Thread Harald Anlauf via Gcc-patches

Hi Tobias,

On 30.11.21 at 18:24, Tobias Burnus wrote:

On 29.11.21 22:11, Harald Anlauf wrote:

"A whole array is a named array or a structure component whose final
part-ref is an array component name; no subscript list is appended."

I think in "h(3)" there is not really a named array – thus I read it as
if the "Otherwise ... result value is 1" applies.

If you read on in the standard:

"The appearance of a whole array variable in an executable construct
specifies all the elements of the array ..."

which might make you/makes me think that the sentence before that one
could need an official interpretation...

I am not sure whether I understand what part of the spec you wonder
about. (I mean besides that 'variable' can also mean referencing a
data-pointer-returning function.)

strictly speaking you're now talking about the text for LBOUND,
and your quote is not from the standard section about the ALLOCATE
statement. And there are several places in the standard document
where there is an explicit reference to LBOUND when talking about
what the bounds should be. This is why I am unhappy with the text
about ALLOCATE, not about LBOUND.

Question: What do NAG/flang/... report for lbound(h(3)) - also [3] – or
[1] as gfortran?

I've submitted a reduced example to the Intel Fortran Forum:
https://community.intel.com/t5/Intel-Fortran-Compiler/Allocate-with-SOURCE-and-bounds/m-p/1339992#M158535

There are good chances that Steve Lionel reads and comments on it.

So far only "FortranFan" has replied – and he comes to the same
conclusion as my reading, albeit without referring to the standard.

You seem to be quite convinced with your interpretation,
while I am simply confused.

So go ahead and apply to mainline. Let's see if we learn more.
I do hope I will.

Harald

Tobias


Re: [PATCH, Fortran] Fix setting of array lower bound for named arrays

2021-11-30 Thread Toon Moene


On 11/30/21 8:54 PM, Harald Anlauf via Fortran wrote:


Hi Tobias,



You seem to be quite convinced with your interpretation,
while I am simply confused.


If both compiler developers are confused, and actual compiler 
implementations differ in their outcomes of the test case, IMNSHO it is 
time to ask the Fortran Standardization Committee for an interpretation 
(of the standard's text).


Kind regards,

--
Toon Moene - e-mail: t...@moene.org - phone: +31 346 214290
Saturnushof 14, 3738 XG  Maartensdijk, The Netherlands

Re: [PATCH] middle-end: Skip initialization of opaque type register variables [PR103127]

On 11/30/21 1:50 PM, Qing Zhao via Gcc-patches wrote:
>> void
>> bar (__vector_pair *dst, __vector_pair *src)
>> {
>>  __vector_pair pair;
>>  pair = *src;
>>  ...
>> }
> 
> However, even with the above, the memory pointed by “src” still need to
> be initialized somewhere. How to provide the initial value to the variable
> in the beginning for __vector_pair type?

Well no initialization is required here in this function.  Isn't that what
matters here?  When generating code for bar(), we assume that src already
points to initialized memory.

As for what src points to, that could be initialized how any other memory/
array could be initialized, so either a static array, read in some data
from a file into an array, compute the array values in a loop, etc. etc.

Peter

[committed] libstdc++: Make Asan detection work for Clang [PR103453]

Tested x86_64-linux, pushed to trunk.


Clang doesn't define __SANITIZE_ADDRESS__ so use its __has_feature check
to detect Asan instead.

libstdc++-v3/ChangeLog:

PR libstdc++/103453
* config/allocator/malloc_allocator_base.h
(_GLIBCXX_SANITIZE_STD_ALLOCATOR): Define for Clang.
* config/allocator/new_allocator_base.h
(_GLIBCXX_SANITIZE_STD_ALLOCATOR): Likewise.
---
 libstdc++-v3/config/allocator/malloc_allocator_base.h | 10 ++++++++--
 libstdc++-v3/config/allocator/new_allocator_base.h    | 10 ++++++++--
 2 files changed, 16 insertions(+), 4 deletions(-)

diff --git a/libstdc++-v3/config/allocator/malloc_allocator_base.h 
b/libstdc++-v3/config/allocator/malloc_allocator_base.h
index d7b56e3c9ef..b798d3fd448 100644
--- a/libstdc++-v3/config/allocator/malloc_allocator_base.h
+++ b/libstdc++-v3/config/allocator/malloc_allocator_base.h
@@ -52,8 +52,14 @@ namespace std
 # define __allocator_base  __gnu_cxx::malloc_allocator
 #endif
 
-#if defined(__SANITIZE_ADDRESS__) && !defined(_GLIBCXX_SANITIZE_STD_ALLOCATOR)
-# define _GLIBCXX_SANITIZE_STD_ALLOCATOR 1
+#ifndef _GLIBCXX_SANITIZE_STD_ALLOCATOR
+# if defined(__SANITIZE_ADDRESS__)
+#  define _GLIBCXX_SANITIZE_STD_ALLOCATOR 1
+# elif defined __has_feature
+#  if __has_feature(address_sanitizer)
+#   define _GLIBCXX_SANITIZE_STD_ALLOCATOR 1
+#  endif
+# endif
 #endif
 
 #endif
diff --git a/libstdc++-v3/config/allocator/new_allocator_base.h 
b/libstdc++-v3/config/allocator/new_allocator_base.h
index 77ee8b73979..7c52fef63de 100644
--- a/libstdc++-v3/config/allocator/new_allocator_base.h
+++ b/libstdc++-v3/config/allocator/new_allocator_base.h
@@ -52,8 +52,14 @@ namespace std
 # define __allocator_base  __gnu_cxx::new_allocator
 #endif
 
-#if defined(__SANITIZE_ADDRESS__) && !defined(_GLIBCXX_SANITIZE_STD_ALLOCATOR)
-# define _GLIBCXX_SANITIZE_STD_ALLOCATOR 1
+#ifndef _GLIBCXX_SANITIZE_STD_ALLOCATOR
+# if defined(__SANITIZE_ADDRESS__)
+#  define _GLIBCXX_SANITIZE_STD_ALLOCATOR 1
+# elif defined __has_feature
+#  if __has_feature(address_sanitizer)
+#   define _GLIBCXX_SANITIZE_STD_ALLOCATOR 1
+#  endif
+# endif
 #endif
 
 #endif
-- 
2.31.1
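
For readers following along, the detection logic in the diff above can be
reduced to a standalone sketch.  EXAMPLE_HAS_ASAN and built_with_asan are
hypothetical names used only for this illustration; the macro the patch
actually defines is _GLIBCXX_SANITIZE_STD_ALLOCATOR.

```cpp
#include <cassert>

// Hypothetical macro name, used only in this sketch.
#ifndef EXAMPLE_HAS_ASAN
# if defined(__SANITIZE_ADDRESS__)
   // GCC defines this macro when compiling with -fsanitize=address.
#  define EXAMPLE_HAS_ASAN 1
# elif defined(__has_feature)
   // Clang reports Asan through __has_feature instead.
#  if __has_feature(address_sanitizer)
#   define EXAMPLE_HAS_ASAN 1
#  endif
# endif
#endif

// Fall back to 0 so the macro can be used in ordinary expressions.
#ifndef EXAMPLE_HAS_ASAN
# define EXAMPLE_HAS_ASAN 0
#endif

// Hypothetical helper: true when either detection path fired.
constexpr bool built_with_asan() { return EXAMPLE_HAS_ASAN != 0; }
```

The __has_feature test is nested in its own #if rather than joined with &&
on the #elif line, since a preprocessor that does not provide __has_feature
could not parse the function-like use.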

[committed] libstdc++: Skip tag dispatching for _S_relocate in C++17

Tested x86_64-linux, pushed to trunk.


In C++17 mode all callers of _S_relocate have already done:

  if constexpr (_S_use_relocate())

so we don't need to repeat that check and use tag dispatching to avoid
ill-formed instantiations.

libstdc++-v3/ChangeLog:

* include/bits/stl_vector.h (vector::_S_do_relocate): Remove
C++20 constexpr specifier.
(vector::_S_relocate) [__cpp_if_constexpr]: Call __relocate_a
directly without tag dispatching.
---
 libstdc++-v3/include/bits/stl_vector.h | 9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/libstdc++-v3/include/bits/stl_vector.h 
b/libstdc++-v3/include/bits/stl_vector.h
index 4587757637e..36b2cff3d78 100644
--- a/libstdc++-v3/include/bits/stl_vector.h
+++ b/libstdc++-v3/include/bits/stl_vector.h
@@ -481,14 +481,14 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
return _S_nothrow_relocate(__is_move_insertable<_Tp_alloc_type>{});
   }
 
-  static _GLIBCXX20_CONSTEXPR pointer
+  static pointer
   _S_do_relocate(pointer __first, pointer __last, pointer __result,
 _Tp_alloc_type& __alloc, true_type) noexcept
   {
return std::__relocate_a(__first, __last, __result, __alloc);
   }
 
-  static _GLIBCXX20_CONSTEXPR pointer
+  static pointer
   _S_do_relocate(pointer, pointer, pointer __result,
 _Tp_alloc_type&, false_type) noexcept
   { return __result; }
@@ -497,8 +497,13 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
   _S_relocate(pointer __first, pointer __last, pointer __result,
  _Tp_alloc_type& __alloc) noexcept
   {
+#if __cpp_if_constexpr
+   // All callers have already checked _S_use_relocate() so just do it.
+   return std::__relocate_a(__first, __last, __result, __alloc);
+#else
using __do_it = __bool_constant<_S_use_relocate()>;
return _S_do_relocate(__first, __last, __result, __alloc, __do_it{});
+#endif
   }
 #endif // C++11
 
-- 
2.31.1
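
As a rough illustration of the trade-off described in the commit message, the
sketch below contrasts tag dispatching with if constexpr.  It is not libstdc++
code: Opaque, bump_impl and bump are hypothetical names, and the integral
check merely stands in for _S_use_relocate().

```cpp
#include <cassert>
#include <type_traits>

struct Opaque {};  // hypothetical type for which "x + 1" would be ill-formed

// Pre-C++17 style: select an overload with a tag so that the body needing
// arithmetic is only instantiated for integral T.
template<typename T>
int bump_impl(T x, std::true_type) { return static_cast<int>(x) + 1; }

template<typename T>
int bump_impl(T, std::false_type) { return -1; }

template<typename T>
int bump(T x)
{
#if __cpp_if_constexpr
  // The discarded branch is never instantiated, so no tag is needed.
  if constexpr (std::is_integral<T>::value)
    return static_cast<int>(x) + 1;
  else
    return -1;
#else
  return bump_impl(x, std::is_integral<T>{});
#endif
}
```

Either path gives the same result, so bump(41) yields 42 and bump(Opaque{})
yields -1 without ever instantiating the arithmetic for Opaque.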

Re: [PATCH] c++, v2: Allow indeterminate unsigned char or std::byte in bit_cast - P1272R4


On 11/30/21 07:17, Jakub Jelinek wrote:

On Mon, Nov 29, 2021 at 10:25:58PM -0500, Jason Merrill wrote:

It's a DR.  Really, it was intended to be part of C++20; at the Cologne
meeting in 2019 CWG thought byteswap was going to make C++20, so this bugfix
could go in as part of that paper.


Ok, changed to be done unconditionally now.


Also, allowing indeterminate values that are never read was in C++20
(P1331).


Reading P1331R2 again, I'm still puzzled.
Our current behavior (both before and after this patch) is that if
some variable is scalar and has indeterminate value or if an aggregate
variable has some members (possibly nested) with indeterminate values,
in constexpr contexts we allow copying those into other vars of the
same type (e.g. the testcases in the patch below test mere copying
of the whole structures or unsigned char result of __builtin_bit_cast),


That seems to be a bug, since the copy involves an lvalue-to-rvalue 
conversion.



but we reject if we actually use them in some other way (e.g. try to
read a member from a variable that has that member indeterminate,
see e.g. bit-cast14.C (f5, f6, f7), even when reading it into an
unsigned char variable).


That's correct.


Then there is P1331R2 which makes the UB on
"an lvalue-to-rvalue conversion that is applied to an object with
indeterminate value ([basic.indet]);"
but isn't even the
   unsigned char a = __builtin_bit_cast (unsigned char, u);
   unsigned char b = a;
case non-constant then when __builtin_bit_cast returns indeterminate value?


Good point.  So it would seem to follow that if the output is going to 
have an indeterminate value, it's non-constant, we don't have to work 
hard in constexpr evaluation, and f1-4 are all non-constant.  And the 
new bit_cast text is only interesting for non-constant evaluation.
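
As a baseline for the discussion, here is a small sketch of the uncontroversial
case: a source object with no padding and no uninitialized bytes, where the
result of the cast has no indeterminate bits.  Pair and pack are hypothetical
names, and the sketch assumes a compiler that provides __builtin_bit_cast
(as the testcases in this thread do).

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical source type: two one-byte members, so no padding bits.
struct Pair { std::uint8_t a; std::uint8_t b; };

static_assert(sizeof(Pair) == sizeof(std::uint16_t),
              "bit_cast requires equal sizes");

// Every bit of the result comes from an initialized member, so this is
// usable in a constant expression.
constexpr std::uint16_t pack(Pair p)
{
  return __builtin_bit_cast(std::uint16_t, p);
}

// Both bytes are equal, so the value does not depend on endianness.
static_assert(pack(Pair{7, 7}) == 0x0707, "evaluated at compile time");
```

The cases argued about in this thread differ in that the source has padding
or otherwise uninitialized bits, which end up as indeterminate bits in the
result.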



__builtin_bit_cast returns rvalue, so no lvalue-to-rvalue conversion happens
in that case, so supposedly
   unsigned char a = __builtin_bit_cast (unsigned char, u);
is fine, but on


Eh, there's clearly an lvalue-rvalue conversion involved in reading from 
the source value.



   unsigned char b = a;
a is lvalue and is converted to rvalue.
Similarly
   T t = { 1, 2 };
   S s = __builtin_bit_cast (S, t);
   S u = s;
where S s = __builtin_bit_cast (S, t); could be ok even when some or all
members are indeterminate, but u = s; does lvalue-to-rvalue conversion?



Or there is http://eel.is/c++draft/basic.indet that has quite clear rules
what is and isn't UB and if C++ wanted to go further and allow all those
valid cases in there as constant...

Anyway, I hope this can be dealt with incrementally.



I think in all of them the result of the cast has (some) indeterminate
value.  So f1-3 are OK because the indeterminate value has unsigned char
type and is never used; f4() is non-constant because S::f has
non-byte-access type and so the new wording says it's undefined.


Ok, implemented the bitfield handling then.

Here is an updated patch, so far lightly tested.

2021-11-30  Jakub Jelinek 

* constexpr.c (clear_uchar_or_std_byte_in_mask): New function.
(cxx_eval_bit_cast): Don't error about padding bits if target
type is unsigned char or std::byte, instead return no clearing
ctor.  Use clear_uchar_or_std_byte_in_mask.

* g++.dg/cpp2a/bit-cast11.C: New test.
* g++.dg/cpp2a/bit-cast12.C: New test.
* g++.dg/cpp2a/bit-cast13.C: New test.
* g++.dg/cpp2a/bit-cast14.C: New test.

--- gcc/cp/constexpr.c.jj   2021-11-30 09:44:46.531607444 +0100
+++ gcc/cp/constexpr.c  2021-11-30 12:20:29.105251443 +0100
@@ -4268,6 +4268,121 @@ check_bit_cast_type (const constexpr_ctx
return false;
  }
  
+/* Helper function for cxx_eval_bit_cast.  For unsigned char or

+   std::byte members of CONSTRUCTOR (recursively) if they contain
+   some indeterminate bits (as set in MASK), remove the ctor elts,
+   mark the CONSTRUCTOR as CONSTRUCTOR_NO_CLEARING and clear the
+   bits in MASK.  */
+
+static void
+clear_uchar_or_std_byte_in_mask (location_t loc, tree t, unsigned char *mask)
+{
+  if (TREE_CODE (t) != CONSTRUCTOR)
+return;
+
+  unsigned i, j = 0;
+  tree index, value;
+  FOR_EACH_CONSTRUCTOR_ELT (CONSTRUCTOR_ELTS (t), i, index, value)
+{
+  tree type = TREE_TYPE (value);
+  if (TREE_CODE (TREE_TYPE (t)) != ARRAY_TYPE
+ && DECL_BIT_FIELD_TYPE (index) != NULL_TREE)
+   {
+ if (is_byte_access_type (DECL_BIT_FIELD_TYPE (index))
+ && (TYPE_MAIN_VARIANT (DECL_BIT_FIELD_TYPE (index))
+ != char_type_node))
+   {
+ HOST_WIDE_INT fldsz = TYPE_PRECISION (TREE_TYPE (index));
+ gcc_assert (fldsz != 0);
+ HOST_WIDE_INT pos = int_byte_position (index);
+ HOST_WIDE_INT bpos
+   = tree_to_uhwi (DECL_FIELD_BIT_OFFSET (index));
+ bpos %= BITS_PER_UNIT;
+ HOST_WIDE_INT end
+   = ROUND_UP (bpos + fldsz, BITS_PER_UNIT) / BITS_PER_UNIT;
+

Re: [PATCH] middle-end: Skip initialization of opaque type register variables [PR103127]

Sorry for the confusion.
My major question is:  

for a variable of type __vector_pair,  could it be in a register?
If it could be in a register, can we initialize this register with some 
constant value? 

Qing

> On Nov 30, 2021, at 2:07 PM, Peter Bergner  wrote:
> 
> On 11/30/21 1:50 PM, Qing Zhao via Gcc-patches wrote:
>>> void
>>> bar (__vector_pair *dst, __vector_pair *src)
>>> {
>>> __vector_pair pair;
>>> pair = *src;
>>> ...
>>> }
>> 
>> However, even with the above, the memory pointed by “src” still need to
>> be initialized somewhere. How to provide the initial value to the variable
>> in the beginning for __vector_pair type?
> 
> Well no initialization is required here in this function.  Isn't that what
> matters here?  When generating code for bar(), we assume that src already
> points to initialized memory.
> 
> As for what src points to, that could be initialized how any other memory/
> array could be initialized, so either a static array, read in some data
> from a file into an array, compute the array values in a loop, etc. etc.
> 
> Peter
>

Re: [PATCH, fortran] Improve expansion of constant array expressions within constructors


Hello,

On 27/11/2021 21:56, Harald Anlauf via Fortran wrote:

diff --git a/gcc/fortran/array.c b/gcc/fortran/array.c
index 6552eaf3b0c..fbc66097c80 100644
--- a/gcc/fortran/array.c
+++ b/gcc/fortran/array.c
@@ -1804,6 +1804,12 @@ expand_constructor (gfc_constructor_base base)
   if (empty_constructor)
empty_ts = e->ts;

+  /* Simplify constant array expression/section within constructor.  */
+  if (e->expr_type == EXPR_VARIABLE && e->rank > 0 && e->ref
+ && e->symtree && e->symtree->n.sym
+ && e->symtree->n.sym->attr.flavor == FL_PARAMETER)
+   gfc_simplify_expr (e, 0);
+
   if (e->expr_type == EXPR_ARRAY)
{
  if (!expand_constructor (e->value.constructor))


There is another simplification call just a few lines below, that I 
thought could just be moved up.
But it works on a copy of the expression, and managing the copy makes it 
complex as well, so let’s do it your way.


OK.

[PATCH] c++: don't fold away 'if' with constant condition

richi's recent unreachable code warning experiments had trouble with the C++
front end folding away an 'if' with a constant condition.  Let's do less
folding at the statement level.  Thanks to Marek for finding the offending
code.

Tested x86_64-pc-linux-gnu, applying to trunk.

gcc/cp/ChangeLog:

* cp-gimplify.c (genericize_if_stmt): Always build a COND_EXPR.
---
 gcc/cp/cp-gimplify.c | 7 ++-----
 1 file changed, 2 insertions(+), 5 deletions(-)

diff --git a/gcc/cp/cp-gimplify.c b/gcc/cp/cp-gimplify.c
index 0988655eeba..0a002db14e7 100644
--- a/gcc/cp/cp-gimplify.c
+++ b/gcc/cp/cp-gimplify.c
@@ -166,11 +166,8 @@ genericize_if_stmt (tree *stmt_p)
  can contain unfolded immediate function calls, we have to discard
  the then_ block regardless of whether else_ has side-effects or not.  */
   if (IF_STMT_CONSTEVAL_P (stmt))
-stmt = else_;
-  else if (integer_nonzerop (cond) && !TREE_SIDE_EFFECTS (else_))
-stmt = then_;
-  else if (integer_zerop (cond) && !TREE_SIDE_EFFECTS (then_))
-stmt = else_;
+stmt = build3 (COND_EXPR, void_type_node, boolean_false_node,
+  void_node, else_);
   else
 stmt = build3 (COND_EXPR, void_type_node, cond, then_, else_);
   protected_set_expr_location_if_unset (stmt, locus);

base-commit: 92de188ea3d36ec012b6d42959d4722e42524256
-- 
2.27.0

[committed] wwwdocs: gcc-12: Tweak language in the Fortran section

2021-11-30 Thread Gerald Pfeifer

Pushed.

---
 htdocs/gcc-12/changes.html | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/htdocs/gcc-12/changes.html b/htdocs/gcc-12/changes.html
index 10ac025f..45a8d99a 100644
--- a/htdocs/gcc-12/changes.html
+++ b/htdocs/gcc-12/changes.html
@@ -322,7 +322,7 @@ a work-in-progress.
 GCC 12 now uses OPERATION as the name of the function to
 the CO_REDUCE intrinsic for the pairwise reduction, thus
 conforming to the Fortran 2018 standard.  Previous versions
-used OPERATOR, which conformed to TS 18508.
+used OPERATOR which conforms to TS 18508.
   
 
 
-- 
2.34.0

[committed] wwwdocs: gcc-4.7: Update reference to Go 1 language standard

2021-11-30 Thread Gerald Pfeifer

Just a trivial, if permanent redirect, to follow.

Pushed, Gerald

---
 htdocs/gcc-4.7/changes.html | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/htdocs/gcc-4.7/changes.html b/htdocs/gcc-4.7/changes.html
index 21294cc3..846946d6 100644
--- a/htdocs/gcc-4.7/changes.html
+++ b/htdocs/gcc-4.7/changes.html
@@ -1017,8 +1017,7 @@ complete (that is, it is possible that some PRs that have 
been fixed
 are not listed here).
 
 The Go front end in the 4.7.1 release fully supports
-the <a href="https://golang.org/doc/go1">Go 1 language
-standard</a>.
+the <a href="https://go.dev/doc/go1">Go 1 language standard</a>.
 
 GCC 4.7.2
 
-- 
2.34.0

[committed] wwwdocs: readings: Switch the DWARF Workgroup to https

2021-11-30 Thread Gerald Pfeifer

While we are at it, remove the unnecessary trailing slash.

Pushed, Gerald

---
 htdocs/readings.html | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/htdocs/readings.html b/htdocs/readings.html
index e75bfc49..12755d7e 100644
--- a/htdocs/readings.html
+++ b/htdocs/readings.html
@@ -595,7 +595,7 @@ names.
   <a href="http://refspecs.linux-foundation.org/elf/elfspec_ppc.pdf">System
   V PowerPC ABI</a>
 
-  <a href="http://dwarfstd.org/">DWARF Workgroup</a>
+  <a href="https://dwarfstd.org">DWARF Workgroup</a>
 
 
 
-- 
2.34.0

Re: [PATCH] tree-optimization/98956 Optimizing out boolean left shift

2021-11-30 Thread Andrew Pinski via Gcc-patches

On Tue, Nov 30, 2021 at 8:35 AM Navid Rahimi via Gcc-patches
 wrote:
>
> Hi GCC community,
>
> This patch will add the missed pattern described in bug 98956 [1] to the 
> match.pd. The codegen and correctness proof for this pattern is here [2,3] in 
> case anyone is curious. Tested on x86_64 Linux.
>

A better way to optimize this is the following (which I describe in PR 64992):
 take: (t << 1) != 0;

This should be transformed into:
(t & 0x7fff) != 0

The rest will just fall out really.  That is there is no reason to
special case bool here.
I have most of the patch except for creating the mask part which
should be simple, I just did not want to look up the wi:: functions at
the time I was writing it into the bug report.

Thanks,
Andrew Pinski
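
A quick way to convince oneself of the equivalence is a small sketch like the
one below.  It assumes a 32-bit unsigned operand, so the mask is written out
as 0x7fffffff (the mask width follows the width of the type); the function
names are hypothetical and none of this is taken from the match.pd patch.

```cpp
#include <cassert>
#include <cstdint>

// Original form: shifting left by one discards the top bit.
bool shifted_nonzero(std::uint32_t t) { return (t << 1) != 0; }

// Proposed form: mask off the top bit instead of shifting.
bool masked_nonzero(std::uint32_t t) { return (t & 0x7fffffffu) != 0; }

// Compare both forms on a few boundary values.
bool forms_agree()
{
  const std::uint32_t samples[] = { 0u, 1u, 2u, 0x7fffffffu,
                                    0x80000000u, 0x80000001u, 0xffffffffu };
  for (std::uint32_t t : samples)
    if (shifted_nonzero(t) != masked_nonzero(t))
      return false;
  return true;
}
```

The only value where the top bit matters is 0x80000000: both forms treat it
as zero, which is why no special case for bool is needed.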



> Tree-optimization/98956:
>
> Adding new optimization to match.pd:
> * match.pd ((B0 << x) cmp 0) -> B0 cmp 0 : New optimization.
> * gcc.dg/tree-ssa/pr98956.c: testcase for this optimization.
> * gcc.dg/tree-ssa/pr98956-2.c: testcase for node with 
> side-effect.
>
> 1) https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98956
> 2) https://compiler-explorer.com/z/nj4PTrecW
> 3) https://alive2.llvm.org/ce/z/jyJAoS
>
> Best wishes,
> Navid.

Re: [PATCH] Avoid some -Wunreachable-code-ctrl


On 11/29/21 10:03, Richard Biener via Gcc-patches wrote:

This cleans up unreachable code diagnosed by -Wunreachable-code-ctrl.
It largely follows the previous series but discovers a few extra
cases, namely dead code after break or continue or loops without
exits.

Bootstrapped on x86_64-unknown-linux-gnu, testing in progress.

Richard.

2021-11-29  Richard Biener  

gcc/c/
* gimple-parser.c (c_parser_gimple_postfix_expression):
Avoid unreachable code after break.

gcc/
* cfgrtl.c (skip_insns_after_block): Refactor code to
be more easily readable.
* expr.c (op_by_pieces_d::run): Remove unreachable
assert.
* sched-deps.c (sched_analyze): Remove unreachable
gcc_unreachable.
* sel-sched-ir.c (in_same_ebb_p): Likewise.
* tree-ssa-alias.c (nonoverlapping_refs_since_match_p):
Remove unreachable code.
* tree-vect-slp.c (vectorize_slp_instance_root_stmt):
Refactor to avoid unreachable loop iteration.
* tree.c (walk_tree_1): Remove unreachable break.
* vec-perm-indices.c (vec_perm_indices::series_p): Remove
unreachable return.

gcc/cp/
* parser.c (cp_parser_postfix_expression): Remove
unreachable code.
* pt.c (tsubst_expr): Remove unreachable breaks.

gcc/fortran/
* frontend-passes.c (gfc_expr_walker): Remove unreachable
break.
* scanner.c (skip_fixed_comments): Remove unreachable
gcc_unreachable.
* trans-expr.c (gfc_expr_is_variable): Refactor to make
control flow more obvious.
---
  gcc/c/gimple-parser.c |  8 +---
  gcc/cfgrtl.c  | 10 ++
  gcc/cp/parser.c   |  4 
  gcc/cp/pt.c   |  2 --
  gcc/expr.c|  3 ---
  gcc/fortran/frontend-passes.c |  1 -
  gcc/fortran/scanner.c |  1 -
  gcc/fortran/trans-expr.c  | 11 +++
  gcc/sched-deps.c  |  2 --
  gcc/sel-sched-ir.c|  3 ---
  gcc/tree-ssa-alias.c  |  3 ---
  gcc/tree-vect-slp.c   | 22 --
  gcc/tree.c|  2 --
  gcc/vec-perm-indices.c|  1 -
  14 files changed, 14 insertions(+), 59 deletions(-)

diff --git a/gcc/c/gimple-parser.c b/gcc/c/gimple-parser.c
index 32f22dbb8a7..f594a8ccb31 100644
--- a/gcc/c/gimple-parser.c
+++ b/gcc/c/gimple-parser.c
@@ -1698,13 +1698,7 @@ c_parser_gimple_postfix_expression (gimple_parser 
&parser)
}
  break;
}
-  else
-   {
- c_parser_error (parser, "expected expression");
- expr.set_error ();
- break;
-   }
-  break;
+  /* Fallthru.  */
  default:
c_parser_error (parser, "expected expression");
expr.set_error ();
diff --git a/gcc/cfgrtl.c b/gcc/cfgrtl.c
index 3744adcc2ba..287a3db643a 100644
--- a/gcc/cfgrtl.c
+++ b/gcc/cfgrtl.c
@@ -3539,14 +3539,8 @@ skip_insns_after_block (basic_block bb)
  continue;
  
  	case NOTE:

- switch (NOTE_KIND (insn))
-   {
-   case NOTE_INSN_BLOCK_END:
- gcc_unreachable ();
-   default:
- continue;
-   }
- break;
+ gcc_assert (NOTE_KIND (insn) != NOTE_INSN_BLOCK_END);
+ continue;
  
  	case CODE_LABEL:

  if (NEXT_INSN (insn)
diff --git a/gcc/cp/parser.c b/gcc/cp/parser.c
index 0bd58525726..cc88a36dd39 100644
--- a/gcc/cp/parser.c
+++ b/gcc/cp/parser.c
@@ -7892,10 +7892,6 @@ cp_parser_postfix_expression (cp_parser *parser, bool 
address_p, bool cast_p,
  return postfix_expression;
}
  }
-
-  /* We should never get here.  */
-  gcc_unreachable ();


Hmm, I generally disagree with removing gcc_unreachable() asserts 
because they are unreachable; it seems like it increases the fragility 
of the code in case later changes wrongly make them reachable.



-  return error_mark_node;
  }
  
  /* Helper function for cp_parser_parenthesized_expression_list and

diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c
index 31ed773e145..f4b9d9673fb 100644
--- a/gcc/cp/pt.c
+++ b/gcc/cp/pt.c
@@ -18242,13 +18242,11 @@ tsubst_expr (tree t, tree args, tsubst_flags_t 
complain, tree in_decl,
stmt = finish_co_yield_expr (input_location,
   RECUR (TREE_OPERAND (t, 0)));
RETURN (stmt);
-  break;
  
  case CO_AWAIT_EXPR:

stmt = finish_co_await_expr (input_location,
   RECUR (TREE_OPERAND (t, 0)));
RETURN (stmt);
-  break;
  
  case EXPR_STMT:

tmp = RECUR (EXPR_STMT_EXPR (t));
diff --git a/gcc/expr.c b/gcc/expr.c
index 5673902b1fc..b2815257509 100644
--- a/gcc/expr.c
+++ b/gcc/expr.c
@@ -1342,9 +1342,6 @@ op_by_pieces_d::run ()
}
  }
while (1);
-
-  /* The code above should have handled everything.  */
-  gcc_assert (!length);
  }
  
  /* Derived class from op_by_pieces_d, providing support for block move

diff

[PATCH v2 1/2] add -Wuse-after-free

2021-11-30 Thread Martin Sebor via Gcc-patches


Attached is a revised patch with the following changes based
on your comments:

1) Set and use statement uids to determine which statement
   precedes which in the same basic block.
2) Avoid testing flag_isolate_erroneous_paths_dereference.
3) Use post-dominance to decide whether to use the "maybe"
   phrasing vs a definite form.

David raised (and in our offline discussion today reiterated)
an objection to the default setting of the option being
the strictest.  I have not changed that in this revision.
See my rationale for this choice in my reply below:
https://gcc.gnu.org/pipermail/gcc-patches/2021-November/583176.html

Martin

On 11/23/21 2:16 PM, Martin Sebor wrote:

On 11/22/21 6:32 PM, Jeff Law wrote:



On 11/1/2021 4:17 PM, Martin Sebor via Gcc-patches wrote:

Patch 1 in the series detects a small subset of uses of pointers
made indeterminate by calls to deallocation functions like free
or C++ operator delete.  To control the conditions the warnings
are issued under, the new -Wuse-after-free= option provides three
levels.  At the lowest level the warning triggers only for
unconditional uses of freed pointers and doesn't warn for uses
in equality expressions.  Level 2 warns also for some conditional
uses, and level 3 also for uses in equality expressions.
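
As a rough illustration of the three kinds of use distinguished above, here is a C sketch.  The function names are made up for this example, and the level noted for each function follows the description in this message rather than any tested compiler output.  The first three functions have undefined behavior and are meant only to be compiled, not called; read_before_free shows the conforming form.

```c
#include <stdlib.h>

/* Hypothetical names throughout.  Only read_before_free is safe to call.  */

/* Unconditional use of a freed pointer: the kind of use the lowest
   level is described as diagnosing.  */
int
use_unconditionally (void)
{
  int *p = malloc (sizeof *p);
  if (!p)
    return -1;
  *p = 1;
  free (p);
  return *p;            /* every path reaches this read after free  */
}

/* Conditional use: the read happens only on some paths after the free.
   Level 2 is described as covering some such cases.  */
int
use_conditionally (int flag)
{
  int *p = malloc (sizeof *p);
  if (!p)
    return -1;
  *p = 2;
  free (p);
  if (flag)
    return *p;          /* reached only when FLAG is nonzero  */
  return 0;
}

/* Use in an equality expression: comparing the old pointer with the
   result of realloc.  Level 3 is described as covering this.  */
int
moved_by_realloc (void)
{
  char *p = malloc (8);
  if (!p)
    return -1;
  char *q = realloc (p, 4096);
  if (!q)
    {
      free (p);
      return -1;
    }
  int moved = q != p;   /* P is indeterminate if realloc succeeded  */
  free (q);
  return moved;
}

/* Conforming form: read the value before releasing the storage.  */
int
read_before_free (void)
{
  int *p = malloc (sizeof *p);
  if (!p)
    return -1;
  *p = 1;
  int val = *p;
  free (p);
  return val;
}
```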

I debated whether to make level 2 or 3 the default included in
-Wall.  I decided on 3 for two reasons: 1) to raise awareness
of both the problem and GCC's new ability to detect it: using
a pointer after it's been freed, even only in principle, by
a successful call to realloc, is undefined, and 2) because
it's trivial to lower the level either globally, or locally
by suppressing the warning around such misuses.
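
A local suppression of the kind mentioned above might look like the following sketch.  It relies on the existing #pragma GCC diagnostic mechanism; the option name is the one this patch proposes, and grow_buffer is a hypothetical helper, not code from GCC, libiberty or Glibc.

```c
#include <stdlib.h>

/* Hypothetical helper: grow *BUF to NEW_SIZE bytes.  Returns 1 if the
   block moved, 0 if it stayed in place, and -1 on failure, in which
   case *BUF is left unchanged.  */
int
grow_buffer (char **buf, size_t new_size)
{
  char *old = *buf;
  char *tmp = realloc (old, new_size);
  if (!tmp)
    return -1;
  *buf = tmp;

  /* OLD is indeterminate after a successful realloc, so the comparison
     below is the kind of equality use level 3 is described as flagging.
     The pragmas silence the warning for this statement only.  */
#pragma GCC diagnostic push
#pragma GCC diagnostic ignored "-Wuse-after-free"
  int moved = tmp != old;
#pragma GCC diagnostic pop

  return moved;
}
```

The pragma only hides the diagnostic; the comparison itself remains formally undefined.  Where that matters, one alternative is to convert the old pointer to uintptr_t before the realloc call and compare the integer values afterwards.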

I've tested the patch on x86_64-linux and by building Glibc
and Binutils/GDB.  It triggers a number of times in each, all
due to comparing invalidated pointers for equality (i.e., level
3).  I have suppressed these in GCC (libiberty) by a #pragma,
and will see how the Glibc folks want to deal with theirs (I
track them in BZ #28521).

The tests contain a number of xfails due to limitations I'm
aware of.  I marked them pr?? until the patch is approved.
I will open bugs for them before committing if I don't resolve
them in a followup.

Martin

gcc-63272-1.diff

Add -Wuse-after-free.

gcc/c-family/ChangeLog

* c.opt (-Wuse-after-free): New options.

gcc/ChangeLog:

* diagnostic-spec.c (nowarn_spec_t::nowarn_spec_t): Handle
OPT_Wreturn_local_addr and OPT_Wuse_after_free_.
* diagnostic-spec.h (NW_DANGLING): New enumerator.
* doc/invoke.texi (-Wuse-after-free): Document new option.
* gimple-ssa-warn-access.cc (pass_waccess::check_call): Rename...
(pass_waccess::check_call_access): ...to this.
(pass_waccess::check): Rename...
(pass_waccess::check_block): ...to this.
(pass_waccess::check_pointer_uses): New function.
(pass_waccess::gimple_call_return_arg): New function.
(pass_waccess::warn_invalid_pointer): New function.
(pass_waccess::check_builtin): Handle free and realloc.
(gimple_use_after_inval_p): New function.
(get_realloc_lhs): New function.
(maybe_warn_mismatched_realloc): New function.
(pointers_related_p): New function.
(pass_waccess::check_call): Call check_pointer_uses.
(pass_waccess::execute): Compute and free dominance info.

libcpp/ChangeLog:

* files.c (_cpp_find_file): Substitute a valid pointer for
an invalid one to avoid -Wuse-after-free.

libiberty/ChangeLog:

* regex.c: Suppress -Wuse-after-free.

gcc/testsuite/ChangeLog:

* gcc.dg/Wmismatched-dealloc-2.c: Avoid -Wuse-after-free.
* gcc.dg/Wmismatched-dealloc-3.c: Same.
* gcc.dg/attr-alloc_size-6.c: Disable -Wuse-after-free.
* gcc.dg/attr-alloc_size-7.c: Same.
* c-c++-common/Wuse-after-free-2.c: New test.
* c-c++-common/Wuse-after-free-3.c: New test.
* c-c++-common/Wuse-after-free-4.c: New test.
* c-c++-common/Wuse-after-free-5.c: New test.
* c-c++-common/Wuse-after-free-6.c: New test.
* c-c++-common/Wuse-after-free-7.c: New test.
* c-c++-common/Wuse-after-free.c: New test.
* g++.dg/warn/Wdangling-pointer.C: New test.
* g++.dg/warn/Wmismatched-dealloc-3.C: New test.
* g++.dg/warn/Wuse-after-free.C: New test.

diff --git a/gcc/gimple-ssa-warn-access.cc b/gcc/gimple-ssa-warn-access.cc

index 63fc27a1487..2065402a2b9 100644
--- a/gcc/gimple-ssa-warn-access.cc
+++ b/gcc/gimple-ssa-warn-access.cc

@@ -3397,33 +3417,460 @@ pass_waccess::maybe_check_dealloc_call (gcall *call)

  }
  }
+/* Return true if either USE_STMT's basic block (that of a pointer's use)
+   is dominated by INVAL_STMT's (that of a pointer's invalidating statement,
+   which is either a clobber or a deallocation call), or if they're in
+   the same block, USE_STMT follows INVAL_STMT.  */
+
+static bool
+gimple_use_after_inval_p (gimple *inval_stmt, gimple *use_stmt,
+  bool last_block = false)
+{
+  tree clobvar =
+    gimple_clobber_p (inval_stmt) ?

Re: [PATCH] middle-end: Skip initialization of opaque type register variables [PR103127]