date:20210830

C++ patch ping

2021-08-30 Thread Jakub Jelinek via Gcc-patches

Hi!

I'd like to ping the following patches

libcpp: __VA_OPT__ p1042r1 placemarker changes [PR101488]
https://gcc.gnu.org/pipermail/gcc-patches/2021-July/575621.html
together with your
https://gcc.gnu.org/pipermail/gcc-patches/2021-August/577602.html
incremental patch (successfully tested on x86_64-linux and i686-linux).

libcpp, v2: Implement C++23 P1949R7 - C++ Identifier Syntax using Unicode 
Standard Annex 31 [PR100977]
https://gcc.gnu.org/pipermail/gcc-patches/2021-August/576854.html
The incremental patch for splitting tokens at unsupported characters
withdrawn, the above is the base patch.

Thanks.

Jakub

Re: [PATCH take 3] Experimental -fpreserve-traps option

2021-08-30 Thread Richard Biener via Gcc-patches

On Sun, Aug 29, 2021 at 11:28 AM Roger Sayle  wrote:
>
>
> This is another attempt to add an -fpreserve-traps command line
> option to GCC.  Many thanks to Richard Beiner for approving the
> code clean-up pieces of my earlier submission: This revision
> contains just the actual functional change(s).
>
> My recent attempt at humour in comment #8 of PR 77980 has
> reminded me that we still really need to tackle/improve the
> semantics of trapping instructions.
>
> Firstly, I completely agree with Richard (and Eric) that the
> current situation with the semantics of -fexceptions and
> -fnon-call-exceptions (especially the latter without the
> former) is a mess.  Alas, the testsuite (and source) is full
> of checks that the current status quo is preserved, so any
> attempts to tweak exception handling leads to a world of pain.

I'm giving the following a try now which at least would remove
the '-fno-exceptions -fnon-call-exceptions' state (it then might ask
for unification of the flag_ vars to an enum, but that would be for
a followup).

diff --git a/gcc/common.opt b/gcc/common.opt
index ed8ab5fbe13..7d69ab5ef7c 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -1509,7 +1509,7 @@ Common Var(flag_emit_class_debug_always) Init(0)
 Do not suppress C++ class debug information.

 fexceptions
-Common Var(flag_exceptions) Optimization
+Common Var(flag_exceptions) Optimization EnabledBy(fnon-call-exceptions)
 Enable exception handling.

 fexpensive-optimizations

> The most relevant thing that I would like to point out is that
> this proposed option, -fpreserve-traps, and the semantics of
> trapping instructions are completely orthogonal to exception
> handling.  As the term EH implies, exceptions are a form of flow
> control (on those targets that support it).  Traps, signals(?),
> segmentation faults, illegal accesses and invalid instructions
> are things that can't (always?) be controlled.  Exceptional yes,
> but not handleable.

Are they really independent?  Consider -fnon-call-exceptions
where trapping instructions (may) raise exceptions - now with
your change (ENOPATCH btw...) any such instructions would
be considered having side-effects despite the 'trapping' bit
now covered by EH IL.  Now - individual stmts can be marked
as known to not throw, what does this now mean for its
trapping state?  How do -fpreserve-traps and -fdelete-dead-exceptions
interact here?

I think that going down the -fpreserve-traps way opens a whole
can of worms with respects to users doing weird things and us
having to maintain sth not well thought out - so IMHO adding
a user-visible option shouldn't be the first thing to do.

IIRC we got -ftrapv because there were frontends requiring
its semantics (Java)?  But even -ftrapv is utterly broken and
only "works" in very small circumstances (SImode and DImode
operations).  But now -ftrapv interacts with -fpreserve-traps
as well given it makes some (not very well defined) operations
possibly trapping.

What does preserving a trap mean btw, does it require to
preserve the exact trapping instruction (a trap handler might
inspect the trapping instruction after all)?  Or are we allowed
to replace a known to be trapping instruction by sth else
that also traps?  I suppose traps will be non-recoverable
(you have -fnon-call-exceptions and EH as the only mechanism
to resume after a trap).

Does -fpreserve-traps imply that a trap is now sth well-defined
and that for example a division by zero is no longer considered
undefined behavior (with all consequences for VRP and thus
removing unused divisions)?  Is any operation triggering
undefined behavior now possibly trapping?

In the end the main question to ask is probably what the
motivation behind adding -fpreserve-traps is?  Is it making
a quite vaguely defined part of GCCs behavior properly
defined?  Is it giving users much needed control about
"preserving traps"?

I hoped the one jumping on the "traps" bandwagon would
have tackled -ftrapv first (thankfully very few references
in GCC itself remain), by nuking flag_trapv from the middle-end
and reflecting -ftrapv to the IL from the two FEs it is exposed
to (C and C++).  We did have this situation with adding some
extra option, only confusing things further, in the past - not
sure if you remember -fstrict-overflow which added a third
kind of overflow behavior.

Thanks,
Richard.

> I also agree with Richard's sentiment that GCC should aim to
> store semantics in the representation, not in global variables.
> Interestingly, whether a tree or rtx can trap has been explicitly
> represented in every allocated node for decades; it's what the
> compiler should do (or is allowed to do) with that information
> that has been missing.
>
> My proposal is to treat -fpreserve-traps as an experiment for
> the time being.  Allowing all the places in the compiler that
> make assumptions about traps to be found/documented (by searching
> for flag_perserve_traps in the source code), and their influence
> on performance be

[PATCH] Make sure -fexceptions is enabled when -fnon-call-exceptions is

2021-08-30 Thread Richard Biener via Gcc-patches

This makes -fexceptions enabled by -fnon-call-exceptions, removing
the odd state of !flag_exceptions && flag_non_call_exceptions from
middle-end consideration.

Bootstrapped and tested on x86_64-unknown-linux-gnu for all languages.

OK?
Richard.

2021-08-30  Richard Biener  

* common.opt (fexceptions): Mark
EnabledBy(fnon-call-exceptions).
* doc/invoke.texi (fnon-call-exceptions): Document this
enables -fexceptions.
---
 gcc/common.opt  | 2 +-
 gcc/doc/invoke.texi | 3 ++-
 2 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/gcc/common.opt b/gcc/common.opt
index ed8ab5fbe13..7d69ab5ef7c 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -1509,7 +1509,7 @@ Common Var(flag_emit_class_debug_always) Init(0)
 Do not suppress C++ class debug information.
 
 fexceptions
-Common Var(flag_exceptions) Optimization
+Common Var(flag_exceptions) Optimization EnabledBy(fnon-call-exceptions)
 Enable exception handling.
 
 fexpensive-optimizations
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index b83bd902cec..f7bb193b51d 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -16410,7 +16410,8 @@ Note that this requires platform-specific runtime 
support that does
 not exist everywhere.  Moreover, it only allows @emph{trapping}
 instructions to throw exceptions, i.e.@: memory references or floating-point
 instructions.  It does not allow exceptions to be thrown from
-arbitrary signal handlers such as @code{SIGALRM}.
+arbitrary signal handlers such as @code{SIGALRM}.  This enables
+@option{-fexceptions}.
 
 @item -fdelete-dead-exceptions
 @opindex fdelete-dead-exceptions
-- 
2.31.1

Re: [PATCH] Make sure -fexceptions is enabled when -fnon-call-exceptions is

2021-08-30 Thread Eric Botcazou

> This makes -fexceptions enabled by -fnon-call-exceptions, removing
> the odd state of !flag_exceptions && flag_non_call_exceptions from
> middle-end consideration.

FWIW fine with me.

-- 
Eric Botcazou

Re: [PATCH][pushed] Add -fprofile-reproducible=parallel-runs to STAGEfeedback_CFLAGS to Makefile.tpl.

2021-08-30 Thread Martin Liška


On 8/27/21 19:45, Gleb Fotengauer-Malinovskiy wrote:

Hi,

On Thu, Mar 11, 2021 at 04:19:51PM +0100, Martin Liška wrote:

Pushed as obvious, the original change was done
in g:e05a117dc4b98f3ac60851608f532ba7cee7343a.

Martin

ChangeLog:

* Makefile.tpl: The change was done Makefile.in which
is generated file.
---
   Makefile.tpl | 2 +-
   1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/Makefile.tpl b/Makefile.tpl
index 3b88f351d5b..6e0337fb48f 100644
--- a/Makefile.tpl
+++ b/Makefile.tpl
@@ -488,7 +488,7 @@ STAGEprofile_TFLAGS = $(STAGE2_TFLAGS)
   STAGEtrain_CFLAGS = $(filter-out -fchecking=1,$(STAGE3_CFLAGS))
   STAGEtrain_TFLAGS = $(filter-out -fchecking=1,$(STAGE3_TFLAGS))
   
-STAGEfeedback_CFLAGS = $(STAGE4_CFLAGS) -fprofile-use

+STAGEfeedback_CFLAGS = $(STAGE4_CFLAGS) -fprofile-use 
-fprofile-reproducible=parallel-runs


Did you mean to add -fprofile-reproducible flag to STAGEprofile_CFLAGS
instead?


No.


I suppose this flag doesn't mean anything without
-fprofile-generate, which is passed through STAGEprofile_CFLAGS.


The flag -fprofile-reproducible is used when we use a collected profile, so it 
works
*only* with -fprofile-use flag.

Thanks for heads up!
Cheers,
Martin



Thanks,

Re: [PATCH] libffi: Fix MIPS r6 support

2021-08-30 Thread YunQiang Su


在 2021/8/30 5:00, Jeff Law 写道:



On 8/28/2021 1:23 AM, Xi Ruoyao wrote:

On Fri, 2021-08-27 at 15:28 -0600, Jeff Law via Gcc-patches wrote:


On 8/26/2021 10:58 PM, YunQiang Su wrote:

for some instructions, MIPS r6 uses different encoding other than
the previous releases.

1. mips/n32.S disable .set mips4: since it casuses old insn encoding
 is used.
 https://github.com/libffi/libffi/pull/396
2. mips/ffi.c: the encoding for JR is hardcoded: we need to use
 different value for r6 and pre-r6.
 https://github.com/libffi/libffi/pull/401

libffi/
 PR libffi/83636
 * src/mips/n32.S: disable .set mips4
 * src/mips/ffi.c: use different JR encoding for r6.

These should go to the upstream libffi project.  Once accepted there
you
can add them to GCC.

Hi Jeff,

The two PRs are already merged, and released since libffi-3.3.0 (now the
upstream latest release is 3.4.2).
ACK.  Thanks for confirming.  It's always OK to cherrypick/backport from 
libffi back to GCC.




I don't have a MIPSr6 so I can't test though.
Understood.   Me neither, but I really should get a tiny chroot for 
mipsr6 so that my tester can validate it regularly.




We have port Debian to MIPS64r6el.
http://58.246.137.130:20180/tools/tarball/
You can use both buster(Debian 10) or bullseye (Debian 11).

And both qemu-system and qemu-user can work.



Jeff
.

Re: [PATCH v3] Fix incomplete computation in fill_always_executed_in_1

2021-08-30 Thread Xionghu Luo via Gcc-patches




On 2021/8/27 15:45, Richard Biener wrote:

On Thu, 26 Aug 2021, Xionghu Luo wrote:




On 2021/8/24 16:20, Richard Biener wrote:

On Tue, 24 Aug 2021, Xionghu Luo wrote:




On 2021/8/19 20:11, Richard Biener wrote:

-  class loop *inn_loop = loop;

  if (ALWAYS_EXECUTED_IN (loop->header) == NULL)

{
@@ -3232,19 +3231,6 @@ fill_always_executed_in_1 (class loop *loop, sbitmap 
contains_call)
 to disprove this if possible).  */
  if (bb->flags & BB_IRREDUCIBLE_LOOP)
break;
-
- if (!flow_bb_inside_loop_p (inn_loop, bb))
-   break;
-
- if (bb->loop_father->header == bb)
-   {
- if (!dominated_by_p (CDI_DOMINATORS, loop->latch, bb))
-   break;
-
- /* In a loop that is always entered we may proceed anyway.
-But record that we entered it and stop once we leave it.  */
- inn_loop = bb->loop_father;
-   }
}

  while (1)

I'm not sure this will work correct (I'm not sure how the existing
code makes it so either...).  That said, I can't poke any hole
into the change.  What I see is that definitely

 if (dominated_by_p (CDI_DOMINATORS, loop->latch, bb))
   last = bb;

 if (bitmap_bit_p (contains_call, bb->index))
   break;

doesn't work reliably since the DOM ordering will process blocks
A B and C in random order for

 for (;;)
  {
 if (cond)
   {
 A: foo ();
   }
 else B:;
 C:;
  }

and thus we can end up setting 'last' to C_before_  processing
'A' and thus arriving at the call foo () ...

get_loop_body_in_dom_order does some "special sauce" but not
to address the above problem - but it might be that a subtle
issue like the above is the reason for the inner loop handling.
The inner loop block order does_not_  adhere to this "special sauce",
that is - the "Additionally, if a basic block s dominates
the latch, then only blocks dominated by s are be after it."
guarantee holds for the outer loop latch, not for the inner.

Digging into the history of fill_always_executed_in_1 doesn't
reveal anything - the inner loop handling has been present
since introduction by Zdenek - but usually Zdenek has a reason
for doing things as he does;)


Yes, this is really complicated usage, thanks for point it out. :)
I constructed two cases to verify this with inner loop includes "If A; else B; 
C".
Finding that fill_sons_in_loop in get_loop_body_in_dom_order will also checks
whether the bb domintes outer loop’s latch, if C dominate outer loop’s latch,
C is postponed, the access order is ABC, 'last' won’t be set to C if A or B 
contains call;


But it depends on the order of visiting ABC and that's hard to put into
a testcase since it depends on the order of edges and the processing
of the dominance computation.  ABC are simply unordered with respect
to a dominator walk.


Otherwise if C doesn’t dominate outer loop’s latch in fill_sons_in_loop,
the access order is CAB, but 'last' also won’t be updated to C in 
fill_always_executed_in_1
since there is also dominate check, then if A or B contains call, it could break
successfully.

C won't be set to ALWAYS EXECUTED for both circumstance.



Note it might be simply a measure against quadratic complexity,
esp. since with your patch we also dive into not always executed
subloops as you remove the

 if (!dominated_by_p (CDI_DOMINATORS, loop->latch, bb))
   break;

check.  I suggest to evaluate behavior of the patch on a testcase
like

void foo (int n, int **k)
{
 for (int i = 0; i < n; ++i)
   if (k[0][i])
 for (int j = 0; j < n; ++j)
   if (k[1][j])
 for (int l = 0; l < n; ++l)
   if (k[2][l])
 ...
}


Theoretically the complexity is changing from L1(bbs) to 
L1(bbs)+L2(bbs)+L3(bbs)+…+Ln(bbs),
so fill_always_executed_in_1's execution time is supposed to be increase from
O(n) to O(n2)?  The time should depend on loop depth and bb counts.   I also 
drafted a
test case has 73-depth loop function with 25 no-ipa function copies each 
compiled
in lim2 and lim4 dependently.  Total execution time of 
fill_always_executed_in_1 is
increased from 32ms to 58ms, almost doubled but not quadratic?


It's more like n + (n-1) + (n-2) + ... + 1 which is n^2/2 but that's still
O(n^2).


It seems reasonable to see compiling time getting longer since most bbs are 
checked
more but a MUST to ensure early break correctly in every loop level...
Though loop nodes could be huge, loop depth will never be so large in actual 
code?


The "in practice" argument is almost always defeated by automatic
program generators ;)

   
I suspect you'll see quadratic behavior with your patch.  You

should be at least able to preserve a check like

 /* Do not process not always executed subloops to avoid
quadratic behavior.  */

RE: [PATCH take 3] Experimental -fpreserve-traps option

2021-08-30 Thread Roger Sayle



Hi Richard,

> I hoped the one jumping on the "traps" bandwagon would have tackled -ftrapv 
> first.

I did, back in 2003 I introduced (or helped introduce) the flag "-fwrapv" 😊
https://gcc.gnu.org/legacy-ml/gcc-patches/2003-03/msg02126.html

Prior to that, signed overflow was undefined in the middle-end, and it was 
useful to 
have some way to specify the sensible behaviour of having signed and unsigned 
arithmetic
behave the same way.  I'm delighted that those semantics have become the 
default.

Doh!  Sorry for the ENOPATCH.  I'll wait to hear how your experiments with 
-fnon-call-execeptions
go before trying again, but an interesting additional usage is in match.pd:

diff --git a/gcc/match.pd b/gcc/match.pd
index f421c74..1eeb5eb 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -348,7 +348,9 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
  /* X / bool_range_Y is X.  */
  (simplify
   (div @0 SSA_NAME@1)
-  (if (INTEGRAL_TYPE_P (type) && ssa_name_has_boolean_range (@1))
+  (if (INTEGRAL_TYPE_P (type)
+   && ssa_name_has_boolean_range (@1)
+   && !flag_preserve_traps)
@0))
  /* X / X is one.  */
  (simplify

This is the transformation that leads to the counter-intuitive "quantum" 
behaviour in PR 77980.
It's difficult to believe that all the language front-ends supported by GCC 
consider this transformation
as "safe".  Perhaps 20 years from now, preserving traps will be the sensible 
default.

Cheers,
Roger
--

-Original Message-
From: Richard Biener  
Sent: 30 August 2021 08:13
To: Roger Sayle 
Cc: GCC Patches ; Eric Botcazou 
Subject: Re: [PATCH take 3] Experimental -fpreserve-traps option

On Sun, Aug 29, 2021 at 11:28 AM Roger Sayle  wrote:
>
>
> This is another attempt to add an -fpreserve-traps command line option 
> to GCC.  Many thanks to Richard Beiner for approving the code clean-up 
> pieces of my earlier submission: This revision contains just the 
> actual functional change(s).
>
> My recent attempt at humour in comment #8 of PR 77980 has reminded me 
> that we still really need to tackle/improve the semantics of trapping 
> instructions.
>
> Firstly, I completely agree with Richard (and Eric) that the current 
> situation with the semantics of -fexceptions and -fnon-call-exceptions 
> (especially the latter without the
> former) is a mess.  Alas, the testsuite (and source) is full of checks 
> that the current status quo is preserved, so any attempts to tweak 
> exception handling leads to a world of pain.

I'm giving the following a try now which at least would remove the 
'-fno-exceptions -fnon-call-exceptions' state (it then might ask for 
unification of the flag_ vars to an enum, but that would be for a followup).

diff --git a/gcc/common.opt b/gcc/common.opt index ed8ab5fbe13..7d69ab5ef7c 
100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -1509,7 +1509,7 @@ Common Var(flag_emit_class_debug_always) Init(0)  Do not 
suppress C++ class debug information.

 fexceptions
-Common Var(flag_exceptions) Optimization
+Common Var(flag_exceptions) Optimization 
+EnabledBy(fnon-call-exceptions)
 Enable exception handling.

 fexpensive-optimizations

> The most relevant thing that I would like to point out is that this 
> proposed option, -fpreserve-traps, and the semantics of trapping 
> instructions are completely orthogonal to exception handling.  As the 
> term EH implies, exceptions are a form of flow control (on those 
> targets that support it).  Traps, signals(?), segmentation faults, 
> illegal accesses and invalid instructions are things that can't 
> (always?) be controlled.  Exceptional yes, but not handleable.

Are they really independent?  Consider -fnon-call-exceptions where trapping 
instructions (may) raise exceptions - now with your change (ENOPATCH btw...) 
any such instructions would be considered having side-effects despite the 
'trapping' bit now covered by EH IL.  Now - individual stmts can be marked as 
known to not throw, what does this now mean for its trapping state?  How do 
-fpreserve-traps and -fdelete-dead-exceptions interact here?

I think that going down the -fpreserve-traps way opens a whole can of worms 
with respects to users doing weird things and us having to maintain sth not 
well thought out - so IMHO adding a user-visible option shouldn't be the first 
thing to do.

IIRC we got -ftrapv because there were frontends requiring its semantics 
(Java)?  But even -ftrapv is utterly broken and only "works" in very small 
circumstances (SImode and DImode operations).  But now -ftrapv interacts with 
-fpreserve-traps as well given it makes some (not very well defined) operations 
possibly trapping.

What does preserving a trap mean btw, does it require to preserve the exact 
trapping instruction (a trap handler might inspect the trapping instruction 
after all)?  Or are we allowed to replace a known to be trapping instruction by 
sth else that also traps?  I suppose traps will be non-recoverable (you have 
-fnon-call-exceptions and EH as the only mechanism to re

Re: [PATCH v3] Fix incomplete computation in fill_always_executed_in_1

2021-08-30 Thread Richard Biener via Gcc-patches

On Mon, 30 Aug 2021, Xionghu Luo wrote:

> 
> 
> On 2021/8/27 15:45, Richard Biener wrote:
> > On Thu, 26 Aug 2021, Xionghu Luo wrote:
> > 
> >>
> >>
> >> On 2021/8/24 16:20, Richard Biener wrote:
> >>> On Tue, 24 Aug 2021, Xionghu Luo wrote:
> >>>
> 
> 
>  On 2021/8/19 20:11, Richard Biener wrote:
> >> -  class loop *inn_loop = loop;
> >> 
> >>   if (ALWAYS_EXECUTED_IN (loop->header) == NULL)
> >> {
> >> @@ -3232,19 +3231,6 @@ fill_always_executed_in_1 (class loop *loop,
> >> @@ sbitmap contains_call)
> >>   to disprove this if possible).  */
> >>if (bb->flags & BB_IRREDUCIBLE_LOOP)
> >>break;
> >> -
> >> -if (!flow_bb_inside_loop_p (inn_loop, bb))
> >> -  break;
> >> -
> >> -if (bb->loop_father->header == bb)
> >> -  {
> >> -if (!dominated_by_p (CDI_DOMINATORS, loop->latch, bb))
> >> -  break;
> >> -
> >> -/* In a loop that is always entered we may proceed
> >> anyway.
> >> -   But record that we entered it and stop once we leave
> >> it.  */
> >> -inn_loop = bb->loop_father;
> >> -  }
> >>  }
> >> 
> >>   while (1)
> > I'm not sure this will work correct (I'm not sure how the existing
> > code makes it so either...).  That said, I can't poke any hole
> > into the change.  What I see is that definitely
> >
> >  if (dominated_by_p (CDI_DOMINATORS, loop->latch, bb))
> >last = bb;
> >
> >  if (bitmap_bit_p (contains_call, bb->index))
> >break;
> >
> > doesn't work reliably since the DOM ordering will process blocks
> > A B and C in random order for
> >
> >  for (;;)
> >   {
> >  if (cond)
> >{
> >  A: foo ();
> >}
> >  else B:;
> >  C:;
> >   }
> >
> > and thus we can end up setting 'last' to C_before_  processing
> > 'A' and thus arriving at the call foo () ...
> >
> > get_loop_body_in_dom_order does some "special sauce" but not
> > to address the above problem - but it might be that a subtle
> > issue like the above is the reason for the inner loop handling.
> > The inner loop block order does_not_  adhere to this "special sauce",
> > that is - the "Additionally, if a basic block s dominates
> > the latch, then only blocks dominated by s are be after it."
> > guarantee holds for the outer loop latch, not for the inner.
> >
> > Digging into the history of fill_always_executed_in_1 doesn't
> > reveal anything - the inner loop handling has been present
> > since introduction by Zdenek - but usually Zdenek has a reason
> > for doing things as he does;)
> 
>  Yes, this is really complicated usage, thanks for point it out. :)
>  I constructed two cases to verify this with inner loop includes "If A;
>  else B; C".
>  Finding that fill_sons_in_loop in get_loop_body_in_dom_order will also
>  checks
>  whether the bb domintes outer loop’s latch, if C dominate outer loop’s
>  latch,
>  C is postponed, the access order is ABC, 'last' won’t be set to C if A or
>  B contains call;
> >>>
> >>> But it depends on the order of visiting ABC and that's hard to put into
> >>> a testcase since it depends on the order of edges and the processing
> >>> of the dominance computation.  ABC are simply unordered with respect
> >>> to a dominator walk.
> >>>
>  Otherwise if C doesn’t dominate outer loop’s latch in fill_sons_in_loop,
>  the access order is CAB, but 'last' also won’t be updated to C in
>  fill_always_executed_in_1
>  since there is also dominate check, then if A or B contains call, it
>  could break
>  successfully.
> 
>  C won't be set to ALWAYS EXECUTED for both circumstance.
> 
> >
> > Note it might be simply a measure against quadratic complexity,
> > esp. since with your patch we also dive into not always executed
> > subloops as you remove the
> >
> >  if (!dominated_by_p (CDI_DOMINATORS, loop->latch, bb))
> >break;
> >
> > check.  I suggest to evaluate behavior of the patch on a testcase
> > like
> >
> > void foo (int n, int **k)
> > {
> >  for (int i = 0; i < n; ++i)
> >if (k[0][i])
> >  for (int j = 0; j < n; ++j)
> >if (k[1][j])
> >  for (int l = 0; l < n; ++l)
> >if (k[2][l])
> >  ...
> > }
> 
>  Theoretically the complexity is changing from L1(bbs) to
>  L1(bbs)+L2(bbs)+L3(bbs)+…+Ln(bbs),
>  so fill_always_executed_in_1's execution time is supposed to be increase
>  from
>

Re: [PATCH take 3] Experimental -fpreserve-traps option

2021-08-30 Thread Richard Biener via Gcc-patches

On Mon, Aug 30, 2021 at 11:13 AM Roger Sayle  wrote:
>
>
> Hi Richard,
>
> > I hoped the one jumping on the "traps" bandwagon would have tackled -ftrapv 
> > first.
>
> I did, back in 2003 I introduced (or helped introduce) the flag "-fwrapv" 😊
> https://gcc.gnu.org/legacy-ml/gcc-patches/2003-03/msg02126.html
>
> Prior to that, signed overflow was undefined in the middle-end, and it was 
> useful to
> have some way to specify the sensible behaviour of having signed and unsigned 
> arithmetic
> behave the same way.  I'm delighted that those semantics have become the 
> default.

Actually -fno-wrapv is the default nowadays.  That still is a hassle
whenever the
compiler itself tries to emit twos-complement arithmetic - while there's the
possibility to use unsigned arithmetic for most cases there's no substitute
for signed division (albeit the possibly overflowing case is rare).

What's most annoying is IMHO that we have a bunch of flags controlling
overflow behavior - flag_wrapv, flag_trapv plus flag_sanitize &
SANITIZE_SI_OVERFLOW
(and we've had a separate flag_strict_overflow and we do have a
separate flag_wrapv_pointer now)
but not all possible states have well-defined behavior, like
flag_trapv && flag_wrapv (we disallow
this specific combination of course).

> Doh!  Sorry for the ENOPATCH.  I'll wait to hear how your experiments with 
> -fnon-call-execeptions
> go before trying again, but an interesting additional usage is in match.pd:
>
> diff --git a/gcc/match.pd b/gcc/match.pd
> index f421c74..1eeb5eb 100644
> --- a/gcc/match.pd
> +++ b/gcc/match.pd
> @@ -348,7 +348,9 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
>   /* X / bool_range_Y is X.  */
>   (simplify
>(div @0 SSA_NAME@1)
> -  (if (INTEGRAL_TYPE_P (type) && ssa_name_has_boolean_range (@1))
> +  (if (INTEGRAL_TYPE_P (type)
> +   && ssa_name_has_boolean_range (@1)
> +   && !flag_preserve_traps)
> @0))
>   /* X / X is one.  */
>   (simplify
>
> This is the transformation that leads to the counter-intuitive "quantum" 
> behaviour in PR 77980.
> It's difficult to believe that all the language front-ends supported by GCC 
> consider this transformation
> as "safe".  Perhaps 20 years from now, preserving traps will be the sensible 
> default.

Yes, but there's interplay with what GCC considers undefined behavior.
So wouldn't
it better to have flag_divde_by_zero_traps which says divide by zero
is _not_ undefined
behavior?  I think that -fpreserve-traps applying to -ftrapv is not
what people would expect
(they'd expect dead code to be elided).

Richard.

> Cheers,
> Roger
> --
>
> -Original Message-
> From: Richard Biener 
> Sent: 30 August 2021 08:13
> To: Roger Sayle 
> Cc: GCC Patches ; Eric Botcazou 
> 
> Subject: Re: [PATCH take 3] Experimental -fpreserve-traps option
>
> On Sun, Aug 29, 2021 at 11:28 AM Roger Sayle  
> wrote:
> >
> >
> > This is another attempt to add an -fpreserve-traps command line option
> > to GCC.  Many thanks to Richard Beiner for approving the code clean-up
> > pieces of my earlier submission: This revision contains just the
> > actual functional change(s).
> >
> > My recent attempt at humour in comment #8 of PR 77980 has reminded me
> > that we still really need to tackle/improve the semantics of trapping
> > instructions.
> >
> > Firstly, I completely agree with Richard (and Eric) that the current
> > situation with the semantics of -fexceptions and -fnon-call-exceptions
> > (especially the latter without the
> > former) is a mess.  Alas, the testsuite (and source) is full of checks
> > that the current status quo is preserved, so any attempts to tweak
> > exception handling leads to a world of pain.
>
> I'm giving the following a try now which at least would remove the 
> '-fno-exceptions -fnon-call-exceptions' state (it then might ask for 
> unification of the flag_ vars to an enum, but that would be for a followup).
>
> diff --git a/gcc/common.opt b/gcc/common.opt index ed8ab5fbe13..7d69ab5ef7c 
> 100644
> --- a/gcc/common.opt
> +++ b/gcc/common.opt
> @@ -1509,7 +1509,7 @@ Common Var(flag_emit_class_debug_always) Init(0)  Do 
> not suppress C++ class debug information.
>
>  fexceptions
> -Common Var(flag_exceptions) Optimization
> +Common Var(flag_exceptions) Optimization
> +EnabledBy(fnon-call-exceptions)
>  Enable exception handling.
>
>  fexpensive-optimizations
>
> > The most relevant thing that I would like to point out is that this
> > proposed option, -fpreserve-traps, and the semantics of trapping
> > instructions are completely orthogonal to exception handling.  As the
> > term EH implies, exceptions are a form of flow control (on those
> > targets that support it).  Traps, signals(?), segmentation faults,
> > illegal accesses and invalid instructions are things that can't
> > (always?) be controlled.  Exceptional yes, but not handleable.
>
> Are they really independent?  Consider -fnon-call-exceptions where trapping 
> instructions (may) raise exceptions - now with your change (ENOPATCH btw...) 
>

New Swedish PO file for 'gcc' (version 11.2.0)

2021-08-30 Thread Translation Project Robot

Hello, gentle maintainer.

This is a message from the Translation Project robot.

A revised PO file for textual domain 'gcc' has been submitted
by the Swedish team of translators.  The file is available at:

https://translationproject.org/latest/gcc/sv.po

(This file, 'gcc-11.2.0.sv.po', has just now been sent to you in
a separate email.)

All other PO files for your package are available in:

https://translationproject.org/latest/gcc/

Please consider including all of these in your next release, whether
official or a pretest.

Whenever you have a new distribution with a new version number ready,
containing a newer POT file, please send the URL of that distribution
tarball to the address below.  The tarball may be just a pretest or a
snapshot, it does not even have to compile.  It is just used by the
translators when they need some extra translation context.

The following HTML page has been updated:

https://translationproject.org/domain/gcc.html

If any question arises, please contact the translation coordinator.

Thank you for all your work,

The Translation Project robot, in the
name of your translation coordinator.

Re: [PATCH] Use __builtin_trap() for abort() if inhibit_libc

2021-08-30 Thread Christophe Lyon via Gcc-patches

Hi,


On Tue, Aug 17, 2021 at 10:43 AM Sebastian Huber <
sebastian.hu...@embedded-brains.de> wrote:

> abort() is used in gcc_assert() and gcc_unreachable() which is used by
> target
> libraries such as libgcov.a.  This patch changes the abort() definition
> under
> certain conditions.  If inhibit_libc is defined and abort is not already
> defined, then abort() is defined to __builtin_trap().
>
> The inhibit_libc define is usually defined if GCC is built for targets
> running
> in embedded systems which may optionally use a C standard library.  If
> inhibit_libc is defined, then there may be still a full featured abort()
> available.  abort() is a heavy weight function which depends on signals and
> file streams.  For statically linked applications, this means that a
> dependency
> on gcc_assert() pulls in the support for signals and file streams.  This
> could
> prevent using gcov to test low end targets for example.  Using
> __builtin_trap()
> avoids these dependencies if the target implements a "trap" instruction.
> The
> application or operating system could use a trap handler to react to
> failed GCC
> runtime checks which caused a trap.
>
> gcc/
>
> * tsystem.h (abort): Define abort() if inhibit_libc is defined and
> it
> is not already defined.
>


This causes a build failure on arm:
In file included from /libgcc/config/arm/unwind-arm.c:144:
/libgcc/unwind-arm-common.inc:55:24: error: macro "abort" passed 1
arguments, but takes just 0
   55 | extern void abort (void);

This small patch should fix it:
 diff --git a/libgcc/unwind-arm-common.inc b/libgcc/unwind-arm-common.inc
index c9b70c10d4f..77ec02ec811 100644
--- a/libgcc/unwind-arm-common.inc
+++ b/libgcc/unwind-arm-common.inc
@@ -50,10 +50,6 @@
 #define ARM_SIGCONTEXT_R0  0xc
 #endif

-/* We add a prototype for abort here to avoid creating a dependency on
-   target headers.  */
-extern void abort (void);
-
 /* Definitions for C++ runtime support routines.  We make these weak
declarations to avoid pulling in libsupc++ unnecessarily.  */
 typedef unsigned char bool;


Christophe



> ---
>  gcc/tsystem.h | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/gcc/tsystem.h b/gcc/tsystem.h
> index e1e6a96a4f48..5c72c69ff3ed 100644
> --- a/gcc/tsystem.h
> +++ b/gcc/tsystem.h
> @@ -59,7 +59,7 @@ extern int atexit (void (*)(void));
>  #endif
>
>  #ifndef abort
> -extern void abort (void) __attribute__ ((__noreturn__));
> +#define abort() __builtin_trap ()
>  #endif
>
>  #ifndef strlen
> --
> 2.26.2
>
>

[PATCH] [i386] Unify UNSPEC_MASKED_EQ/GT to the form of UNSPEC_PCMP.

2021-08-30 Thread liuhongt via Gcc-patches

Currently for evex vpcmpeqb instruction, we have two forms of rtl
template representation, one is (unspec [op1 op2] UNSPEC_MASK_EQ), the
other is (unspec [op1, op2, const_int 0] UNSPEC_PCMP), which increases
the maintenance burden, such as optimization (not: vpcmpeqb)
to (vpcmpneqb) requires two define_insn_and_split to match the two
forms respectively, this patch removes UNSPEC_MASK_EQ/GT, unifying
them into the form of UNSPEC_PCMP.

  Bootstrapped and regtested on x86_64-linux-gnu{-m32,}.
  Pushed to trunk.

gcc/ChangeLog:

* config/i386/sse.md (*_ucmp3_1): Change from
define_split to define_insn_and_split.
(*avx2_eq3): Removed.
(_eq3): Adjust pattern
(_eq3_1): Rename to ..
(*_eq3_1): .. this, and
adjust pattern.
(*avx2_gt3): Removed.
(_gt3): Change from
define_insn to define_expand, and adjust pattern.
(UNSPEC_MASKED_EQ, UNSPEC_MASKED_GT): Removed.

gcc/testsuite/ChangeLog:

* gcc.target/i386/avx512bw-vpcmpeqb-1.c: Adjust testcase.
* gcc.target/i386/avx512bw-vpcmpeqw-1.c: Ditto.
* gcc.target/i386/avx512bw-vpcmpgtb-1.c: Ditto.
* gcc.target/i386/avx512bw-vpcmpgtw-1.c: Ditto.
* gcc.target/i386/avx512f-vpcmpeqd-1.c: Ditto.
* gcc.target/i386/avx512f-vpcmpeqq-1.c: Ditto.
* gcc.target/i386/avx512f-vpcmpgtd-1.c: Ditto.
* gcc.target/i386/avx512f-vpcmpgtq-1.c: Ditto.
* gcc.target/i386/avx512vl-vpcmpeqd-1.c: Ditto.
* gcc.target/i386/avx512vl-vpcmpeqq-1.c: Ditto.
* gcc.target/i386/avx512vl-vpcmpgtd-1.c: Ditto.
* gcc.target/i386/avx512vl-vpcmpgtq-1.c: Ditto.
* gcc.target/i386/bitwise_mask_op-1.c: Ditto.
* gcc.target/i386/bitwise_mask_op-2.c: Ditto.
---
 gcc/config/i386/sse.md| 100 ++
 .../gcc.target/i386/avx512bw-vpcmpeqb-1.c |  12 +--
 .../gcc.target/i386/avx512bw-vpcmpeqw-1.c |  12 +--
 .../gcc.target/i386/avx512bw-vpcmpgtb-1.c |  12 +--
 .../gcc.target/i386/avx512bw-vpcmpgtw-1.c |  12 +--
 .../gcc.target/i386/avx512f-vpcmpeqd-1.c  |   4 +-
 .../gcc.target/i386/avx512f-vpcmpeqq-1.c  |   4 +-
 .../gcc.target/i386/avx512f-vpcmpgtd-1.c  |   4 +-
 .../gcc.target/i386/avx512f-vpcmpgtq-1.c  |   4 +-
 .../gcc.target/i386/avx512vl-vpcmpeqd-1.c |   8 +-
 .../gcc.target/i386/avx512vl-vpcmpeqq-1.c |   8 +-
 .../gcc.target/i386/avx512vl-vpcmpgtd-1.c |   8 +-
 .../gcc.target/i386/avx512vl-vpcmpgtq-1.c |   8 +-
 .../gcc.target/i386/bitwise_mask_op-1.c   |   6 --
 .../gcc.target/i386/bitwise_mask_op-2.c   |   1 -
 15 files changed, 78 insertions(+), 125 deletions(-)

diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
index ac0c46328f2..5785e73241c 100644
--- a/gcc/config/i386/sse.md
+++ b/gcc/config/i386/sse.md
@@ -100,8 +100,6 @@ (define_c_enum "unspec" [
   UNSPEC_COMPRESS
   UNSPEC_COMPRESS_STORE
   UNSPEC_EXPAND
-  UNSPEC_MASKED_EQ
-  UNSPEC_MASKED_GT
 
   ;; Mask operations
   UNSPEC_MASKOP
@@ -3426,7 +3424,7 @@ (define_int_attr pcmp_signed_mask
 ;; For signed comparison, handle EQ 0: NEQ 4,
 ;; for unsigned comparison extra handle LE:2, NLE:6, equivalent to EQ and NEQ.
 
-(define_split
+(define_insn_and_split "*_ucmp3_1"
   [(set (match_operand: 0 "register_operand")
(unspec:
  [(us_minus:VI12_AVX512VL
@@ -3435,9 +3433,11 @@ (define_split
   (match_operand:VI12_AVX512VL 3 "const0_operand")
   (match_operand:SI 4 "const_0_to_7_operand")]
  UNSPEC_PCMP_ITER))]
-  "TARGET_AVX512BW
+  "TARGET_AVX512BW && ix86_pre_reload_split ()
   && ix86_binary_operator_ok (US_MINUS, mode, operands)
   && (INTVAL (operands[4]) & ) == 0"
+  "#"
+  "&& 1"
   [(const_int 0)]
   {
 /* LE: 2, NLT: 5, NLE: 6, LT: 1  */
@@ -13801,24 +13801,6 @@ (define_insn "*avx2_eq3"
(set_attr "prefix" "vex")
(set_attr "mode" "OI")])
 
-(define_insn_and_split "*avx2_eq3"
- [(set (match_operand:VI_128_256  0 "register_operand")
-   (vec_merge:VI_128_256
- (match_operand:VI_128_256 1 "vector_all_ones_operand")
- (match_operand:VI_128_256 2 "const0_operand")
- (unspec:
-   [(match_operand:VI_128_256 3 "nonimmediate_operand")
-(match_operand:VI_128_256 4 "nonimmediate_operand")]
-UNSPEC_MASKED_EQ)))]
-  "TARGET_AVX512VL && ix86_pre_reload_split ()
-  && !(MEM_P (operands[3]) && MEM_P (operands[4]))"
-  "#"
-  "&& 1"
-  [(set (match_dup 0)
-   (eq:VI_128_256
-  (match_dup 3)
-  (match_dup 4)))])
-
 (define_insn_and_split "*avx2_pcmp3_1"
  [(set (match_operand:VI_128_256  0 "register_operand")
(vec_merge:VI_128_256
@@ -13978,8 +13960,9 @@ (define_expand 
"_eq3"
   [(set (match_operand: 0 "register_operand")
(unspec:
  [(match_operand:VI12_AVX512VL 1 "nonimmediate_operand")
-  (match_operand:VI12_AVX512VL 2 "nonimmediate_operand")]
- UNSPEC_MASKED_EQ))]
+  (match_operand:VI12_AVX512VL 2 "nonim

Fix 'hash_table::expand' to destruct stale Value objects (was: 'hash_map>')

2021-08-30 Thread Thomas Schwinge

Hi!

Ping -- we still need to plug the memory leak; see patch attached, and/or
long discussion here:

On 2021-08-16T14:10:00-0600, Martin Sebor  wrote:
> On 8/16/21 6:44 AM, Thomas Schwinge wrote:
>> On 2021-08-12T17:15:44-0600, Martin Sebor via Gcc  wrote:
>>> On 8/6/21 10:57 AM, Thomas Schwinge wrote:
 So I'm trying to do some C++...  ;-)

 Given:

   /* A map from SSA names or var decls to record fields.  */
   typedef hash_map field_map_t;

   /* For each propagation record type, this is a map from SSA names or 
 var decls
  to propagate, to the field in the record type that should be used 
 for
  transmission and reception.  */
   typedef hash_map record_field_map_t;

 Thus, that's a 'hash_map>'.  (I may do that,
 right?)  Looking through GCC implementation files, very most of all uses
 of 'hash_map' boil down to pointer key ('tree', for example) and
 pointer/integer value.
>>>
>>> Right.  Because most GCC containers rely exclusively on GCC's own
>>> uses for testing, if your use case is novel in some way, chances
>>> are it might not work as intended in all circumstances.
>>>
>>> I've wrestled with hash_map a number of times.  A use case that's
>>> close to yours (i.e., a non-trivial value type) is in cp/parser.c:
>>> see class_to_loc_map_t.
>>
>> Indeed, at the time you sent this email, I already had started looking
>> into that one!  (The Fortran test cases that I originally analyzed, which
>> triggered other cases of non-POD/non-trivial destructor, all didn't
>> result in a memory leak, because the non-trivial constructor doesn't
>> actually allocate any resources dynamically -- that's indeed different in
>> this case here.)  ..., and indeed:
>>
>>> (I don't remember if I tested it for leaks
>>> though.  It's used to implement -Wmismatched-tags so compiling
>>> a few tests under Valgrind should show if it does leak.)
>>
>> ... it does leak memory at present.  :-| (See attached commit log for
>> details for one example.)

(Attached "Fix 'hash_table::expand' to destruct stale Value objects"
again.)

>> To that effect, to document the current behavior, I propose to
>> "Add more self-tests for 'hash_map' with Value type with non-trivial
>> constructor/destructor"

(We've done that in commit e4f16e9f357a38ec702fb69a0ffab9d292a6af9b
"Add more self-tests for 'hash_map' with Value type with non-trivial
constructor/destructor", quickly followed by bug fix
commit bb04a03c6f9bacc890118b9e12b657503093c2f8
"Make 'gcc/hash-map-tests.c:test_map_of_type_with_ctor_and_dtor_expand'
work on 32-bit architectures [PR101959]".

>> (Also cherry-pick into release branches, eventually?)

 Then:

   record_field_map_t field_map ([...]); // see below
   for ([...])
 {
   tree record_type = [...];
   [...]
   bool existed;
   field_map_t &fields
 = field_map.get_or_insert (record_type, &existed);
   gcc_checking_assert (!existed);
   [...]
   for ([...])
 fields.put ([...], [...]);
   [...]
 }
   [stuff that looks up elements from 'field_map']
   field_map.empty ();

 This generally works.

 If I instantiate 'record_field_map_t field_map (40);', Valgrind is happy.
 If however I instantiate 'record_field_map_t field_map (13);' (where '13'
 would be the default for 'hash_map'), Valgrind complains:

   2,080 bytes in 10 blocks are definitely lost in loss record 828 of 
 876
  at 0x483DD99: calloc (vg_replace_malloc.c:762)
  by 0x175F010: xcalloc (xmalloc.c:162)
  by 0xAF4A2C: hash_table>>> simple_hashmap_traits, tree_node*> 
 >::hash_entry, false, xcallocator>::hash_table(unsigned long, bool, bool, 
 bool, mem_alloc_origin) (hash-table.h:275)
  by 0x15E0120: hash_map>>> simple_hashmap_traits, tree_node*> 
 >::hash_map(unsigned long, bool, bool, bool) (hash-map.h:143)
  by 0x15DEE87: hash_map>>> tree_node*, simple_hashmap_traits, 
 tree_node*> >, simple_hashmap_traits, 
 hash_map>>> simple_hashmap_traits, tree_node*> > > 
 >::get_or_insert(tree_node* const&, bool*) (hash-map.h:205)
  by 0x15DD52C: execute_omp_oacc_neuter_broadcast() 
 (omp-oacc-neuter-broadcast.cc:1371)
  [...]

 (That's with '#pragma GCC optimize "O0"' at the top of the 'gcc/*.cc'
 file.)

 My suspicion was that it is due to the 'field_map' getting resized as it
 incrementally grows (and '40' being big enough for that to never happen),
 and somehow the non-POD (?) value objects not being properly handled
 during that.  Working my way a bit through 'gcc/hash-map.*' and
 'gcc/hash-table.*' (but not claiming that I understand all that, off
 hand), it seems as if my theory is right: I'm

[wwwdocs] gcc-12/changes.html: nvptx - new __PTX_SM__ macro

2021-08-30 Thread Tobias Burnus


Document Roger's patch
https://gcc.gnu.org/g:3c496e92d795a8fe5c527e3c5b5a6606669ae50d

OK? Suggestions?

Tobias

PS: I have a pending wwwdocs patch for OpenMP, but I think I will hold
off with an updated version until Jakub's next patch – to avoid too many
updates.

PPS: I also have a pending wwwdocs patch for GCN, but there one tuning
follow-up patch is missing before it is fully implemented. Hence,
holding off that one as well.

-
Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 
München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas 
Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht 
München, HRB 106955
gcc-12/changes.html: nvptx - new __PTX_SM__ macro

diff --git a/htdocs/gcc-12/changes.html b/htdocs/gcc-12/changes.html
index a8859882..d217f1c8 100644
--- a/htdocs/gcc-12/changes.html
+++ b/htdocs/gcc-12/changes.html
@@ -149,6 +149,8 @@ a work-in-progress.
   for the generated code; permitted values are 3.1
   (default, matches previous GCC versions) and 6.3.
   
+  The __PTX_SM__ predefined macro allows code to check the
+  compute model being targeted by the compiler.

Re: [PATCH] Use __builtin_trap() for abort() if inhibit_libc

2021-08-30 Thread Sebastian Huber


Hello Christophe,

it seems there are a couple of more abort() declarations:

libgcc/unwind-arm-common.inc:extern void abort (void);
libgcc/config/c6x/pr-support.c:extern void abort (void);
libgcc/config/arm/pr-support.c:extern void abort (void);
libgcc/config/arm/linux-atomic-64bit.c:extern void abort (void);
libgcc/fp-bit.c:   external to abort if abort is not used by the 
function, and the stubs

libgcc/fp-bit.c:extern void abort (void);

I will prepare a patch.

--
embedded brains GmbH
Herr Sebastian HUBER
Dornierstr. 4
82178 Puchheim
Germany
email: sebastian.hu...@embedded-brains.de
phone: +49-89-18 94 741 - 16
fax:   +49-89-18 94 741 - 08

Registergericht: Amtsgericht München
Registernummer: HRB 157899
Vertretungsberechtigte Geschäftsführer: Peter Rasmussen, Thomas Dörfler
Unsere Datenschutzerklärung finden Sie hier:
https://embedded-brains.de/datenschutzerklaerung/

Re: [RFA] Some libgcc headers are missing the runtime exception

2021-08-30 Thread Thomas Schwinge

Hi!

Ping.  For easy reference I've again attached Richard Sandiford's
"libgcc: Add missing runtime exception notices".

On 2021-07-12T17:34:09+0100, Richard Sandiford via Gcc-patches 
 wrote:
> David Edelsohn  writes:
>> On Mon, Jul 12, 2021 at 11:58 AM Richard Sandiford
>>  wrote:
>>> David Edelsohn  writes:
>>> > On Fri, Jul 9, 2021 at 1:31 PM Richard Sandiford
>>> >  wrote:
>>> >> David Edelsohn  writes:
>>> >> > On Fri, Jul 9, 2021 at 12:53 PM Richard Sandiford via Gcc
>>> >> >  wrote:
>>> >> >> It was pointed out to me off-list that config/aarch64/value-unwind.h
>>> >> >> is missing the runtime exception.  It looks like a few other files
>>> >> >> are too; a fuller list is:
>>> >> >>
>>> >> >> libgcc/config/aarch64/value-unwind.h
>>> >> >> libgcc/config/frv/frv-abi.h
>>> >> >> libgcc/config/i386/value-unwind.h
>>> >> >> libgcc/config/pa/pa64-hpux-lib.h
>>> >> >>
>>> >> >> Certainly for the aarch64 file this was simply a mistake;
>>> >> >> it seems to have been copied from the i386 version, both of which
>>> >> >> reference the runtime exception but don't actually include it.
>>> >> >>
>>> >> >> What's the procedure for fixing this?  Can we treat it as a textual
>>> >> >> error or do the files need to be formally relicensed?
>>> >> >
>>> >> > I'm unsure what you mean by "formally relicensed".
>>> >>
>>> >> It seemed like there were two possibilities: the licence of the files
>>> >> is actually GPL + exception despite what the text says (the textual
>>> >> error case), or the licence of the files is plain GPL because the text
>>> >> has said so since the introduction of the files.  In the latter case
>>> >> I'd have imagined that someone would need to relicense the code so
>>> >> that it is GPL + exception.
>>> >>
>>> >> > It generally is considered a textual omission.  The runtime library
>>> >> > components of GCC are intended to be licensed under the runtime
>>> >> > exception, which was granted and approved at the time of introduction.
>>> >>
>>> >> OK, thanks.  So would a patch to fix at least the i386 and aarch64 header
>>> >> files be acceptable?  (I'm happy to fix the other two as well if that's
>>> >> definitely the right thing to do.  It's just that there's more history
>>> >> involved there…)
>>> >
>>> > Please correct the text in the files. The files in libgcc used in the
>>> > GCC runtime are intended to be licensed with the runtime exception and
>>> > GCC previously was granted approval for that licensing and purpose.
>>> >
>>> > As you are asking the question, I sincerely doubt that ARM and Cavium
>>> > intended to apply a license without the exception to those files.  And
>>> > similarly for Intel and FRV.
>>>
>>> FTR, I think only Linaro (rather than Arm) touched the aarch64 file.
>>>
>>> > The runtime exception explicitly was intended for this purpose and
>>> > usage at the time that GCC received approval to apply the exception.
>>>
>>> Ack.  Is the patch below OK for trunk and branches?
>>
>> I'm not certain whom you are asking for approval,
>
> I was assuming it would need a global reviewer.
>
>> but it looks good to me.
>
> Thanks.

So in addition to David, would a Global Reviewer please review this?


Grüße
 Thomas


-
Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 
München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas 
Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht 
München, HRB 106955
>From a601ac8ea9be14a898215456c22cd826e8fd92d9 Mon Sep 17 00:00:00 2001
From: Richard Sandiford 
Date: Mon, 12 Jul 2021 13:04:56 +0100
Subject: [PATCH] libgcc: Add missing runtime exception notices
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Quoting from https://gcc.gnu.org/pipermail/gcc/2021-July/236716.html:


It was pointed out to me off-list that config/aarch64/value-unwind.h
is missing the runtime exception.  It looks like a few other files
are too; a fuller list is:

libgcc/config/aarch64/value-unwind.h
libgcc/config/frv/frv-abi.h
libgcc/config/i386/value-unwind.h
libgcc/config/pa/pa64-hpux-lib.h

Certainly for the aarch64 file this was simply a mistake;
it seems to have been copied from the i386 version, both of which
reference the runtime exception but don't actually include it.


Similarly, frv-abi.h referenced the exception but didn't include it.
pa64-hpux-lib.h was missing any reference to the exception.

The decision was that this was simply a mistake
[https://gcc.gnu.org/pipermail/gcc/2021-July/236717.html]:


[…] It generally is
considered a textual omission.  The runtime library components of GCC
are intended to be licensed under the runtime exception, which was
granted and approved at the time of introduction.
--

Re: [PATCH] Use __builtin_trap() for abort() if inhibit_libc

2021-08-30 Thread Richard Biener via Gcc-patches

On Mon, Aug 30, 2021 at 12:55 PM Sebastian Huber
 wrote:
>
> Hello Christophe,
>
> it seems there are a couple of more abort() declarations:
>
> libgcc/unwind-arm-common.inc:extern void abort (void);
> libgcc/config/c6x/pr-support.c:extern void abort (void);
> libgcc/config/arm/pr-support.c:extern void abort (void);
> libgcc/config/arm/linux-atomic-64bit.c:extern void abort (void);
> libgcc/fp-bit.c:   external to abort if abort is not used by the
> function, and the stubs
> libgcc/fp-bit.c:extern void abort (void);
>
> I will prepare a patch.

Less likley to break might be to simply wrap 'abort'
in parentheses to avoid macro expansion?

> --
> embedded brains GmbH
> Herr Sebastian HUBER
> Dornierstr. 4
> 82178 Puchheim
> Germany
> email: sebastian.hu...@embedded-brains.de
> phone: +49-89-18 94 741 - 16
> fax:   +49-89-18 94 741 - 08
>
> Registergericht: Amtsgericht München
> Registernummer: HRB 157899
> Vertretungsberechtigte Geschäftsführer: Peter Rasmussen, Thomas Dörfler
> Unsere Datenschutzerklärung finden Sie hier:
> https://embedded-brains.de/datenschutzerklaerung/

Re: [RFA] Some libgcc headers are missing the runtime exception

2021-08-30 Thread Richard Biener via Gcc-patches

On Mon, Aug 30, 2021 at 12:59 PM Thomas Schwinge
 wrote:
>
> Hi!
>
> Ping.  For easy reference I've again attached Richard Sandiford's
> "libgcc: Add missing runtime exception notices".
>
> On 2021-07-12T17:34:09+0100, Richard Sandiford via Gcc-patches 
>  wrote:
> > David Edelsohn  writes:
> >> On Mon, Jul 12, 2021 at 11:58 AM Richard Sandiford
> >>  wrote:
> >>> David Edelsohn  writes:
> >>> > On Fri, Jul 9, 2021 at 1:31 PM Richard Sandiford
> >>> >  wrote:
> >>> >> David Edelsohn  writes:
> >>> >> > On Fri, Jul 9, 2021 at 12:53 PM Richard Sandiford via Gcc
> >>> >> >  wrote:
> >>> >> >> It was pointed out to me off-list that config/aarch64/value-unwind.h
> >>> >> >> is missing the runtime exception.  It looks like a few other files
> >>> >> >> are too; a fuller list is:
> >>> >> >>
> >>> >> >> libgcc/config/aarch64/value-unwind.h
> >>> >> >> libgcc/config/frv/frv-abi.h
> >>> >> >> libgcc/config/i386/value-unwind.h
> >>> >> >> libgcc/config/pa/pa64-hpux-lib.h
> >>> >> >>
> >>> >> >> Certainly for the aarch64 file this was simply a mistake;
> >>> >> >> it seems to have been copied from the i386 version, both of which
> >>> >> >> reference the runtime exception but don't actually include it.
> >>> >> >>
> >>> >> >> What's the procedure for fixing this?  Can we treat it as a textual
> >>> >> >> error or do the files need to be formally relicensed?
> >>> >> >
> >>> >> > I'm unsure what you mean by "formally relicensed".
> >>> >>
> >>> >> It seemed like there were two possibilities: the licence of the files
> >>> >> is actually GPL + exception despite what the text says (the textual
> >>> >> error case), or the licence of the files is plain GPL because the text
> >>> >> has said so since the introduction of the files.  In the latter case
> >>> >> I'd have imagined that someone would need to relicense the code so
> >>> >> that it is GPL + exception.
> >>> >>
> >>> >> > It generally is considered a textual omission.  The runtime library
> >>> >> > components of GCC are intended to be licensed under the runtime
> >>> >> > exception, which was granted and approved at the time of 
> >>> >> > introduction.
> >>> >>
> >>> >> OK, thanks.  So would a patch to fix at least the i386 and aarch64 
> >>> >> header
> >>> >> files be acceptable?  (I'm happy to fix the other two as well if that's
> >>> >> definitely the right thing to do.  It's just that there's more history
> >>> >> involved there…)
> >>> >
> >>> > Please correct the text in the files. The files in libgcc used in the
> >>> > GCC runtime are intended to be licensed with the runtime exception and
> >>> > GCC previously was granted approval for that licensing and purpose.
> >>> >
> >>> > As you are asking the question, I sincerely doubt that ARM and Cavium
> >>> > intended to apply a license without the exception to those files.  And
> >>> > similarly for Intel and FRV.
> >>>
> >>> FTR, I think only Linaro (rather than Arm) touched the aarch64 file.
> >>>
> >>> > The runtime exception explicitly was intended for this purpose and
> >>> > usage at the time that GCC received approval to apply the exception.
> >>>
> >>> Ack.  Is the patch below OK for trunk and branches?
> >>
> >> I'm not certain whom you are asking for approval,
> >
> > I was assuming it would need a global reviewer.
> >
> >> but it looks good to me.
> >
> > Thanks.
>
> So in addition to David, would a Global Reviewer please review this?

OK.

Thanks,
Richard.

>
> Grüße
>  Thomas
>
>
> -
> Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 
> München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas 
> Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht 
> München, HRB 106955

Re: [PATCH] Use __builtin_trap() for abort() if inhibit_libc

2021-08-30 Thread Sebastian Huber


On 30/08/2021 13:44, Richard Biener wrote:

On Mon, Aug 30, 2021 at 12:55 PM Sebastian Huber
  wrote:

Hello Christophe,

it seems there are a couple of more abort() declarations:

libgcc/unwind-arm-common.inc:extern void abort (void);
libgcc/config/c6x/pr-support.c:extern void abort (void);
libgcc/config/arm/pr-support.c:extern void abort (void);
libgcc/config/arm/linux-atomic-64bit.c:extern void abort (void);
libgcc/fp-bit.c:   external to abort if abort is not used by the
function, and the stubs
libgcc/fp-bit.c:extern void abort (void);

I will prepare a patch.

Less likley to break might be to simply wrap 'abort'
in parentheses to avoid macro expansion?


I think the abort declaration in libgcc/unwind-arm-common.inc is the 
only problematic spot since unwind-arm-common.inc includes tsystem.h.


--
embedded brains GmbH
Herr Sebastian HUBER
Dornierstr. 4
82178 Puchheim
Germany
email: sebastian.hu...@embedded-brains.de
phone: +49-89-18 94 741 - 16
fax:   +49-89-18 94 741 - 08

Registergericht: Amtsgericht München
Registernummer: HRB 157899
Vertretungsberechtigte Geschäftsführer: Peter Rasmussen, Thomas Dörfler
Unsere Datenschutzerklärung finden Sie hier:
https://embedded-brains.de/datenschutzerklaerung/

[PATCH] Fix arm target build with inhibit_libc

2021-08-30 Thread Sebastian Huber

Do not declare abort in "libgcc/unwind-arm-common.inc" since it is already
provided by "tsystem.h".  It fixes the following build error:

In file included from libgcc/config/arm/unwind-arm.c:144:
libgcc/unwind-arm-common.inc:55:24: error: macro "abort" passed 1 arguments, 
but takes just 0
   55 | extern void abort (void);

libgcc/

* unwind-arm-common.inc (abort): Remove.
---
 libgcc/unwind-arm-common.inc | 4 
 1 file changed, 4 deletions(-)

diff --git a/libgcc/unwind-arm-common.inc b/libgcc/unwind-arm-common.inc
index c9b70c10d4f5..77ec02ec811e 100644
--- a/libgcc/unwind-arm-common.inc
+++ b/libgcc/unwind-arm-common.inc
@@ -50,10 +50,6 @@
 #define ARM_SIGCONTEXT_R0  0xc
 #endif
 
-/* We add a prototype for abort here to avoid creating a dependency on
-   target headers.  */
-extern void abort (void);
-
 /* Definitions for C++ runtime support routines.  We make these weak
declarations to avoid pulling in libsupc++ unnecessarily.  */
 typedef unsigned char bool;
-- 
2.26.2

Re: [PATCH] Set bound/cmp/control for until wrap loop.

2021-08-30 Thread Richard Biener via Gcc-patches

On Mon, 30 Aug 2021, guojiufu wrote:

> On 2021-08-30 14:15, Jiufu Guo wrote:
> > Hi,
> > 
> > In patch r12-3136, niter->control, niter->bound and niter->cmp are
> > derived from number_of_iterations_lt.  While for 'until wrap condition',
> > the calculation in number_of_iterations_lt is not align the requirements
> > on the define of them and requirements in determine_exit_conditions.
> > 
> > This patch calculate niter->control, niter->bound and niter->cmp in
> > number_of_iterations_until_wrap.
> > 
> > The ICEs in the PR are pass with this patch.
> > Bootstrap and reg-tests pass on ppc64/ppc64le and x86.
> > Is this ok for trunk?
> > 
> > BR.
> > Jiufu Guo
> > 
> Add ChangeLog:
> gcc/ChangeLog:
> 
> 2021-08-30  Jiufu Guo  
> 
> PR tree-optimization/102087
> * tree-ssa-loop-niter.c (number_of_iterations_until_wrap):
> Set bound/cmp/control for niter.
> 
> gcc/testsuite/ChangeLog:
> 
> 2021-08-30  Jiufu Guo  
> 
> PR tree-optimization/102087
> * gcc.dg/vect/pr101145_3.c: Update tests.
> * gcc.dg/pr102087.c: New test.
> 
> > ---
> >  gcc/tree-ssa-loop-niter.c  | 14 +-
> >  gcc/testsuite/gcc.dg/pr102087.c| 25 +
> >  gcc/testsuite/gcc.dg/vect/pr101145_3.c |  4 +++-
> >  3 files changed, 41 insertions(+), 2 deletions(-)
> >  create mode 100644 gcc/testsuite/gcc.dg/pr102087.c
> > 
> > diff --git a/gcc/tree-ssa-loop-niter.c b/gcc/tree-ssa-loop-niter.c
> > index 7af92d1c893..747f04d3ce0 100644
> > --- a/gcc/tree-ssa-loop-niter.c
> > +++ b/gcc/tree-ssa-loop-niter.c
> > @@ -1482,7 +1482,7 @@ number_of_iterations_until_wrap (class loop *,
> > tree type, affine_iv *iv0,
> >  affine_iv *iv1, class tree_niter_desc *niter)
> >  {
> >tree niter_type = unsigned_type_for (type);
> > -  tree step, num, assumptions, may_be_zero;
> > +  tree step, num, assumptions, may_be_zero, span;
> >wide_int high, low, max, min;
> > 
> >may_be_zero = fold_build2 (LE_EXPR, boolean_type_node, iv1->base,
> > iv0->base);
> > @@ -1513,6 +1513,8 @@ number_of_iterations_until_wrap (class loop *,
> > tree type, affine_iv *iv0,
> >   low = wi::to_wide (iv0->base);
> > else
> > low = min;
> > +
> > +  niter->control = *iv1;
> >  }
> >/* {base, -C} < n.  */
> >else if (tree_int_cst_sign_bit (iv0->step) && integer_zerop 
> > (iv1->step))
> > @@ -1533,6 +1535,8 @@ number_of_iterations_until_wrap (class loop *,
> > tree type, affine_iv *iv0,
> >   high = wi::to_wide (iv1->base);
> > else
> > high = max;
> > +
> > +  niter->control = *iv0;
> >  }
> >else
> >  return false;

it looks like the above two should already be in effect from the
caller (guarding with integer_nozerop)?

> > @@ -1556,6 +1560,14 @@ number_of_iterations_until_wrap (class loop *,
> > tree type, affine_iv *iv0,
> >niter->assumptions, assumptions);
> > 
> >niter->control.no_overflow = false;
> > +  niter->control.base = fold_build2 (MINUS_EXPR, niter_type,
> > +niter->control.base,
> > niter->control.step);

how do we know IVn - STEP doesn't already wrap?  A comment might be
good to explain you're turning the simplified exit condition into

   { IVbase - STEP, +, STEP } != niter * STEP + (IVbase - STEP)

which, when mathematically looking at it makes me wonder why there's
the seemingly redundant '- STEP' term?  Also is NE_EXPR really
correct since STEP might be not 1?  Only for non equality compares
the '- STEP' should matter?

Richard.

> > +  span = fold_build2 (MULT_EXPR, niter_type, niter->niter,
> > + fold_convert (niter_type, niter->control.step));
> > +  niter->bound = fold_build2 (PLUS_EXPR, niter_type, span,
> > + fold_convert (niter_type, niter->control.base));
> > +  niter->bound = fold_convert (type, niter->bound);
> > +  niter->cmp = NE_EXPR;
> > 
> >return true;
> > }
> > diff --git a/gcc/testsuite/gcc.dg/pr102087.c
> > b/gcc/testsuite/gcc.dg/pr102087.c
> > new file mode 100644
> > index 000..ef1f9f5cba9
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.dg/pr102087.c
> > @@ -0,0 +1,25 @@
> > +/* { dg-do compile } */
> > +/* { dg-options "-O3" } */
> > +
> > +unsigned __attribute__ ((noinline))
> > +foo (int *__restrict__ a, int *__restrict__ b, unsigned l, unsigned n)
> > +{
> > +  while (n < ++l)
> > +*a++ = *b++ + 1;
> > +  return l;
> > +}
> > +
> > +volatile int a[1];
> > +unsigned b;
> > +int c;
> > +
> > +int
> > +check ()
> > +{
> > +  int d;
> > +  for (; b > 1; b++)
> > +for (c = 0; c < 2; c++)
> > +  for (d = 0; d < 2; d++)
> > +   a[0];
> > +  return 0;
> > +}
> > diff --git a/gcc/testsuite/gcc.dg/vect/pr101145_3.c
> > b/gcc/testsuite/gcc.dg/vect/pr101145_3.c
> > index 99289afec0b..40cb0240aaa 100644
> > --- a/gcc/testsuite/gcc.dg/vect/pr101145_3.c
> > +++ b/gcc/testsuite/gcc.dg/vect/pr101145_3.c
> > @@ -1,5 +1,6 @@
> >  /* { dg-require-effective-target ve

[PATCH] tree-optimization/102128 - rework if-converted BB vect heuristic

2021-08-30 Thread Richard Biener via Gcc-patches

This reworks the previous attempt to avoid leaving around if-converted
scalar code in BB vectorized loop bodies to keep costing independent
subgraphs which should address the observed regression with 519.lbm_r.

For this to work we now first cost all subgraphs and only after
doing that proceed to emit vectorized code.

Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed.

2021-08-30  Richard Biener  

PR tree-optimization/102128
* tree-vect-slp.c (vect_bb_vectorization_profitable_p):
Move scanning for if-converted scalar code to the caller
and instead delay clearing the visited flag for profitable
subgraphs.
(vect_slp_region): Cost all subgraphs before scheduling.
For if-converted BB vectorization scan for scalar COND_EXPRs
and do not vectorize if any found and the cost model is
very-cheap.
---
 gcc/tree-vect-slp.c | 112 +++-
 1 file changed, 58 insertions(+), 54 deletions(-)

diff --git a/gcc/tree-vect-slp.c b/gcc/tree-vect-slp.c
index 4d688c7a267..4ca24408249 100644
--- a/gcc/tree-vect-slp.c
+++ b/gcc/tree-vect-slp.c
@@ -5275,34 +5275,6 @@ vect_bb_vectorization_profitable_p (bb_vec_info bb_vinfo,
   vector_costs.safe_splice (instance->cost_vec);
   instance->cost_vec.release ();
 }
-  /* When we're vectorizing an if-converted loop body with the
- very-cheap cost model make sure we vectorized all if-converted
- code.  */
-  bool force_not_profitable = false;
-  if (orig_loop && flag_vect_cost_model == VECT_COST_MODEL_VERY_CHEAP)
-{
-  gcc_assert (bb_vinfo->bbs.length () == 1);
-  for (gimple_stmt_iterator gsi = gsi_start_bb (bb_vinfo->bbs[0]);
-  !gsi_end_p (gsi); gsi_next (&gsi))
-   {
- /* The costing above left us with DCEable vectorized scalar
-stmts having the visited flag set.  */
- if (gimple_visited_p (gsi_stmt (gsi)))
-   continue;
-
- if (gassign *ass = dyn_cast  (gsi_stmt (gsi)))
-   if (gimple_assign_rhs_code (ass) == COND_EXPR)
- {
-   force_not_profitable = true;
-   break;
- }
-   }
-}
-
-  /* Unset visited flag.  */
-  stmt_info_for_cost *cost;
-  FOR_EACH_VEC_ELT (scalar_costs, i, cost)
-gimple_set_visited  (cost->stmt_info->stmt, false);
 
   if (dump_enabled_p ())
 dump_printf_loc (MSG_NOTE, vect_location, "Cost model analysis: \n");
@@ -5319,6 +5291,7 @@ vect_bb_vectorization_profitable_p (bb_vec_info bb_vinfo,
 li_scalar_costs (scalar_costs.length ());
   auto_vec >
 li_vector_costs (vector_costs.length ());
+  stmt_info_for_cost *cost;
   FOR_EACH_VEC_ELT (scalar_costs, i, cost)
 {
   unsigned l = gimple_bb (cost->stmt_info->stmt)->loop_father->num;
@@ -5341,6 +5314,7 @@ vect_bb_vectorization_profitable_p (bb_vec_info bb_vinfo,
   /* Now cost the portions individually.  */
   unsigned vi = 0;
   unsigned si = 0;
+  bool profitable = true;
   while (si < li_scalar_costs.length ()
 && vi < li_vector_costs.length ())
 {
@@ -5407,30 +5381,29 @@ vect_bb_vectorization_profitable_p (bb_vec_info 
bb_vinfo,
 example).  */
   if (vec_outside_cost + vec_inside_cost > scalar_cost)
{
- scalar_costs.release ();
- vector_costs.release ();
- return false;
+ profitable = false;
+ break;
}
 }
-  if (vi < li_vector_costs.length ())
+  if (profitable && vi < li_vector_costs.length ())
 {
   if (dump_enabled_p ())
dump_printf_loc (MSG_NOTE, vect_location,
 "Excess vector cost for part in loop %d:\n",
 li_vector_costs[vi].first);
-  scalar_costs.release ();
-  vector_costs.release ();
-  return false;
+  profitable = false;
 }
 
-  if (dump_enabled_p () && force_not_profitable)
-dump_printf_loc (MSG_NOTE, vect_location,
-"not profitable because of unprofitable if-converted "
-"scalar code\n");
+  /* Unset visited flag.  This is delayed when the subgraph is profitable
+ and we process the loop for remaining unvectorized if-converted code.  */
+  if (orig_loop && !profitable)
+FOR_EACH_VEC_ELT (scalar_costs, i, cost)
+  gimple_set_visited  (cost->stmt_info->stmt, false);
 
   scalar_costs.release ();
   vector_costs.release ();
-  return !force_not_profitable;
+
+  return profitable;
 }
 
 /* qsort comparator for lane defs.  */
@@ -5884,9 +5857,8 @@ vect_slp_region (vec bbs, 
vec datarefs,
 
  bb_vinfo->shared->check_datarefs ();
 
- unsigned i;
- slp_instance instance;
- FOR_EACH_VEC_ELT (BB_VINFO_SLP_INSTANCES (bb_vinfo), i, instance)
+ auto_vec profitable_subgraphs;
+ for (slp_instance instance : BB_VINFO_SLP_INSTANCES (bb_vinfo))
{
  if (instance->subgraph_entries.is_empty ())
continue;
@@ -5894,9 +5866,7 @

Re: [PATCH] Make sure -fexceptions is enabled when -fnon-call-exceptions is

2021-08-30 Thread Richard Biener via Gcc-patches

On Mon, 30 Aug 2021, Eric Botcazou wrote:

> > This makes -fexceptions enabled by -fnon-call-exceptions, removing
> > the odd state of !flag_exceptions && flag_non_call_exceptions from
> > middle-end consideration.
> 
> FWIW fine with me.

Thanks - pushed.

Richard.

RE: [PATCH] [MIPS] Hazard barrier return support

2021-08-30 Thread Dragan Mladjenovic via Gcc-patches



> -Original Message-
> From: Dragan Mladjenovic
> Sent: 17 August 2021 17:59
> To: 'Andrew Pinski' 
> Cc: 'gcc-patches@gcc.gnu.org' 
> Subject: RE: [PATCH] [MIPS] Hazard barrier return support
> 
> 
> 
> > -Original Message-
> > From: Dragan Mladjenovic
> > Sent: 16 August 2021 22:40
> > To: 'Andrew Pinski' 
> > Cc: gcc-patches@gcc.gnu.org
> > Subject: RE: [PATCH] [MIPS] Hazard barrier return support
> >
> >
> >
> > > -Original Message-
> > > From: Andrew Pinski [mailto:pins...@gmail.com]
> > > Sent: 16 August 2021 21:17
> > > To: Dragan Mladjenovic 
> > > Cc: gcc-patches@gcc.gnu.org
> > > Subject: Re: [PATCH] [MIPS] Hazard barrier return support
> > >
> > > On Mon, Aug 16, 2021 at 7:43 AM Dragan Mladjenovic via Gcc-patches
> > >  wrote:
> > > >
> > > > This patch allows a function to request clearing of all
> > > > instruction and execution hazards upon normal return via
> > > > __attribute__
> > > ((use_hazard_barrier_return)).
> > > >
> > > > 2017-04-25  Prachi Godbole  
> > > >
> > > > gcc/
> > > > * config/mips/mips.h (machine_function): New variable
> > > > use_hazard_barrier_return_p.
> > > > * config/mips/mips.md (UNSPEC_JRHB): New unspec.
> > > > (mips_hb_return_internal): New insn pattern.
> > > > * config/mips/mips.c (mips_attribute_table): Add attribute
> > > > use_hazard_barrier_return.
> > > > (mips_use_hazard_barrier_return_p): New static function.
> > > > (mips_function_attr_inlinable_p): Likewise.
> > > > (mips_compute_frame_info): Set use_hazard_barrier_return_p.
> > > > Emit error for unsupported architecture choice.
> > > > (mips_function_ok_for_sibcall, mips_can_use_return_insn):
> > > > Return false for use_hazard_barrier_return.
> > > > (mips_expand_epilogue): Emit hazard barrier return.
> > > > * doc/extend.texi: Document use_hazard_barrier_return.
> > > >
> > > > gcc/testsuite/
> > > > * gcc.target/mips/hazard-barrier-return-attribute.c: New test.
> > > > ---
> > > > Rehash of original patch posted by Prachi with minimal changes.
> > > > Tested against mips-mti-elf with mips32r2/-EB and mips32r2/-EB/-
> > micromips.
> > > >
> > > >  gcc/config/mips/mips.c| 58 +--
> > > >  gcc/config/mips/mips.h|  3 +
> > > >  gcc/config/mips/mips.md   | 15 +
> > > >  gcc/doc/extend.texi   |  6 ++
> > > >  .../mips/hazard-barrier-return-attribute.c| 20 +++
> > > >  5 files changed, 98 insertions(+), 4 deletions(-)  create mode
> > > > 100644
> > > > gcc/testsuite/gcc.target/mips/hazard-barrier-return-attribute.c
> > > >
> > > > diff --git a/gcc/config/mips/mips.c b/gcc/config/mips/mips.c index
> > > > 89d1be6cea6..6ce12fce52e 100644
> > > > --- a/gcc/config/mips/mips.c
> > > > +++ b/gcc/config/mips/mips.c
> > > > @@ -630,6 +630,7 @@ static const struct attribute_spec
> > > mips_attribute_table[] = {
> > > >  mips_handle_use_shadow_register_set_attr, NULL },
> > > >{ "keep_interrupts_masked",  0, 0, false, true,  true, false, NULL,
> NULL },
> > > >{ "use_debug_exception_return", 0, 0, false, true, true, false,
> > > > NULL, NULL },
> > > > +  { "use_hazard_barrier_return", 0, 0, true, false, false, false,
> > > > + NULL, NULL },
> > > >{ NULL, 0, 0, false, false, false, false, NULL, NULL }
> > > >  };
> > > >
> > > > @@ -1309,6 +1310,16 @@ mips_use_debug_exception_return_p (tree
> > > type)
> > > >TYPE_ATTRIBUTES (type)) != NULL;  }
> > > >
> > > > +/* Check if the attribute to use hazard barrier return is set for
> > > > +   the function declaration DECL.  */
> > > > +
> > > > +static bool
> > > > +mips_use_hazard_barrier_return_p (const_tree decl) {
> > > > +  return lookup_attribute ("use_hazard_barrier_return",
> > > > +  DECL_ATTRIBUTES (decl)) != NULL; }
> > > > +
> > > >  /* Return the set of compression modes that are explicitly required
> > > > by the attributes in ATTRIBUTES.  */
> > > >
> > > > @@ -1494,6 +1505,19 @@ mips_can_inline_p (tree caller, tree callee)
> > > >return default_target_can_inline_p (caller, callee);  }
> > > >
> > > > +/* Implement TARGET_FUNCTION_ATTRIBUTE_INLINABLE_P.
> > > > +
> > > > +   A function requesting clearing of all instruction and execution
> hazards
> > > > +   before returning cannot be inlined - thereby not clearing any 
> > > > hazards.
> > > > +   All our other function attributes are related to how out-of-line 
> > > > copies
> > > > +   should be compiled or called.  They don't in themselves
> > > > + prevent inlining.  */
> > > > +
> > > > +static bool
> > > > +mips_function_attr_inlinable_p (const_tree decl) {
> > > > +  return !mips_use_hazard_barrier_return_p (decl); }
> > > > +
> > > >  /* Handle an "interrupt" attribute with an optional argument.  */
> > > >
> > > >  static tree
> > > > @@ -7921,6

Re: [PATCH] Check the type of mask while generating cond_op in gimple simplication.

2021-08-30 Thread Richard Biener via Gcc-patches

On Fri, Aug 27, 2021 at 8:53 AM liuhongt  wrote:
>
>   When gimple simplifcation try to combine op and vec_cond_expr to cond_op,
> it doesn't check if mask type matches. It causes an ICE when expand cond_op
> with mismatched mode.
>   This patch add a function named cond_vectorized_internal_fn_supported_p
>  to additionally check mask type than vectorized_internal_fn_supported_p.
>
>   Bootstrapped and regtested on x86_64-linux-gnu{-m32,}.
>   Ok for trunk?
>
> gcc/ChangeLog:
>
> PR middle-end/102080
> * internal-fn.c (cond_vectorized_internal_fn_supported_p): New 
> functions.
> * internal-fn.h (cond_vectorized_internal_fn_supported_p): New 
> declaration.
> * match.pd: Check the type of mask while generating cond_op in
> gimple simplication.
>
> gcc/testsuite/ChangeLog:
>
> PR middle-end/102080
> * gcc.target/i386/pr102080.c: New test.
> ---
>  gcc/internal-fn.c| 22 ++
>  gcc/internal-fn.h|  1 +
>  gcc/match.pd | 24 
>  gcc/testsuite/gcc.target/i386/pr102080.c | 16 
>  4 files changed, 55 insertions(+), 8 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/i386/pr102080.c
>
> diff --git a/gcc/internal-fn.c b/gcc/internal-fn.c
> index 1360a00f0b9..8b2b65db1a7 100644
> --- a/gcc/internal-fn.c
> +++ b/gcc/internal-fn.c
> @@ -4102,6 +4102,28 @@ expand_internal_call (gcall *stmt)
>expand_internal_call (gimple_call_internal_fn (stmt), stmt);
>  }
>
> +/* Check cond_op for vector modes since vectorized_internal_fn_supported_p
> +   doesn't check if mask type matches.  */
> +bool
> +cond_vectorized_internal_fn_supported_p (internal_fn ifn, tree type,
> +tree mask_type)
> +{
> +  if (!vectorized_internal_fn_supported_p (ifn, type))
> +return false;
> +
> +  machine_mode mask_mode;
> +  machine_mode vmode = TYPE_MODE (type);
> +  int size1, size2;
> +  if (VECTOR_MODE_P (vmode)
> +  && targetm.vectorize.get_mask_mode (vmode).exists(&mask_mode)
> +  && GET_MODE_SIZE (mask_mode).is_constant (&size1)
> +  && GET_MODE_SIZE (TYPE_MODE (mask_type)).is_constant (&size2)
> +  && size1 != size2)

Why do we check for equal size rather than just mode equality which
I think would work for non-constant sized modes as well?  And when
using sizes you'd instead use maybe_ne (GET_MODE_SIZE (mask_mode),
GET_MODE_SIZE (TYPE_MODE (mask_type)))

Thanks,
Richard.

> +return false;
> +
> +  return true;
> +}
> +
>  /* If TYPE is a vector type, return true if IFN is a direct internal
> function that is supported for that type.  If TYPE is a scalar type,
> return true if IFN is a direct internal function that is supported for
> diff --git a/gcc/internal-fn.h b/gcc/internal-fn.h
> index 19d0f849a5a..f0aea00103c 100644
> --- a/gcc/internal-fn.h
> +++ b/gcc/internal-fn.h
> @@ -236,5 +236,6 @@ extern void expand_PHI (internal_fn, gcall *);
>  extern void expand_SHUFFLEVECTOR (internal_fn, gcall *);
>
>  extern bool vectorized_internal_fn_supported_p (internal_fn, tree);
> +extern bool cond_vectorized_internal_fn_supported_p (internal_fn, tree, 
> tree);
>
>  #endif
> diff --git a/gcc/match.pd b/gcc/match.pd
> index e5bbb123a6a..72b1bc674db 100644
> --- a/gcc/match.pd
> +++ b/gcc/match.pd
> @@ -6987,14 +6987,18 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
>   cond_op (COND_BINARY)
>   (simplify
>(vec_cond @0 (view_convert? (uncond_op@4 @1 @2)) @3)
> -  (with { tree op_type = TREE_TYPE (@4); }
> -   (if (vectorized_internal_fn_supported_p (as_internal_fn (cond_op), 
> op_type)
> +  (with { tree op_type = TREE_TYPE (@4);
> + tree mask_type = TREE_TYPE (@0); }
> +   (if (cond_vectorized_internal_fn_supported_p (as_internal_fn (cond_op),
> +op_type, mask_type)
> && element_precision (type) == element_precision (op_type))
>  (view_convert (cond_op @0 @1 @2 (view_convert:op_type @3))
>   (simplify
>(vec_cond @0 @1 (view_convert? (uncond_op@4 @2 @3)))
> -  (with { tree op_type = TREE_TYPE (@4); }
> -   (if (vectorized_internal_fn_supported_p (as_internal_fn (cond_op), 
> op_type)
> +  (with { tree op_type = TREE_TYPE (@4);
> + tree mask_type = TREE_TYPE (@0); }
> +   (if (cond_vectorized_internal_fn_supported_p (as_internal_fn (cond_op),
> +op_type, mask_type)
> && element_precision (type) == element_precision (op_type))
>  (view_convert (cond_op (bit_not @0) @2 @3 (view_convert:op_type @1)))
>
> @@ -7003,14 +7007,18 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
>   cond_op (COND_TERNARY)
>   (simplify
>(vec_cond @0 (view_convert? (uncond_op@5 @1 @2 @3)) @4)
> -  (with { tree op_type = TREE_TYPE (@5); }
> -   (if (vectorized_internal_fn_supported_p (as_internal_fn (cond_op), 
> op_type)
> +  (with { tree op_type = TREE_TYPE (@5);

Re: [PATCH v2 1/2] MIPS: use mips_isa enum instead hardcoded numbers

2021-08-30 Thread Jeff Law via Gcc-patches





On 8/30/2021 12:52 AM, Xi Ruoyao via Gcc-patches wrote:

These two patches look good to me.  Still, need a maintainer's approval.

Consider them approved.

Thanks,
Jeff

Re: [committed] Reduce vector comparison of uniform vectors to a scalar comparison

2021-08-30 Thread Jeff Law via Gcc-patches





On 8/30/2021 12:51 AM, Richard Biener wrote:

On Fri, Aug 27, 2021 at 9:31 PM Jeff Law via Gcc-patches
 wrote:


I was working some aspects of our port's vector support and stumbled
across a bit of silly code.  We were comparing two vectors that were
both uniform.

We can simplify a comparison of uniform vectors to a comparison of a
representative element from each.  That in turn may expose const/copy
propagation opportunities or even jump threading opportunities.

And that's precisely what this patch does inside DOM.  At the start of
statement processing we see if we can reduce a vector comparison to a
scalar comparison.

You can see this in pr95308.C on x86.   As we enter dom3 we have:


;;   basic block 2, loop depth 0
;;pred:   ENTRY
e.1_1 = e;
_27 = f_26(D) != 0;
_163 = e.1_1 != 0;
_164 = _27 & _163;
_196 = _164 ? -1 : 0;
vect_cst__194 = {_196, _196, _196, _196, _196, _196, _196, _196};

[ ... ]

;;   basic block 8, loop depth 1
;;pred:   7
;;6
if (vect_cst__194 == { 0, 0, 0, 0, 0, 0, 0, 0 })
  goto ; [100.00%]
else
  goto ; [20.00%]
;;succ:   10
;;9


There's another similar comparison later.  We transform the comparison into:

;;   basic block 8, loop depth 1
;;pred:   7
;;6
if (_196 == 0)
  goto ; [100.00%]
else
  goto ; [20.00%]
;;succ:   13
;;9

There's nice secondary effects that show up after the transformation as
well.

It's an interesting case indeed, but I think a match.pd rule would have
been better to address this.  Sth like

(for cmp (eq ne)
   (simplify
 (cmp vec_same_elem_p@0 vec_same_elem_p@1)
 (if (INTEGRAL_TYPE_P (type)
  && (TREE_CODE (type) == BOOLEAN_TYPE
 || TYPE_PRECISION (type) == 1))
  (cmp { uniform_vector_p (@0); } { uniform_vector_p (@1); }

should do the trick.  That makes the transform available to more places
(and DOM should also pick it up this way).

Good point.  I'll give that a whirl.

jeff

Re: [PATCH] libffi: Fix MIPS r6 support

2021-08-30 Thread Jeff Law via Gcc-patches





On 8/30/2021 2:47 AM, YunQiang Su wrote:

在 2021/8/30 5:00, Jeff Law 写道:



On 8/28/2021 1:23 AM, Xi Ruoyao wrote:

On Fri, 2021-08-27 at 15:28 -0600, Jeff Law via Gcc-patches wrote:


On 8/26/2021 10:58 PM, YunQiang Su wrote:

for some instructions, MIPS r6 uses different encoding other than
the previous releases.

1. mips/n32.S disable .set mips4: since it casuses old insn encoding
 is used.
 https://github.com/libffi/libffi/pull/396
2. mips/ffi.c: the encoding for JR is hardcoded: we need to use
 different value for r6 and pre-r6.
 https://github.com/libffi/libffi/pull/401

libffi/
 PR libffi/83636
 * src/mips/n32.S: disable .set mips4
 * src/mips/ffi.c: use different JR encoding for r6.

These should go to the upstream libffi project.  Once accepted there
you
can add them to GCC.

Hi Jeff,

The two PRs are already merged, and released since libffi-3.3.0 (now 
the

upstream latest release is 3.4.2).
ACK.  Thanks for confirming.  It's always OK to cherrypick/backport 
from libffi back to GCC.




I don't have a MIPSr6 so I can't test though.
Understood.   Me neither, but I really should get a tiny chroot for 
mipsr6 so that my tester can validate it regularly.




We have port Debian to MIPS64r6el.
http://58.246.137.130:20180/tools/tarball/
You can use both buster(Debian 10) or bullseye (Debian 11).

And both qemu-system and qemu-user can work.
Understood.  I need the minimal chroot to compress down under 100M to 
make gitlab happy (that's where my scripts & jenkins system expect to 
find them).  My scripts to build the chroots also ensure a consistent 
set of packages & version #s across the different systems being 
tested.   I've just never gotten around to building one for mips64r6.  I 
could probably take yours trim out unnecessary stuff and use it for 
now.  Getting a bootstrap test on the target weekly is definitely helpful.


jeff

Re: [PATCH v2] x86-64: Add ABI warning for 64-bit vectors

2021-08-30 Thread Jakub Jelinek via Gcc-patches

On Sun, Aug 29, 2021 at 12:11:23PM -0700, H.J. Lu wrote:
> --- a/gcc/config/i386/i386.c
> +++ b/gcc/config/i386/i386.c
> @@ -1840,6 +1840,54 @@ init_cumulative_args (CUMULATIVE_ARGS *cum,  /* 
> Argument info to initialize */
>cfun->machine->arg_reg_available = (cum->nregs > 0);
>  }
>  
> +/* Return the single 64-bit vector type of TYPE.  */
> +
> +static const_tree
> +single_m64_base_type (const_tree type)
> +{
> +  if ((TREE_CODE (type) == RECORD_TYPE
> +   || TREE_CODE (type) == UNION_TYPE
> +   || TREE_CODE (type) == QUAL_UNION_TYPE)
> +  && int_size_in_bytes (type) == 8)
> +{
> +  const_tree field;
> +  const_tree first_field = nullptr;
> +
> +  for (field = TYPE_FIELDS (type);
> +field;
> +field = DECL_CHAIN (field))
> + if (TREE_CODE (field) == FIELD_DECL)
> +   {
> + if (TREE_TYPE (field) == error_mark_node)
> +   continue;
> +
> + /* Skip if structure has more than one field.  */
> + if (first_field)
> +   return nullptr;
> +
> + first_field = field;
> +   }
> +
> +  /* Skip if structure doesn't have any fields.  */
> +  if (!first_field)
> + return nullptr;

Is this an attempt to emulate compute_record_mode or something else?
How should it treat zero-width bitfields (either the C ones kept
in the structures or C++ ones formerly removed from them and now no longer)?
compute_record_mode actually has more complicated details...

Jakub

rs6000: Backports of mode promote patches to GCC 11

2021-08-30 Thread Segher Boessenkool

Hi!

I have backported 9080a3bf2329, a3f6bd789149, and f0529d96f567 to the
GCC 11 branch.  This solves PR102062, but is a >2% performance win for
such a trivial patch, too, which is enough reason on its own :-)


Segher

[PING #2][PATCH] enable ranger and caching in pass_waccess

2021-08-30 Thread Martin Sebor via Gcc-patches


Ping: https://gcc.gnu.org/pipermail/gcc-patches/2021-August/578135.html

Are there any further comments on the patch?

Richard, you were kind enough to review the first two patches in this
series.  Would you mind doing the same for this one?  It continues in
the same vein.

Martin

On 8/25/21 3:26 PM, Martin Sebor wrote:

On 8/25/21 10:14 AM, Andrew MacLeod wrote:

On 8/25/21 11:20 AM, Martin Sebor wrote:

Ping: Andrew, did I answer your questions?  Do you (or anyone else)
have any other comments on the latest patch below?

https://gcc.gnu.org/pipermail/gcc-patches/2021-August/577865.html



I wasn't attempting to block it, its outside my review purview..

I was merely commenting that you should not need a pointer to a 
range_query at all anymore









Are you planning to transition to using the get_range_query() 
interface instead of keeping a range_query pointer in the 
pointer_query class?


This pass and to a smaller extent the pointer_query class that's
used by it and the strlen pass are still a work in progress.
I also still need to convert the strlen pass to use Ranger and
I expect it will take some changes to pointer_query.  So at that
point, if going through get_range_query (cfun) everywhere is what
you recommend, I'm happy to do it.

absolutely. you should not need to even know whether you have a ranger 
instance running or not. get_range_query will give you whichever is 
active, and there is ALWAYS something active.. defaulting to the 
global version.


the code in get_range() seems to be the end of the call chain which 
uses the pointer and should be consolidated to something much simpler


  if (rvals && stmt)
     {
   if (!rvals->range_of_expr (vr, val, stmt))
     return NULL_TREE;

     <...>

   // ?? This entire function should use get_range_query or
    get_global_range_query (),
   // instead of doing something different for RVALS and global 
ranges.


   if (!get_global_range_query ()->range_of_expr (vr, val) ||
    vr.undefined_p ())
     return NULL_TREE;


This entire section can boil down to something like

if (!get_range_query (cfun)->range_of_expr (vr, val, stmt))
   return NULL;


So get_range_query (cfun) will never be null?  And no special handling
of INTEGER_CST is needed, or checking for SSA_NAME.  Nice.  Attached
is another revision of the same patch with this function simplified.

(FYI: I expect the whole function to eventually go away, and to make
other similar simplifications, but I'm not there yet.)


PS There has been an effort to get rid of global variables from GCC,
or, as the first step, to avoid accessing them directly(*).  If and
when that happens, it seems like each pass will have to store either
the ranger instance as a member (directly or indirectly, via a member
of a class that stores it) or the function passed to pass::execute()
if it wants to access either.

[*] https://gcc.gnu.org/pipermail/gcc-patches/2021-June/573975.html
The patch at the link above wasn't approved but IIUC removing globals
from GCC is still a goal.


I have no idea what direction that is going, but the net effect will 
be the same in the end.  You'll be calling get_range_query() with 
either the function pointer, or the pass pointer, or something.. but 
you should never need to pass around  a pointer to either a ranger or 
range-query. As I said earlier, this will likely be a pass property 
and accessed thru the pass eventually... but until then, use what we 
have.. cfun.  It'll be trivial to transition down the road to whatever 
the ultimate solution is.


Okay, I'll do that in my future changes.

Martin

[committed] libphobos: Compile configure tests with -fno-druntime

2021-08-30 Thread Iain Buclaw via Gcc-patches

Hi,

This patch changes WITH_LOCAL_DRUNTIME to build with `-fno-druntime'.
The D tests done at configure-time for libphobos don't require a
functional D run-time, so don't enable any run-time features.

Bootstrapped and regression tested on x86_64-linux-gnu/-m32/-mx32, and
committed to mainline.

Regards,
Iain.

---
libphobos/ChangeLog:

* configure: Regenerate.
* m4/autoconf.m4 (AC_LANG_PROGRAM): Declare module name 'object'.
* m4/gcc_support.m4 (WITH_LOCAL_DRUNTIME): Compile tests with
-fno-druntime.
---
 libphobos/configure | 28 ++--
 libphobos/m4/autoconf.m4|  2 +-
 libphobos/m4/gcc_support.m4 |  2 +-
 3 files changed, 16 insertions(+), 16 deletions(-)

diff --git a/libphobos/configure b/libphobos/configure
index b1c8ecb5673..14298a0dc4f 100755
--- a/libphobos/configure
+++ b/libphobos/configure
@@ -5382,7 +5382,7 @@ fi
 
 
   gdc_save_DFLAGS=$GDCFLAGS
-  GDCFLAGS="-fno-moduleinfo -nostdinc -I $phobos_cv_abs_srcdir/libdruntime  
$GDCFLAGS"
+  GDCFLAGS="-fno-druntime -nostdinc -I $phobos_cv_abs_srcdir/libdruntime  
$GDCFLAGS"
 
   ac_ext=d
 ac_compile='$GDC -c $GDCFLAGS conftest.$ac_ext >&5'
@@ -5392,7 +5392,7 @@ ac_compiler_gnu=yes
 { $as_echo "$as_me:${as_lineno-$LINENO}: checking If $GDC can compile D 
sources" >&5
 $as_echo_n "checking If $GDC can compile D sources... " >&6; }
 cat > conftest.$ac_ext <<_ACEOF
-module mod;
+module object;
 
 
 extern(C) int main() {
@@ -12097,7 +12097,7 @@ CC="$lt_save_CC"
 
 
   gdc_save_DFLAGS=$GDCFLAGS
-  GDCFLAGS="-fno-moduleinfo -nostdinc -I $phobos_cv_abs_srcdir/libdruntime  
$GDCFLAGS"
+  GDCFLAGS="-fno-druntime -nostdinc -I $phobos_cv_abs_srcdir/libdruntime  
$GDCFLAGS"
 
 
 # Source file extension for D test sources.
@@ -14089,7 +14089,7 @@ fi
 
 
   gdc_save_DFLAGS=$GDCFLAGS
-  GDCFLAGS="-fno-moduleinfo -nostdinc -I $phobos_cv_abs_srcdir/libdruntime 
-nophoboslib $GDCFLAGS"
+  GDCFLAGS="-fno-druntime -nostdinc -I $phobos_cv_abs_srcdir/libdruntime 
-nophoboslib $GDCFLAGS"
 
 ac_ext=d
 ac_compile='$GDC -c $GDCFLAGS conftest.$ac_ext >&5'
@@ -14098,7 +14098,7 @@ ac_compiler_gnu=yes
 
 GDCFLAGS="$GDCFLAGS -g -Werror -ffunction-sections -fdata-sections"
 cat > conftest.$ac_ext <<_ACEOF
-module mod;
+module object;
 int foo; void bar() { }
 
 extern(C) int main() {
@@ -14562,7 +14562,7 @@ ac_compiler_gnu=$ac_cv_c_compiler_gnu
 
 
   gdc_save_DFLAGS=$GDCFLAGS
-  GDCFLAGS="-fno-moduleinfo -nostdinc -I $phobos_cv_abs_srcdir/libdruntime 
-nophoboslib $GDCFLAGS"
+  GDCFLAGS="-fno-druntime -nostdinc -I $phobos_cv_abs_srcdir/libdruntime 
-nophoboslib $GDCFLAGS"
 
   ac_ext=d
 ac_compile='$GDC -c $GDCFLAGS conftest.$ac_ext >&5'
@@ -14576,7 +14576,7 @@ if ${ac_cv_search_malloc+:} false; then :
 else
   ac_func_search_save_LIBS=$LIBS
 cat > conftest.$ac_ext <<_ACEOF
-module mod;
+module object;
  extern(C) int malloc();
 
 extern(C) int main() {
@@ -14622,7 +14622,7 @@ if ${ac_cv_search_pthread_create+:} false; then :
 else
   ac_func_search_save_LIBS=$LIBS
 cat > conftest.$ac_ext <<_ACEOF
-module mod;
+module object;
  extern(C) int pthread_create();
 
 extern(C) int main() {
@@ -14668,7 +14668,7 @@ if ${ac_cv_search_cosf+:} false; then :
 else
   ac_func_search_save_LIBS=$LIBS
 cat > conftest.$ac_ext <<_ACEOF
-module mod;
+module object;
  extern(C) int cosf();
 
 extern(C) int main() {
@@ -14714,7 +14714,7 @@ if ${ac_cv_search_clock_gettime+:} false; then :
 else
   ac_func_search_save_LIBS=$LIBS
 cat > conftest.$ac_ext <<_ACEOF
-module mod;
+module object;
  extern(C) int clock_gettime();
 
 extern(C) int main() {
@@ -14765,7 +14765,7 @@ $as_echo_n "checking for atomic builtins for byte... " 
>&6; }
 else
 
 cat > conftest.$ac_ext <<_ACEOF
-module mod;
+module object;
 import gcc.builtins;
 
 extern(C) int main() {
@@ -14799,7 +14799,7 @@ $as_echo_n "checking for atomic builtins for short... " 
>&6; }
 else
 
 cat > conftest.$ac_ext <<_ACEOF
-module mod;
+module object;
 import gcc.builtins;
 
 extern(C) int main() {
@@ -14833,7 +14833,7 @@ $as_echo_n "checking for atomic builtins for int... " 
>&6; }
 else
 
 cat > conftest.$ac_ext <<_ACEOF
-module mod;
+module object;
 import gcc.builtins;
 
 extern(C) int main() {
@@ -14867,7 +14867,7 @@ $as_echo_n "checking for atomic builtins for long... " 
>&6; }
 else
 
 cat > conftest.$ac_ext <<_ACEOF
-module mod;
+module object;
 import gcc.builtins;
 
 extern(C) int main() {
diff --git a/libphobos/m4/autoconf.m4 b/libphobos/m4/autoconf.m4
index f46a7809f43..f0ca947478c 100644
--- a/libphobos/m4/autoconf.m4
+++ b/libphobos/m4/autoconf.m4
@@ -27,7 +27,7 @@ AU_DEFUN([AC_LANG_D], [AC_LANG(D)])
 # AC_LANG_PROGRAM(D)([PROLOGUE], [BODY])
 # ---
 m4_define([AC_LANG_PROGRAM(D)],
-[module mod;
+[module object;
 $1
 
 extern(C) int main() {
diff --git a/libphobos/m4/gcc_support.m4 b/libphobos/m4/gcc_support.m4
index 0903ed4b45f..cc1acb4a230 100644
--- a/libphobos/m4/gcc_support.m4
+++ b/libphobos/m4/

[PATCH] c++: New way to sanitize destructor calls [PR101355]

2021-08-30 Thread Dan Klishch via Gcc-patches


Hi!

The following patch changes the way destructor calls are sanitized. Currently,
they are sanitized just like any other member calls, transforming `a::~a(this)'
to `a::~a(.UBSAN_NULL(SAVE_EXPR(this)), SAVE_EXPR(this))'. However, this is a
problem for coroutines. In some cases, a destructor call is cloned to another
location, so then the saved expression is not evaluated before the second call
to the destructor.

The patch gets rid of SAVE_EXPRs by creating a temporary variable, so the calls
are transformed to something resembling:
{
  void *ptr;
  ptr = this;
  .UBSAN_NULL(ptr);
  a::~a(ptr);
}
Unfortunately, the implementation is not so straight-forward. Firstly, we do not
want to sanitize again already sanitized call. I accomplish this by creating a
hash set of instrumented expressions on the way to the tree's root. Secondly,
the affected calls are cloned, so I need to explicitly copy the tree node to
change it's argument. I am not sure if the proposed solution is optimal and will
not cause the creation of an exponential number of nodes.

The patch survives bootstrapping and regression testing on x86_64-pc-linux-gnu.
If approved, I would like to note that I do not have write access to the
repository and this is my first contribution to GCC.

2021-08-30  Dan Klishch  

PR c++/101355
gcc/testsuite/
* g++.dg/coroutines/pr101355.C: New test.

gcc/c-family/
* c-ubsan.h (ubsan_maybe_instrument_member_call): Change prototype.
* c-ubsan.c: Add new include.
(ubsan_should_instrument_reference_or_call): Add new function.
(ubsan_maybe_instrument_reference_or_call): Add possibility to convert
call-expr to bind-expr, so it would not use save-expr.
(ubsan_maybe_instrument_member_call): Add flag to allow using new
functionality of the previous.
(ubsan_maybe_instrument_reference): Fix arguments of call to
ubsan_maybe_instrument_reference_or_call.

gcc/cp/
* cp-gimplify.c (struct cp_genericize_data): Add instrumented_dtor_set
field.
(cp_genericize_r): Tell ubsan to instrument destructors without using
save-expr.
(cp_genericize_tree): Add initialization of instrumented_dtor_set.


diff --git a/gcc/c-family/c-ubsan.c b/gcc/c-family/c-ubsan.c
index 12a7bca5c32..593836984a5 100644
--- a/gcc/c-family/c-ubsan.c
+++ b/gcc/c-family/c-ubsan.c
@@ -23,6 +23,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "coretypes.h"
 #include "tm.h"
 #include "c-family/c-common.h"
+#include "tree-iterator.h"
 #include "ubsan.h"
 #include "c-family/c-ubsan.h"
 #include "stor-layout.h"
@@ -398,22 +399,15 @@ ubsan_maybe_instrument_array_ref (tree *expr_p, bool 
ignore_off_by_one)
 }
 }

-static tree
-ubsan_maybe_instrument_reference_or_call (location_t loc, tree op, tree ptype,
- enum ubsan_null_ckind ckind)
+static bool
+ubsan_should_instrument_reference_or_call (tree op, tree ptype,
+  unsigned int &mina)
 {
-  if (!sanitize_flags_p (SANITIZE_ALIGNMENT | SANITIZE_NULL)
-  || current_function_decl == NULL_TREE)
-return NULL_TREE;
-
-  tree type = TREE_TYPE (ptype);
-  tree orig_op = op;
   bool instrument = false;
-  unsigned int mina = 0;

   if (sanitize_flags_p (SANITIZE_ALIGNMENT))
 {
-  mina = min_align_of_type (type);
+  mina = min_align_of_type (TREE_TYPE (ptype));
   if (mina <= 1)
mina = 0;
 }
@@ -454,19 +448,68 @@ ubsan_maybe_instrument_reference_or_call (location_t loc, 
tree op, tree ptype,
instrument = true;
}
 }
-  if (!instrument)
+  return instrument;
+}
+
+static tree
+ubsan_maybe_instrument_reference_or_call (location_t loc, tree op, tree ptype,
+ enum ubsan_null_ckind ckind,
+ bool use_save_expr, tree *stmt_p,
+ tree *copied_call)
+{
+  if (!sanitize_flags_p (SANITIZE_ALIGNMENT | SANITIZE_NULL)
+  || current_function_decl == NULL_TREE)
 return NULL_TREE;
-  op = save_expr (orig_op);
+
+  unsigned int mina = 0;
+  if (!ubsan_should_instrument_reference_or_call (op, ptype, mina))
+return NULL_TREE;
+
   gcc_assert (POINTER_TYPE_P (ptype));
   if (TREE_CODE (ptype) == REFERENCE_TYPE)
 ptype = build_pointer_type (TREE_TYPE (ptype));
   tree kind = build_int_cst (ptype, ckind);
   tree align = build_int_cst (pointer_sized_int_node, mina);
-  tree call
-= build_call_expr_internal_loc (loc, IFN_UBSAN_NULL, void_type_node,
-   3, op, kind, align);
-  TREE_SIDE_EFFECTS (call) = 1;
-  return fold_build2 (COMPOUND_EXPR, TREE_TYPE (op), call, op);
+
+  if (use_save_expr)
+{
+  op = save_expr (op);
+  tree call = build_call_expr_internal_loc (
+ loc, IFN_UBSAN_NULL, void_type_node, 3, op, kind, align);
+  TREE_SIDE_EFFECTS (call) = 1;
+  return fold_build2 (COMPOUND_

Re: [PATCH] Make sure we're playing with integral modes before call extract_integral_bit_field.

2021-08-30 Thread Joseph Myers

This commit introduces an ICE building libgcc for 32-bit SPARC.

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102133

-- 
Joseph S. Myers
jos...@codesourcery.com

[PATCH] libiberty, configure, Darwin: Avoid detecting deprecated sbrk.

2021-08-30 Thread Iain Sandoe

Hi,

Darwin provides an implementation of sbrk, which is detected by the
libiberty configuration process.

However, (like most of the BSD-derivatives) sbrk/brk are deprecated on
Darwin which leads to build-time warnings.  It seems that the configure
process does not see the deprecation warnings as reason for excluding
the fn.

Darwin should use the malloc-based implementation.

This patch works around the issue by removing sbrk from the functions
searched (for Darwin only, although it’s likely that other BSD-ish ports
might wish to do the same).

Open to more elegant solutions, of course,
tested on powerpc,i686,x86_64-darwin, x86_64, powerpc64- linux,

OK for master?
thanks Iain

Signed-off-by: Iain Sandoe 

libiberty/ChangeLog:

* configure: Regenerate.
* configure.ac: Do not search for sbrk on Darwin.
* xmalloc.c: Do not declare sbrk unless it has been found
by configure.
---
 libiberty/configure| 43 --
 libiberty/configure.ac | 15 +--
 libiberty/xmalloc.c|  2 ++
 3 files changed, 35 insertions(+), 25 deletions(-)

diff --git a/libiberty/configure.ac b/libiberty/configure.ac
index a85ff25501a..4b78c1830c7 100644
--- a/libiberty/configure.ac
+++ b/libiberty/configure.ac
@@ -395,9 +395,16 @@ vars="sys_errlist sys_nerr sys_siglist"
 
 checkfuncs="__fsetlocking canonicalize_file_name dup3 getrlimit getrusage \
  getsysinfo gettimeofday on_exit pipe2 psignal pstat_getdynamic 
pstat_getstatic \
- realpath setrlimit sbrk spawnve spawnvpe strerror strsignal sysconf sysctl \
+ realpath setrlimit spawnve spawnvpe strerror strsignal sysconf sysctl \
  sysmp table times wait3 wait4"
 
+# Darwin has sbrk, but it is deprecated and that produces build-time warnings
+# so do not check for it.
+case "${host}" in
+  *-*-darwin*) ;;
+  *) checkfuncs="$checkfuncs sbrk"
+esac
+
 # These are neither executed nor required, but they help keep
 # autoheader happy without adding a bunch of text to acconfig.h.
 if test "x" = "y"; then
@@ -695,7 +702,11 @@ if test -z "${setobjs}"; then
 
   AC_CHECK_FUNCS($checkfuncs)
   AC_CHECK_DECLS([basename(char *), ffs, asprintf, vasprintf, snprintf, 
vsnprintf])
-  AC_CHECK_DECLS([calloc, getenv, getopt, malloc, realloc, sbrk])
+  AC_CHECK_DECLS([calloc, getenv, getopt, malloc, realloc])
+  case "${host}" in
+  *-*-darwin*) ;; # Darwin's sbrk implementation is deprecated.
+  *) AC_CHECK_DECLS([sbrk]);;
+  esac
   AC_CHECK_DECLS([strtol, strtoul, strtoll, strtoull])
   AC_CHECK_DECLS([strverscmp])
   AC_CHECK_DECLS([strnlen])
diff --git a/libiberty/xmalloc.c b/libiberty/xmalloc.c
index 85826c1e10f..b9e23c38643 100644
--- a/libiberty/xmalloc.c
+++ b/libiberty/xmalloc.c
@@ -87,7 +87,9 @@ extern "C" {
 void *malloc (size_t);
 void *realloc (void *, size_t);
 void *calloc (size_t, size_t);
+#ifdef HAVE_SBRK
 void *sbrk (ptrdiff_t);
+#endif
 #ifdef __cplusplus
 }
 #endif /* __cplusplus */

[committed] hppa: Fix libgfortran build on hppa-hp-hpux[01]

2021-08-30 Thread John David Anglin

The following change fixes the build of libgfortran on hppa*-hp-hpux[01]*.

Tested on hppa64-hp-hpux11.11 and hppa2.0w-hp-hpux11.11.  Committed to trunk.

Dave
---

Fix libgfortran build on hppa*-hp-hpux[01]*

Add include hack to define PRIdPTR, PRIiPTR, PRIoPTR, PRIuPTR, PRIxPTR
and PRIXPTR in inttypes.h.

2021-08-30  John David Anglin  

fixincludes/ChangeLog:
* inclhack.def (hpux_c99_inttypes5): New hack to define PRIdPTR, etc.
* fixincl.x: Regenerate.
* tests/base/inttypes.h: Update.

diff --git a/fixincludes/inclhack.def b/fixincludes/inclhack.def
index b7ad6982e96..46e3b8c993a 100644
--- a/fixincludes/inclhack.def
+++ b/fixincludes/inclhack.def
@@ -2668,6 +2668,34 @@ fix = {
"#define SCNxMAX SCNx32\n";
 };

+/*
+ * Fix missing PRIdPTR defines in inttypes.h
+ */
+fix = {
+hackname  = hpux_c99_inttypes5;
+mach  = "hppa*-hp-hpux11.[01]*";
+files = inttypes.h;
+select= "#ifndef[ \t]_KERNEL[ \t]*\n";
+c_fix = format;
+c_fix_arg = "#ifndef __LP64__\n"
+   "#define PRIdPTR\t\tPRId32\n"
+   "#define PRIiPTR\t\tPRIi32\n"
+   "#define PRIoPTR\t\tPRIo32\n"
+   "#define PRIuPTR\t\tPRIu32\n"
+   "#define PRIxPTR\t\tPRIx32\n"
+   "#define PRIXPTR\t\tPRIX32\n"
+   "#else\n"
+   "#define PRIdPTR\t\tPRId64\n"
+   "#define PRIiPTR\t\tPRIi64\n"
+   "#define PRIoPTR\t\tPRIo64\n"
+   "#define PRIuPTR\t\tPRIu64\n"
+   "#define PRIxPTR\t\tPRIx64\n"
+   "#define PRIXPTR\t\tPRIX64\n"
+   "#endif\n\n"
+   "#ifndef _KERNEL\n";
+test_text = "#ifndef _KERNEL\n";
+};
+
 /*
  *  Fix hpux broken ctype macros
  */
diff --git a/fixincludes/tests/base/inttypes.h 
b/fixincludes/tests/base/inttypes.h
index 144ea6596e8..9c1f93eb736 100644
--- a/fixincludes/tests/base/inttypes.h
+++ b/fixincludes/tests/base/inttypes.h
@@ -42,3 +42,25 @@
 #define SCNuMAX SCNu32

 #endif  /* HPUX_C99_INTTYPES4_CHECK */
+
+
+#if defined( HPUX_C99_INTTYPES5_CHECK )
+#ifndef __LP64__
+#define PRIdPTRPRId32
+#define PRIiPTRPRIi32
+#define PRIoPTRPRIo32
+#define PRIuPTRPRIu32
+#define PRIxPTRPRIx32
+#define PRIXPTRPRIX32
+#else
+#define PRIdPTRPRId64
+#define PRIiPTRPRIi64
+#define PRIoPTRPRIo64
+#define PRIuPTRPRIu64
+#define PRIxPTRPRIx64
+#define PRIXPTRPRIX64
+#endif
+
+#ifndef _KERNEL
+
+#endif  /* HPUX_C99_INTTYPES5_CHECK */

[PATCH] gdb: Add a dependency between gdb and libbacktrace

2021-08-30 Thread Andrew Burgess

I plan to make use of libbacktrace within GDB.  I believe that the
patch below needs to be merged into GCCs toplevel directory and then
back-ported to the binutils-gdb repository.

Is this OK to merge?

Thanks,
Andrew

---

GDB is going to start using libbacktrace, so add a build dependency
between the two projects.  This change needs to be added into the GCC
toplevel files, and then back-ported to the binutils-gdb repository.

ChangeLog:

* Makefile.def: Add all-gdb dependency on all-libbacktrace.
* Makefile.in: Regenerate.
---
 ChangeLog| 5 +
 Makefile.def | 1 +
 Makefile.in  | 1 +
 3 files changed, 7 insertions(+)

diff --git a/Makefile.def b/Makefile.def
index fbfdb6fee08..de3e0052106 100644
--- a/Makefile.def
+++ b/Makefile.def
@@ -423,6 +423,7 @@ dependencies = { module=all-gdb; on=all-sim; };
 dependencies = { module=all-gdb; on=all-libdecnumber; };
 dependencies = { module=all-gdb; on=all-libtermcap; };
 dependencies = { module=all-gdb; on=all-libctf; };
+dependencies = { module=all-gdb; on=all-libbacktrace; };
 
 // Host modules specific to gdbserver.
 dependencies = { module=configure-gdbserver; on=all-gnulib; };
diff --git a/Makefile.in b/Makefile.in
index 5c85fcc5dd0..61af99dc75a 100644
--- a/Makefile.in
+++ b/Makefile.in
@@ -61237,6 +61237,7 @@ all-gdb: maybe-all-libiconv
 all-gdb: maybe-all-opcodes
 all-gdb: maybe-all-libdecnumber
 all-gdb: maybe-all-libctf
+all-gdb: maybe-all-libbacktrace
 all-gdbserver: maybe-all-libiberty
 configure-gdbsupport: maybe-configure-intl
 all-gdbsupport: maybe-all-intl
-- 
2.25.4

Re: [EXTERNAL] Re: [PATCH] Propagate get_nonzero_bits information in division [PR77980]

2021-08-30 Thread Victor Tong via Gcc-patches

Thanks Jeff. I've reached out to Roger to see if the fix would be better suited 
in CCP. If that isn't the right spot, I'll reach out to Aldy and Andrew about 
getting the fix in VRP/Ranger.


From: Jeff Law 
Sent: Sunday, August 22, 2021 8:11 PM
To: Victor Tong ; gcc-patches@gcc.gnu.org 
; pins...@gcc.gnu.org 
Subject: [EXTERNAL] Re: [PATCH] Propagate get_nonzero_bits information in 
division [PR77980] 
 


On 7/26/2021 6:45 PM, Victor Tong via Gcc-patches wrote:
> This change enables the "t1 != 0" check to be optimized away in this code:
>
> int x1 = 0;
> unsigned int x2 = 1;
>
> int main ()
> {
>  int t1 = x1*(1/(x2+x2));
>  if (t1 != 0) __builtin_abort();
>  return 0;
> }
>
> The change utilizes the VRP framework to propagate the get_nonzero_bits 
> information from the "x2+x2" expression to the "1/(x2+x2)" division 
> expression. Specifically, the framework knows that the least significant bit 
> of the "x2+x2" expression must be zero.
>
> The get_nonzero_bits information of the left hand side and right hand side of 
> expressions needed to be passed down to operator_div::wi_fold() in the VRP 
> framework. The majority of this change involves adding two additional 
> parameters to propagate this information. There are future opportunities to 
> use the non zero bit information to perform better optimizations in other 
> types of expressions.
>
> The changes were tested against x86_64-pc-linux-gnu and all tests in "make -k 
> check" passed.
>
> The original approach was to implement a match.pd pattern to support this but 
> the pattern wasn't being triggered. More context is available in: 
> https://nam06.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgcc.gnu.org%2Fbugzilla%2Fshow_bug.cgi%3Fid%3D77980&data=04%7C01%7Cvitong%40microsoft.com%7C08dcd27e5482418d559708d965e3a826%7C72f988bf86f141af91ab2d7cd011db47%7C1%7C0%7C637652850726534114%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000&sdata=S1WWWvuXwEIA20pmv%2Fn8kk6xlwv9VwsS%2BTl0jqEHf4w%3D&reserved=0
So you're going to want to sync with Aldy & Andrew as they're the 
experts on the Ranger design & implementation.  This hits the Ranger API 
as well as design questions about how best to tie in the nonzero_bits 
capabilities.

You might also want to reach out to Roger Sayle.  He's been poking 
around in a closely related area, though more focused on the bitwise 
conditional constant propagation rather than Ranger/VRP.  In fact, I 
just acked a patch of his that looks closely related.

https://nam06.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgcc.gnu.org%2Fpipermail%2Fgcc-patches%2F2021-August%2F577888.html&data=04%7C01%7Cvitong%40microsoft.com%7C08dcd27e5482418d559708d965e3a826%7C72f988bf86f141af91ab2d7cd011db47%7C1%7C0%7C637652850726534114%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000&sdata=gn1L1exTW5TjdV8CDBDqB0Z5a4V7EP2tBM2uTF2ihck%3D&reserved=0

Jeff

[pushed] c++: Add warning about missing 'requires'

2021-08-30 Thread Jason Merrill via Gcc-patches

I noticed that concepts-lambda14.C had two useless requires-expressions:

  static_assert(requires { C; });

always succeeds, because C is always a valid expression for any type,
regardless of whether C is satisfied for a particular type.  Presumably the
user means

  static_assert(requires { requires C; });

to make the C a nested-requirement.  Of course,

  static_assert(C);

is much simpler and means the same thing; this is more relevant in the
middle of a longer requires-expression, such as the bug this warning found
in cmcstl2:

  template
  META_CONCEPT input_iterator =
  input_or_output_iterator &&
  readable &&
  requires(I& i, const I& ci) {
  typename iterator_category_t;
  derived_from, input_iterator_tag>;
  i++;
  };

where 'requires' is missing before 'derived_from'.

Tested x86_64-pc-linux-gnu, applying to trunk.

gcc/ChangeLog:

* doc/invoke.texi: Document -Wmissing-requires.

gcc/c-family/ChangeLog:

* c.opt: Add -Wmissing-requires.

gcc/cp/ChangeLog:

* parser.c (cp_parser_simple_requirement): Warn about missing
requires.

gcc/testsuite/ChangeLog:

* g++.dg/cpp2a/concepts-lambda14.C: Add expected warnings.
---
 gcc/doc/invoke.texi   | 22 +++
 gcc/c-family/c.opt|  4 
 gcc/cp/parser.c   | 19 
 .../g++.dg/cpp2a/concepts-lambda14.C  |  4 ++--
 4 files changed, 47 insertions(+), 2 deletions(-)

diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index f7bb193b51d..8969bac664d 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -8752,6 +8752,28 @@ s x = @{ @};
 This warning is included in @option{-Wextra}.  To get other @option{-Wextra}
 warnings without this one, use @option{-Wextra 
-Wno-missing-field-initializers}.
 
+@item -Wno-missing-requires
+@opindex Wmissing-requires
+@opindex Wno-missing-requires
+
+By default, the compiler warns about a concept-id appearing as a C++20 
simple-requirement:
+
+@smallexample
+bool satisfied = requires @{ C @};
+@end smallexample
+
+Here @samp{satisfied} will be true if @samp{C} is a valid
+expression, which it is for all T.  Presumably the user meant to write
+
+@smallexample
+bool satisfied = requires @{ requires C @};
+@end smallexample
+
+so @samp{satisfied} is only true if concept @samp{C} is satisfied for
+type @samp{T}.
+
+This warning can be disabled with @option{-Wno-missing-requires}.
+
 @item -Wno-multichar
 @opindex Wno-multichar
 @opindex Wmultichar
diff --git a/gcc/c-family/c.opt b/gcc/c-family/c.opt
index 91929706aff..c5fe90003f2 100644
--- a/gcc/c-family/c.opt
+++ b/gcc/c-family/c.opt
@@ -839,6 +839,10 @@ Wmissing-field-initializers
 C ObjC C++ ObjC++ Var(warn_missing_field_initializers) Warning 
EnabledBy(Wextra)
 Warn about missing fields in struct initializers.
 
+Wmissing-requires
+C++ ObjC++ Var(warn_missing_requires) Init(1) Warning
+Warn about likely missing requires keyword.
+
 Wmultistatement-macros
 C ObjC C++ ObjC++ Var(warn_multistatement_macros) Warning LangEnabledBy(C ObjC 
C++ ObjC++,Wall)
 Warn about unsafe macros expanding to multiple statements used as a body of a 
clause such as if, else, while, switch, or for.
diff --git a/gcc/cp/parser.c b/gcc/cp/parser.c
index a959c71dfa3..797e70ba5bb 100644
--- a/gcc/cp/parser.c
+++ b/gcc/cp/parser.c
@@ -29911,6 +29911,25 @@ cp_parser_simple_requirement (cp_parser *parser)
   if (expr.get_location() == UNKNOWN_LOCATION)
 expr.set_location (start);
 
+  for (tree t = expr; ; )
+{
+  if (TREE_CODE (t) == TRUTH_ANDIF_EXPR
+ || TREE_CODE (t) == TRUTH_ORIF_EXPR)
+   {
+ t = TREE_OPERAND (t, 0);
+ continue;
+   }
+  if (concept_check_p (t))
+   {
+ gcc_rich_location richloc (get_start (start));
+ richloc.add_fixit_insert_before (start, "requires ");
+ warning_at (&richloc, OPT_Wmissing_requires, "testing "
+ "if a concept-id is a valid expression; add "
+ "% to check satisfaction");
+   }
+  break;
+}
+
   return finish_simple_requirement (expr.get_location (), expr);
 }
 
diff --git a/gcc/testsuite/g++.dg/cpp2a/concepts-lambda14.C 
b/gcc/testsuite/g++.dg/cpp2a/concepts-lambda14.C
index bdc893da857..02b6b6a8438 100644
--- a/gcc/testsuite/g++.dg/cpp2a/concepts-lambda14.C
+++ b/gcc/testsuite/g++.dg/cpp2a/concepts-lambda14.C
@@ -11,9 +11,9 @@ void foo() noexcept(!__is_same(T, void)) { }
 
 template
 auto f() {
-  return [](T, bool a = requires { C; }){
+  return [](T, bool a = requires { C; }){ // { dg-warning 
Wmissing-requires }
 static_assert(requires { requires C && (C || C); }); // { 
dg-error "assert" }
-static_assert(requires { C; });
+static_assert(requires { C; }); // { dg-warning Wmissing-requires }
 static_assert(requires { { foo() } noexcept -> C; });
 static_assert(!requires { type

[PATCH] Fix PR 90142: contrib/download_prerequisites uses test ==

2021-08-30 Thread apinski--- via Gcc-patches

From: Andrew Pinski 

Since == is not portable, it is better to use = in contrib/
download_prerequisites.  The only place == was used is inside
the function md5_check which is used only on Mac OS X.

Tested on Mac OS X as:
./contrib/download_prerequisites --md5
Both with all files having the correct checksum and one with a broken one.

contrib/ChangeLog:

* download_prerequisites (md5_check): Replace == inside
test with = to be more portable.
---
 contrib/download_prerequisites | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/contrib/download_prerequisites b/contrib/download_prerequisites
index 8f69b61f5a9..11c283ecb1a 100755
--- a/contrib/download_prerequisites
+++ b/contrib/download_prerequisites
@@ -113,7 +113,7 @@ md5_check() {
   md5_checksum_output=$(md5 -r "${file_to_check}")
   # Grab the text before the first space
   md5_checksum_detected="${md5_checksum_output%% *}"
-  [ "${md5_checksum_expected}" == "${md5_checksum_detected}" ] \
+  [ "${md5_checksum_expected}" = "${md5_checksum_detected}" ] \
 || die "Cannot verify integrity of possibly corrupted file 
${file_to_check}"
   echo "${file_to_check}: OK"
 }
-- 
2.17.1

[PATCH, committed] PR fortran/101349 - ICE in gfc_get_descriptor_field, at fortran/trans-array.c:140

2021-08-30 Thread Harald Anlauf via Gcc-patches

The check of arguments to ALLOCATE did not properly implement
F2008:C628 / F2018:C932, as it excluded unlimited polymophics,
in contrast to the standard text.  Fix this.

Committed Steve's patch as obvious after regtesting on x86_64-pc-linux-gnu.

Thanks,
Harald


Fortran - correct check for constraint F2008:C628 / F2018:C932

gcc/fortran/ChangeLog:

PR fortran/101349
* resolve.c (resolve_allocate_expr): An unlimited polymorphic
argument to ALLOCATE must be ALLOCATABLE or a POINTER.  Fix the
corresponding check.

gcc/testsuite/ChangeLog:

PR fortran/101349
* gfortran.dg/unlimited_polymorphic_33.f90: New test.

diff --git a/gcc/fortran/resolve.c b/gcc/fortran/resolve.c
index f641d0d4dae..d7aa2868cef 100644
--- a/gcc/fortran/resolve.c
+++ b/gcc/fortran/resolve.c
@@ -7829,8 +7829,9 @@ resolve_allocate_expr (gfc_expr *e, gfc_code *code, bool *array_alloc_wo_spec)
 	}
 }

-  /* Check for F08:C628.  */
-  if (allocatable == 0 && pointer == 0 && !unlimited)
+  /* Check for F08:C628 (F2018:C932).  Each allocate-object shall be a data
+ pointer or an allocatable variable.  */
+  if (allocatable == 0 && pointer == 0)
 {
   gfc_error ("Allocate-object at %L must be ALLOCATABLE or a POINTER",
 		 &e->where);
diff --git a/gcc/testsuite/gfortran.dg/unlimited_polymorphic_33.f90 b/gcc/testsuite/gfortran.dg/unlimited_polymorphic_33.f90
new file mode 100644
index 000..8bb8864cb26
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/unlimited_polymorphic_33.f90
@@ -0,0 +1,18 @@
+! { dg-do compile }
+! PR fortran/101349 - ICE in gfc_get_descriptor_field
+! Check constraint F2008:C628 / F2018:C932
+
+subroutine s(x)
+  class(*) :: x(:)
+  allocate (x, source=['abc']) ! { dg-error "must be ALLOCATABLE or a POINTER" }
+end
+
+subroutine t(x)
+  class(*), allocatable :: x(:)
+  allocate (x, source=['abc'])
+end
+
+subroutine u(x)
+  class(*), pointer :: x(:)
+  allocate (x, source=['abc'])
+end

[PATCH, committed] PR fortran/102113 - parsing error in assigned goto

2021-08-30 Thread Harald Anlauf via Gcc-patches

A whitespace issue during parsing.

Committed Steve's patch as obvious after regtesting on x86_64-pc-linux-gnu.

Thanks,
Harald


Fortran - fix whitespace issue during parsing of assigned goto

gcc/fortran/ChangeLog:

PR fortran/102113
* match.c (gfc_match_goto): Allow for whitespace in parsing list
of labels.

gcc/testsuite/ChangeLog:

PR fortran/102113
* gfortran.dg/goto_9.f90: New test.

Re: [PATCH] Make sure we're playing with integral modes before call extract_integral_bit_field.

2021-08-30 Thread Jeff Law via Gcc-patches





On 8/30/2021 1:09 PM, Joseph Myers wrote:

This commit introduces an ICE building libgcc for 32-bit SPARC.

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102133
I've sent Hongtao a testcase which is ICE-ing.  It was mcore-elf, but I 
saw similar failures on a half-dozen architectures before I reverted the 
change locally.


jeff

Re: [PATCH v3 1/6] rs6000: Support SSE4.1 "round" intrinsics

2021-08-30 Thread Paul A. Clarke via Gcc-patches

On Fri, Aug 27, 2021 at 08:44:43AM -0500, Bill Schmidt via Gcc-patches wrote:
> On 8/23/21 2:03 PM, Paul A. Clarke wrote:
> > +   __fpscr_save.__fr = __builtin_mffsl ();
> 
> As pointed out in the v1 review, __builtin_mffsl is enabled (or supposed to
> be) only for POWER9 and later.  This will fail to work on POWER8 and earlier
> when the new builtins support is complete and this is enforced more
> carefully.  Please #ifdef and use __builtin_mffs on earlier processors. 
> Please do this everywhere this occurs.
> 
> I think you got some contradictory guidance on this, but trust me, this will
> break.

The confusing thing is that _builtin_mffsl is explicitly supported on earlier
processors, if I read the code right (from gcc/config/rs6000/rs6000.md):
--
(define_expand "rs6000_mffsl"
  [(set (match_operand:DF 0 "gpc_reg_operand")
(unspec_volatile:DF [(const_int 0)] UNSPECV_MFFSL))]
  "TARGET_HARD_FLOAT"
{
  /* If the low latency mffsl instruction (ISA 3.0) is available use it,
 otherwise fall back to the older mffs instruction to emulate the mffsl
 instruction.  */
  
  if (!TARGET_P9_MISC)
{
  rtx tmp1 = gen_reg_rtx (DFmode);

  /* The mffs instruction reads the entire FPSCR.  Emulate the mffsl 
 instruction using the mffs instruction and masking the result.  */
  emit_insn (gen_rs6000_mffs (tmp1));
...
--

Is that going away?  If so, that would be a possible (undesirable?)
API change, no?

PC

Re: [PATCH] Fix x86/56337 : 1<<28 alignment is broken

2021-08-30 Thread Andrew Pinski via Gcc-patches

On Fri, Jul 23, 2021 at 1:33 PM apinski--- via Gcc-patches
 wrote:
>
> From: Andrew Pinski 
>
> The problem here is the x86_64 back-end uses a signed integer
> for alignment and then divides by BITS_PER_UNIT so if we had
> INT_MIN (which is what 1<<28*8 is), we would get the wrong result.
>
> This fixes the problem by using unsigned for the argument to
> x86_output_aligned_bss and x86_output_aligned_bss.
>
> OK? Bootstrapped and tested on x86_64-linux-gnu.

Ping?

>
> gcc/ChangeLog:
>
> PR target/56337
> * config/i386/i386-protos.h (x86_output_aligned_bss):
> Change align argument to unsigned type.
> (x86_elf_aligned_decl_common): Likewise.
> * config/i386/i386.c (x86_elf_aligned_decl_common): Likewise.
> (x86_output_aligned_bss): Likewise.
> ---
>  gcc/config/i386/i386-protos.h | 4 ++--
>  gcc/config/i386/i386.c| 4 ++--
>  2 files changed, 4 insertions(+), 4 deletions(-)
>
> diff --git a/gcc/config/i386/i386-protos.h b/gcc/config/i386/i386-protos.h
> index 07ac02aff69..fa1b0d0d787 100644
> --- a/gcc/config/i386/i386-protos.h
> +++ b/gcc/config/i386/i386-protos.h
> @@ -325,9 +325,9 @@ struct ix86_address
>  extern int ix86_decompose_address (rtx, struct ix86_address *);
>  extern int memory_address_length (rtx, bool);
>  extern void x86_output_aligned_bss (FILE *, tree, const char *,
> -   unsigned HOST_WIDE_INT, int);
> +   unsigned HOST_WIDE_INT, unsigned);
>  extern void x86_elf_aligned_decl_common (FILE *, tree, const char *,
> -unsigned HOST_WIDE_INT, int);
> +unsigned HOST_WIDE_INT, unsigned);
>
>  #ifdef RTX_CODE
>  extern void ix86_fp_comparison_codes (enum rtx_code code, enum rtx_code *,
> diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
> index 876a19f4c1f..f86d11dfb11 100644
> --- a/gcc/config/i386/i386.c
> +++ b/gcc/config/i386/i386.c
> @@ -837,7 +837,7 @@ x86_64_elf_unique_section (tree decl, int reloc)
>  void
>  x86_elf_aligned_decl_common (FILE *file, tree decl,
> const char *name, unsigned HOST_WIDE_INT size,
> -   int align)
> +   unsigned align)
>  {
>if ((ix86_cmodel == CM_MEDIUM || ix86_cmodel == CM_MEDIUM_PIC)
>&& size > (unsigned int)ix86_section_threshold)
> @@ -858,7 +858,7 @@ x86_elf_aligned_decl_common (FILE *file, tree decl,
>
>  void
>  x86_output_aligned_bss (FILE *file, tree decl, const char *name,
> -   unsigned HOST_WIDE_INT size, int align)
> +   unsigned HOST_WIDE_INT size, unsigned align)
>  {
>if ((ix86_cmodel == CM_MEDIUM || ix86_cmodel == CM_MEDIUM_PIC)
>&& size > (unsigned int)ix86_section_threshold)
> --
> 2.17.1
>

[pushed] c++: fold function template args sooner [PR101460]

2021-08-30 Thread Jason Merrill via Gcc-patches

As discussed in the PR, we were giving a lot of unnecessary errors for this
testcase because we didn't try to do constant evaluation until
convert_nontype_argument, which happens for each of the candidates.  But
when looking at a template-id as the function operand of a call, we can try
to fold arguments before we get into overload resolution.

Tested x86_64-pc-linux-gnu, applying to trunk.

PR c++/101460

gcc/cp/ChangeLog:

* cp-tree.h (cxx_constant_value_sfinae): Declare.
* constexpr.c (cxx_constant_value_sfinae): New.
* pt.c (fold_targs_r, maybe_fold_fn_template_args): New.
(tsubst_copy_and_build) [CALL_EXPR]: Call
maybe_fold_fn_template_args.

gcc/testsuite/ChangeLog:

* g++.dg/template/explicit-args6.C: New test.
---
 gcc/cp/cp-tree.h  |  1 +
 gcc/cp/constexpr.c| 12 
 gcc/cp/pt.c   | 60 +++
 .../g++.dg/template/explicit-args6.C  | 34 +++
 4 files changed, 107 insertions(+)
 create mode 100644 gcc/testsuite/g++.dg/template/explicit-args6.C

diff --git a/gcc/cp/cp-tree.h b/gcc/cp/cp-tree.h
index 75ee88721b4..6a179375a56 100644
--- a/gcc/cp/cp-tree.h
+++ b/gcc/cp/cp-tree.h
@@ -8266,6 +8266,7 @@ extern bool require_constant_expression (tree);
 extern bool require_rvalue_constant_expression (tree);
 extern bool require_potential_rvalue_constant_expression (tree);
 extern tree cxx_constant_value (tree, tree = NULL_TREE);
+extern tree cxx_constant_value_sfinae  (tree, tsubst_flags_t);
 extern void cxx_constant_dtor  (tree, tree);
 extern tree cxx_constant_init  (tree, tree = NULL_TREE);
 extern tree maybe_constant_value   (tree, tree = NULL_TREE, bool = 
false);
diff --git a/gcc/cp/constexpr.c b/gcc/cp/constexpr.c
index b9c006217be..9606719bc73 100644
--- a/gcc/cp/constexpr.c
+++ b/gcc/cp/constexpr.c
@@ -7458,6 +7458,18 @@ cxx_constant_value (tree t, tree decl)
   return cxx_eval_outermost_constant_expr (t, false, true, true, false, decl);
 }
 
+/* As above, but respect SFINAE.  */
+
+tree
+cxx_constant_value_sfinae (tree t, tsubst_flags_t complain)
+{
+  bool sfinae = !(complain & tf_error);
+  tree r = cxx_eval_outermost_constant_expr (t, sfinae, true, true);
+  if (sfinae && !TREE_CONSTANT (r))
+r = error_mark_node;
+  return r;
+}
+
 /* Like cxx_constant_value, but used for evaluation of constexpr destructors
of constexpr variables.  The actual initializer of DECL is not modified.  */
 
diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c
index 020a4bf2f6d..d7d0dce691c 100644
--- a/gcc/cp/pt.c
+++ b/gcc/cp/pt.c
@@ -19465,6 +19465,63 @@ out:
   return r;
 }
 
+/* Subroutine of maybe_fold_fn_template_args.  */
+
+static bool
+fold_targs_r (tree targs, tsubst_flags_t complain)
+{
+  int len = TREE_VEC_LENGTH (targs);
+  for (int i = 0; i < len; ++i)
+{
+  tree &elt = TREE_VEC_ELT (targs, i);
+  if (!elt || TYPE_P (elt)
+ || TREE_CODE (elt) == TEMPLATE_DECL)
+   continue;
+  if (TREE_CODE (elt) == NONTYPE_ARGUMENT_PACK)
+   {
+ if (!fold_targs_r (ARGUMENT_PACK_ARGS (elt), complain))
+   return false;
+   }
+  else if (/* We can only safely preevaluate scalar prvalues.  */
+  SCALAR_TYPE_P (TREE_TYPE (elt))
+  && !glvalue_p (elt)
+  && !TREE_CONSTANT (elt))
+   {
+ elt = cxx_constant_value_sfinae (elt, complain);
+ if (elt == error_mark_node)
+   return false;
+   }
+}
+
+  return true;
+}
+
+/* Try to do constant evaluation of any explicit template arguments in FN
+   before overload resolution, to get any errors only once.  Return true iff
+   we didn't have any problems folding.  */
+
+static bool
+maybe_fold_fn_template_args (tree fn, tsubst_flags_t complain)
+{
+  if (processing_template_decl || fn == NULL_TREE)
+return true;
+  if (fn == error_mark_node)
+return false;
+  if (TREE_CODE (fn) == OFFSET_REF
+  || TREE_CODE (fn) == COMPONENT_REF)
+fn = TREE_OPERAND (fn, 1);
+  if (BASELINK_P (fn))
+fn = BASELINK_FUNCTIONS (fn);
+  if (TREE_CODE (fn) != TEMPLATE_ID_EXPR)
+return true;
+  tree targs = TREE_OPERAND (fn, 1);
+  if (targs == NULL_TREE)
+return true;
+  if (targs == error_mark_node)
+return false;
+  return fold_targs_r (targs, complain);
+}
+
 /* Like tsubst but deals with expressions and performs semantic
analysis.  FUNCTION_P is true if T is the "F" in "F (ARGS)" or
"F (ARGS)".  */
@@ -20343,6 +20400,9 @@ tsubst_copy_and_build (tree t,
&& !mark_used (function, complain) && !(complain & tf_error))
  RETURN (error_mark_node);
 
+   if (!maybe_fold_fn_template_args (function, complain))
+ return error_mark_node;
+
/* Put back tf_decltype for the actual call.  */
complain |= decltype_flag;
 
diff --git a/gcc/testsuite/g++.dg/template/explicit-args6

Re: [PATCH v3 1/6] rs6000: Support SSE4.1 "round" intrinsics

2021-08-30 Thread Bill Schmidt via Gcc-patches


Hi Paul,

On 8/30/21 4:16 PM, Paul A. Clarke wrote:

On Fri, Aug 27, 2021 at 08:44:43AM -0500, Bill Schmidt via Gcc-patches wrote:

On 8/23/21 2:03 PM, Paul A. Clarke wrote:

+   __fpscr_save.__fr = __builtin_mffsl ();

As pointed out in the v1 review, __builtin_mffsl is enabled (or supposed to
be) only for POWER9 and later.  This will fail to work on POWER8 and earlier
when the new builtins support is complete and this is enforced more
carefully.  Please #ifdef and use __builtin_mffs on earlier processors.
Please do this everywhere this occurs.

I think you got some contradictory guidance on this, but trust me, this will
break.

The confusing thing is that _builtin_mffsl is explicitly supported on earlier
processors, if I read the code right (from gcc/config/rs6000/rs6000.md):
--
(define_expand "rs6000_mffsl"
   [(set (match_operand:DF 0 "gpc_reg_operand")
 (unspec_volatile:DF [(const_int 0)] UNSPECV_MFFSL))]
   "TARGET_HARD_FLOAT"
{
   /* If the low latency mffsl instruction (ISA 3.0) is available use it,
  otherwise fall back to the older mffs instruction to emulate the mffsl
  instruction.  */
   
   if (!TARGET_P9_MISC)

 {
   rtx tmp1 = gen_reg_rtx (DFmode);

   /* The mffs instruction reads the entire FPSCR.  Emulate the mffsl
  instruction using the mffs instruction and masking the result.  */
   emit_insn (gen_rs6000_mffs (tmp1));
...
--

Is that going away?  If so, that would be a possible (undesirable?)
API change, no?


Hm, I see.  I missed that in the builtins conversion.  Apparently 
there's nothing in the test suite that verifies this work on P9, which 
is a hole that could use fixing.  This usage isn't documented anywhere 
near the builtin machinery, either.


I'll patch the new builtins code to move this to a more permissive 
stanza and document why.  You can leave your code as is.


Thanks!
Bill



PC

[pushed] c++: preserve location through constexpr

2021-08-30 Thread Jason Merrill via Gcc-patches

While working on the patch for PR101460, I noticed that we were losing the
expression location when folding class prvalue expressions.  The final patch
doesn't fold class prvalues, but this still seems a worthwhile change.  I
don't add location wrappers for scalar prvalues because many callers are
trying to fold them away.

Tested x86_64-pc-linux-gnu, applying to trunk.

gcc/cp/ChangeLog:

* constexpr.c (cxx_eval_outermost_constant_expr): Copy
expr location to result.
---
 gcc/cp/constexpr.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/gcc/cp/constexpr.c b/gcc/cp/constexpr.c
index 9606719bc73..e78fdf021b2 100644
--- a/gcc/cp/constexpr.c
+++ b/gcc/cp/constexpr.c
@@ -7445,6 +7445,11 @@ cxx_eval_outermost_constant_expr (tree t, bool 
allow_non_constant,
}
 }
 
+  /* Remember the original location if that wouldn't need a wrapper.  */
+  if (location_t loc = EXPR_LOCATION (t))
+if (CAN_HAVE_LOCATION_P (r))
+  SET_EXPR_LOCATION (r, loc);
+
   return r;
 }
 

base-commit: a8de832470f78a40a0e2c8de866a471bf74bf0ab
-- 
2.27.0

[pushed] c++: limit instantiation with ill-formed class [PR96286]

2021-08-30 Thread Jason Merrill via Gcc-patches

I noticed that after the static_assert failures in lwg3466.cc, we got
various follow-on errors because we went ahead and tried to instantiate the
promise member functions even after instantiating the class itself ran
into problems.  Interrupting instantiation of the class itself seems likely
to cause error-recovery problems, but preventing instantiation of member
functions seems strictly better for error-recovery.

This doesn't fix any of the specific testcases in PR96286, but addresses
part of that problem space.

Tested x86_64-pc-linux-gnu, applying to trunk.

PR c++/96286

gcc/cp/ChangeLog:

* cp-tree.h (struct lang_type): Add erroneous bit-field.
(CLASSTYPE_ERRONEOUS): New.
* pt.c (limit_bad_template_recursion): Check it.
(instantiate_class_template_1): Set it.

libstdc++-v3/ChangeLog:

* testsuite/30_threads/promise/requirements/lwg3466.cc:
Remove dg-prune-outputs.

gcc/testsuite/ChangeLog:

* g++.dg/template/access2.C: Split struct A.
---
 gcc/cp/cp-tree.h   |  7 ++-
 gcc/cp/pt.c| 14 +-
 gcc/testsuite/g++.dg/template/access2.C|  6 +-
 .../30_threads/promise/requirements/lwg3466.cc |  4 
 4 files changed, 24 insertions(+), 7 deletions(-)

diff --git a/gcc/cp/cp-tree.h b/gcc/cp/cp-tree.h
index 6a179375a56..ce7ca53a113 100644
--- a/gcc/cp/cp-tree.h
+++ b/gcc/cp/cp-tree.h
@@ -2336,6 +2336,7 @@ struct GTY(()) lang_type {
   unsigned has_constexpr_ctor : 1;
   unsigned unique_obj_representations : 1;
   unsigned unique_obj_representations_set : 1;
+  bool erroneous : 1;
 
   /* When adding a flag here, consider whether or not it ought to
  apply to a template instance if it applies to the template.  If
@@ -2344,7 +2345,7 @@ struct GTY(()) lang_type {
   /* There are some bits left to fill out a 32-bit word.  Keep track
  of this by updating the size of this bitfield whenever you add or
  remove a flag.  */
-  unsigned dummy : 5;
+  unsigned dummy : 4;
 
   tree primary_base;
   vec *vcall_indices;
@@ -2660,6 +2661,10 @@ struct GTY(()) lang_type {
 /* Nonzero if a _DECL node requires us to output debug info for this class.  */
 #define CLASSTYPE_DEBUG_REQUESTED(NODE) \
   (LANG_TYPE_CLASS_CHECK (NODE)->debug_requested)
+
+/* True if we saw errors while instantiating this class.  */
+#define CLASSTYPE_ERRONEOUS(NODE) \
+  (LANG_TYPE_CLASS_CHECK (NODE)->erroneous)
 
 /* Additional macros for inheritance information.  */
 
diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c
index d7d0dce691c..fcf3ac31b25 100644
--- a/gcc/cp/pt.c
+++ b/gcc/cp/pt.c
@@ -10885,9 +10885,14 @@ limit_bad_template_recursion (tree decl)
 {
   struct tinst_level *lev = current_tinst_level;
   int errs = errorcount + sorrycount;
-  if (lev == NULL || errs == 0 || !neglectable_inst_p (decl))
+  if (errs == 0 || !neglectable_inst_p (decl))
 return false;
 
+  /* Avoid instantiating members of an ill-formed class.  */
+  if (DECL_CLASS_SCOPE_P (decl)
+  && CLASSTYPE_ERRONEOUS (DECL_CONTEXT (decl)))
+return true;
+
   for (; lev; lev = lev->next)
 if (neglectable_inst_p (lev->maybe_get_node ()))
   break;
@@ -12212,6 +12217,13 @@ instantiate_class_template_1 (tree type)
   finish_struct_1 (type);
   TYPE_BEING_DEFINED (type) = 0;
 
+  /* Remember if instantiating this class ran into errors, so we can avoid
+ instantiating member functions in limit_bad_template_recursion.  We set
+ this flag even if the problem was in another instantiation triggered by
+ this one, as that will likely also cause trouble for member functions.  */
+  if (errorcount + sorrycount > current_tinst_level->errors)
+CLASSTYPE_ERRONEOUS (type) = true;
+
   /* We don't instantiate default arguments for member functions.  14.7.1:
 
  The implicit instantiation of a class template specialization causes
diff --git a/gcc/testsuite/g++.dg/template/access2.C 
b/gcc/testsuite/g++.dg/template/access2.C
index 0620c10f79d..4a80bb4d280 100644
--- a/gcc/testsuite/g++.dg/template/access2.C
+++ b/gcc/testsuite/g++.dg/template/access2.C
@@ -5,6 +5,9 @@
 
 template  struct A {
   typename T::X x; // { dg-error "this context" }
+};
+
+template  struct A2 {
   int f() { return T::i; } // { dg-error "this context" }
 };
 
@@ -16,5 +19,6 @@ class B {
 int main()
 {
   A ab; // { dg-message "required" }
-  ab.f();  // { dg-message "required" }
+  A2 a2b;
+  a2b.f(); // { dg-message "required" }
 }
diff --git a/libstdc++-v3/testsuite/30_threads/promise/requirements/lwg3466.cc 
b/libstdc++-v3/testsuite/30_threads/promise/requirements/lwg3466.cc
index bb04bf0737c..acef47f12f3 100644
--- a/libstdc++-v3/testsuite/30_threads/promise/requirements/lwg3466.cc
+++ b/libstdc++-v3/testsuite/30_threads/promise/requirements/lwg3466.cc
@@ -27,17 +27,13 @@ std::promise good2;
 
 st

[PATCH] PR fortran/101327 - ICE in find_array_element, at fortran/expr.c:1355

2021-08-30 Thread Harald Anlauf via Gcc-patches

There was an issue when trying to use an element from an array constructor
which was a broken in a way probably only Gerhard could conceive.
We hit an assert that can be replaced by more robust code.

Patch is basically Steve's.

Regtested on x86_64-pc-linux-gnu.  OK for mainline?

Thanks,
Harald


Fortran - improve error recovery determining array element from constructor

gcc/fortran/ChangeLog:

PR fortran/101327
* expr.c (find_array_element): When bounds cannot be determined as
constant, return error instead of aborting.

gcc/testsuite/ChangeLog:

PR fortran/101327
* gfortran.dg/pr101327.f90: New test.

diff --git a/gcc/fortran/expr.c b/gcc/fortran/expr.c
index 35563a78697..dfecc3012e1 100644
--- a/gcc/fortran/expr.c
+++ b/gcc/fortran/expr.c
@@ -1337,7 +1337,9 @@ find_array_element (gfc_constructor_base base, gfc_array_ref *ar,
   for (i = 0; i < ar->dimen; i++)
 {
   if (!gfc_reduce_init_expr (ar->as->lower[i])
-	  || !gfc_reduce_init_expr (ar->as->upper[i]))
+	  || !gfc_reduce_init_expr (ar->as->upper[i])
+	  || ar->as->upper[i]->expr_type != EXPR_CONSTANT
+	  || ar->as->lower[i]->expr_type != EXPR_CONSTANT)
 	{
 	  t = false;
 	  cons = NULL;
@@ -1351,9 +1353,6 @@ find_array_element (gfc_constructor_base base, gfc_array_ref *ar,
 	  goto depart;
 	}

-  gcc_assert (ar->as->upper[i]->expr_type == EXPR_CONSTANT
-		  && ar->as->lower[i]->expr_type == EXPR_CONSTANT);
-
   /* Check the bounds.  */
   if ((ar->as->upper[i]
 	   && mpz_cmp (e->value.integer,
diff --git a/gcc/testsuite/gfortran.dg/pr101327.f90 b/gcc/testsuite/gfortran.dg/pr101327.f90
new file mode 100644
index 000..f4377aabe7a
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/pr101327.f90
@@ -0,0 +1,11 @@
+! { dg-do compile }
+! PR fortran/101327 - ICE in find_array_element, at fortran/expr.c:1355
+
+subroutine s
+  integer, parameter :: n([2]) = [1, 2] ! { dg-error "must be scalar" }
+  type t
+ integer :: a(n(1):n(2))
+  end type
+end
+
+! { dg-error "cannot be automatic or of deferred shape" " " { target *-*-* } 5 }

[committed] analyzer: support "bifurcation"; reimplement realloc [PR99260]

2021-08-30 Thread David Malcolm via Gcc-patches

Most of the state-management code in the analyzer involves
modifying state objects in-place, which implies a single outcome.
(I originally implemented in-place modification because I wanted
to avoid having to create copies of state objects, and it's now
very difficult to change this aspect of the analyzer's design)

However, there are various special-cases such as "realloc" for which
it's best to split the state into multiple outcomes.

This patch adds a mechanism for "bifurcating" the analysis in places
where there isn't a split in the CFG, and uses it to implement realloc,
in this case treating it as having 3 possible outcomes:
- failure, returning NULL
- success, growing the buffer in-place without moving it
- success, allocating a new buffer, copying the content of the old
  buffer to it, and freeing the old buffer.

Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu.
Pushed to trunk as r12-3237-geafa9d969237fd8f712c4b25a8c58932c01f44b4.

gcc/ChangeLog:
PR analyzer/99260
* Makefile.in (ANALYZER_OBJS): Add analyzer/call-info.o.

gcc/analyzer/ChangeLog:
PR analyzer/99260
* analyzer.h (class custom_edge_info): New class, adapted from
exploded_edge::custom_info_t.  Make member functions const.
Make update_model return bool, converting edge param from
reference to a pointer, and adding a ctxt param.
(class path_context): New class.
* call-info.cc: New file.
* call-info.h: New file.
* engine.cc: Include "analyzer/call-info.h" and .
(impl_region_model_context::impl_region_model_context): Update for
new m_path_ctxt field.
(impl_region_model_context::bifurcate): New.
(impl_region_model_context::terminate_path): New.
(impl_region_model_context::get_malloc_map): New.
(impl_sm_context::impl_sm_context): Update for new m_path_ctxt
field.
(impl_sm_context::get_fndecl_for_call): Likewise.
(impl_sm_context::set_next_state): Likewise.
(impl_sm_context::warn): Likewise.
(impl_sm_context::is_zero_assignment): Likewise.
(impl_sm_context::get_path_context): New.
(impl_sm_context::m_path_ctxt): New.
(impl_region_model_context::on_condition): Update for new
path_ctxt param.  Handle m_enode_for_diag being NULL.
(impl_region_model_context::on_phi): Update for new path_ctxt
param.
(exploded_node::on_stmt): Add path_ctxt param, updating ctor calls
to use it as necessary.  Use it to bail out after sm-handling,
if needed.
(exploded_node::detect_leaks): Update for new path_ctxt param.
(dynamic_call_info_t::update_model): Update for conversion of
exploded_edge::custom_info_t to custom_edge_info.
(dynamic_call_info_t::add_events_to_path): Likewise.
(rewind_info_t::update_model): Likewise.
(rewind_info_t::add_events_to_path): Likewise.
(exploded_edge::exploded_edge): Likewise.
(exploded_graph::add_edge): Likewise.
(exploded_graph::maybe_process_run_of_before_supernode_enodes):
Update for new path_ctxt param.
(class impl_path_context): New.
(exploded_graph::process_node): Update for new path_ctxt param.
Create an impl_path_context and pass it to exploded_node::on_stmt.
Use it to terminate iterating stmts if terminate_path is called
on it.  After processing a run of stmts, query path_ctxt to
potentially terminate the analysis path, and/or to "bifurcate" the
analysis into multiple additional paths.
(feasibility_state::maybe_update_for_edge): Update for new
update_model ctxt param.
* exploded-graph.h
(impl_region_model_context::impl_region_model_context): Add
path_ctxt param.
(impl_region_model_context::bifurcate): New.
(impl_region_model_context::terminate_path): New
(impl_region_model_context::get_ext_state): New.
(impl_region_model_context::get_malloc_map): New.
(impl_region_model_context::m_path_ctxt): New field.
(exploded_node::on_stmt): Add path_ctxt param.
(class exploded_edge::custom_info_t): Move to analyzer.h, renaming
to custom_edge_info, and making the changes as noted in analyzer.h
above.
(exploded_edge::exploded_edge): Update for these changes to
exploded_edge::custom_info_t.
(exploded_edge::m_custom_info): Likewise.
(class dynamic_call_info_t): Likewise.
(class rewind_info_t): Likewise.
(exploded_graph::add_edge): Likewise.
* program-state.cc (program_state::on_edge): Update for new
path_ctxt param.
(program_state::push_call): Likewise.
(program_state::returning_call): Likewise.
(program_state::prune_for_point): Likewise.
* region-model-impl-calls.cc: Include "analyzer/call-info.h".
(call_details::get_fndecl_for_call): New.

Re: [pushed] c++: preserve location through constexpr

2021-08-30 Thread Marek Polacek via Gcc-patches

On Mon, Aug 30, 2021 at 05:25:01PM -0400, Jason Merrill via Gcc-patches wrote:
> While working on the patch for PR101460, I noticed that we were losing the
> expression location when folding class prvalue expressions.  The final patch
> doesn't fold class prvalues, but this still seems a worthwhile change.  I
> don't add location wrappers for scalar prvalues because many callers are
> trying to fold them away.
> 
> Tested x86_64-pc-linux-gnu, applying to trunk.
> 
> gcc/cp/ChangeLog:
> 
>   * constexpr.c (cxx_eval_outermost_constant_expr): Copy
>   expr location to result.
> ---
>  gcc/cp/constexpr.c | 5 +
>  1 file changed, 5 insertions(+)
> 
> diff --git a/gcc/cp/constexpr.c b/gcc/cp/constexpr.c
> index 9606719bc73..e78fdf021b2 100644
> --- a/gcc/cp/constexpr.c
> +++ b/gcc/cp/constexpr.c
> @@ -7445,6 +7445,11 @@ cxx_eval_outermost_constant_expr (tree t, bool 
> allow_non_constant,
>   }
>  }
>  
> +  /* Remember the original location if that wouldn't need a wrapper.  */
> +  if (location_t loc = EXPR_LOCATION (t))
> +if (CAN_HAVE_LOCATION_P (r))
> +  SET_EXPR_LOCATION (r, loc);

Can we use protected_set_expr_location or did you not want that?

Marek

Re: [PATCH][pushed] Add -fprofile-reproducible=parallel-runs to STAGEfeedback_CFLAGS to Makefile.tpl.

2021-08-30 Thread Gleb Fotengauer-Malinovskiy

On Mon, Aug 30, 2021 at 10:13:21AM +0200, Martin Liška wrote:
> On 8/27/21 19:45, Gleb Fotengauer-Malinovskiy wrote:
> > Hi,
> > 
> > On Thu, Mar 11, 2021 at 04:19:51PM +0100, Martin Liška wrote:
> >> Pushed as obvious, the original change was done
> >> in g:e05a117dc4b98f3ac60851608f532ba7cee7343a.
> >>
> >> Martin
> >>
> >> ChangeLog:
> >>
> >>* Makefile.tpl: The change was done Makefile.in which
> >>is generated file.
> >> ---
> >>Makefile.tpl | 2 +-
> >>1 file changed, 1 insertion(+), 1 deletion(-)
> >>
> >> diff --git a/Makefile.tpl b/Makefile.tpl
> >> index 3b88f351d5b..6e0337fb48f 100644
> >> --- a/Makefile.tpl
> >> +++ b/Makefile.tpl
> >> @@ -488,7 +488,7 @@ STAGEprofile_TFLAGS = $(STAGE2_TFLAGS)
> >>STAGEtrain_CFLAGS = $(filter-out -fchecking=1,$(STAGE3_CFLAGS))
> >>STAGEtrain_TFLAGS = $(filter-out -fchecking=1,$(STAGE3_TFLAGS))
> >>
> >> -STAGEfeedback_CFLAGS = $(STAGE4_CFLAGS) -fprofile-use
> >> +STAGEfeedback_CFLAGS = $(STAGE4_CFLAGS) -fprofile-use 
> >> -fprofile-reproducible=parallel-runs
> > 
> > Did you mean to add -fprofile-reproducible flag to STAGEprofile_CFLAGS
> > instead?
> 
> No.
> 
> > I suppose this flag doesn't mean anything without
> > -fprofile-generate, which is passed through STAGEprofile_CFLAGS.
> 
> The flag -fprofile-reproducible is used when we use a collected profile, so 
> it works
> *only* with -fprofile-use flag.

Thank you for the answer, now I see how it works.

I think that it means that the documentation is rather misleading about
this flag:

 '-fprofile-reproducible=[multithreaded|parallel-runs|serial]'
  Control level of reproducibility of profile gathered by
  '-fprofile-generate'. [...]

It convinced me that this flag changes the way the code is instrumented
for profile generation.  The rest of it convinced me even more, actually.

> Thanks for heads up!
> Cheers,
> Martin
> 
> > 
> > Thanks,
> > 

-- 
glebfm

[PATCH] jit : Generate debug info for variables

2021-08-30 Thread Petter Tomner via Gcc-patches

Hi,

This is a patch to generate debug info for local variables as well as globals. 
With this, "ptype foo", "info variables", "info locals" etc works when 
debugging in GDB.

Finalizing of global variable declares are moved to after locations are handled 
and done
as Fortran, C, Go etc do it. Also, primitive types have their TYPE_NAME set for 
debug info
on types to work.

Below are the patch, and I attached a testcase. Since it requires GDB to run it 
might
not be suitable? Make check-jit runs fine on Debian x64.

Regards,


>From d77e77104024c7ae9ce31b419dad1f0a5801fda7 Mon Sep 17 00:00:00 2001
From: Petter Tomner 
Date: Mon, 30 Aug 2021 01:44:07 +0200
Subject: [PATCH 1/2] libgccjit: Generate debug info for variables

Finalize declares via available helpers after location is set. Set
TYPE_NAME of primitives and friends to "int" etc. Debug info is now
set properly for variables.

2021-08-31  Petter Tomner   

gcc/jit
jit-playback.c
jit-playback.h
gcc/testsuite/jit.dg/
test-error-array-bounds.c: Array is not unsigned
---
 gcc/jit/jit-playback.c| 76 ++-
 gcc/jit/jit-playback.h|  5 ++
 .../jit.dg/test-error-array-bounds.c  |  2 +-
 3 files changed, 65 insertions(+), 18 deletions(-)

diff --git a/gcc/jit/jit-playback.c b/gcc/jit/jit-playback.c
index 79ac525e5df..74321b0946a 100644
--- a/gcc/jit/jit-playback.c
+++ b/gcc/jit/jit-playback.c
@@ -165,7 +165,8 @@ gt_ggc_mx ()
 
 /* Given an enum gcc_jit_types value, get a "tree" type.  */
 
-static tree
+tree
+playback::context::
 get_tree_node_for_type (enum gcc_jit_types type_)
 {
   switch (type_)
@@ -192,11 +193,7 @@ get_tree_node_for_type (enum gcc_jit_types type_)
   return short_unsigned_type_node;
 
 case GCC_JIT_TYPE_CONST_CHAR_PTR:
-  {
-   tree const_char = build_qualified_type (char_type_node,
-   TYPE_QUAL_CONST);
-   return build_pointer_type (const_char);
-  }
+  return m_const_char_ptr;
 
 case GCC_JIT_TYPE_INT:
   return integer_type_node;
@@ -579,10 +576,6 @@ playback::lvalue *
 playback::context::
 global_finalize_lvalue (tree inner)
 {
-  varpool_node::get_create (inner);
-
-  varpool_node::finalize_decl (inner);
-
   m_globals.safe_push (inner);
 
   return new lvalue (this, inner);
@@ -2952,9 +2945,7 @@ replay ()
 {
   JIT_LOG_SCOPE (get_logger ());
 
-  m_const_char_ptr
-= build_pointer_type (build_qualified_type (char_type_node,
-   TYPE_QUAL_CONST));
+  init_types ();
 
   /* Replay the recorded events:  */
   timevar_push (TV_JIT_REPLAY);
@@ -2984,15 +2975,22 @@ replay ()
 {
   int i;
   function *func;
-
+  tree global;
   /* No GC can happen yet; process the cached source locations.  */
   handle_locations ();
 
+  /* Finalize globals. See how FORTRAN 95 does it in gfc_be_parse_file()
+ for a simple reference. */
+  FOR_EACH_VEC_ELT (m_globals, i, global)
+rest_of_decl_compilation (global, true, true);
+
+  wrapup_global_declarations (m_globals.address(), m_globals.length());
+
   /* We've now created tree nodes for the stmts in the various blocks
-in each function, but we haven't built each function's single stmt
-list yet.  Do so now.  */
+  in each function, but we haven't built each function's single 
stmt
+  list yet.  Do so now.  */
   FOR_EACH_VEC_ELT (m_functions, i, func)
-   func->build_stmt_list ();
+ func->build_stmt_list ();
 
   /* No GC can have happened yet.  */
 
@@ -3081,6 +3079,50 @@ location_comparator (const void *lhs, const void *rhs)
   return loc_lhs->get_column_num () - loc_rhs->get_column_num ();
 }
 
+/* Initialize the NAME_TYPE of the primitive types as well as some
+   others. */
+void
+playback::context::
+init_types ()
+{
+  /* See lto_init() in lto-lang.c or void visit (TypeBasic *t) in D's types.cc 
+ for reference. If TYPE_NAME is not set, debug info will not contain types 
*/
+#define NAME_TYPE(t,n) \
+if (t) \
+  TYPE_NAME (t) = build_decl (UNKNOWN_LOCATION, TYPE_DECL, \
+  get_identifier (n), t)
+
+  NAME_TYPE (integer_type_node, "int");
+  NAME_TYPE (char_type_node, "char");
+  NAME_TYPE (long_integer_type_node, "long int");
+  NAME_TYPE (unsigned_type_node, "unsigned int");
+  NAME_TYPE (long_unsigned_type_node, "long unsigned int");
+  NAME_TYPE (long_long_integer_type_node, "long long int");
+  NAME_TYPE (long_long_unsigned_type_node, "long long unsigned int");
+  NAME_TYPE (short_integer_type_node, "short int");
+  NAME_TYPE (short_unsigned_type_node, "short unsigned int");
+  if (signed_char_type_node != char_type_node)
+NAME_TYPE (signed_char_type_node, "signed char");
+  if (unsigned_char_type_node != char_type_node)
+NAME_TYPE (unsigned_char_type_node, "unsigned char");
+  NAME_TYPE (float_type_node, "float");
+  NAME_

Sv: [PATCH] jit : Generate debug info for variables

2021-08-30 Thread Petter Tomner via Gcc-patches

Well I seemed to have attached the wrong testcase. Here is the proper one 
attached.

Regards,

-Ursprungligt meddelande-
Från: Petter Tomner 
Skickat: den 31 augusti 2021 02:14
Till: gcc-patches@gcc.gnu.org; j...@gcc.gnu.org
Ämne: [PATCH] jit : Generate debug info for variables

Hi,

This is a patch to generate debug info for local variables as well as globals. 
With this, "ptype foo", "info variables", "info locals" etc works when 
debugging in GDB.

Finalizing of global variable declares are moved to after locations are handled 
and done
as Fortran, C, Go etc do it. Also, primitive types have their TYPE_NAME set for 
debug info
on types to work.

Below are the patch, and I attached a testcase. Since it requires GDB to run it 
might
not be suitable? Make check-jit runs fine on Debian x64.

Regards,


0002-libgccjit-Test-cases-for-debug-info.patch
Description: 0002-libgccjit-Test-cases-for-debug-info.patch

[PATCH] c++: check arity before deduction w/ explicit targs [PR12672]

2021-08-30 Thread Patrick Palka via Gcc-patches

During overload resolution, when the arity of a function template
clearly disagrees with the arity of the call, no specialization of the
function template could yield a viable candidate.  The deduction routine
type_unification_real already notices this situation, but not before
it substitutes explicit template arguments into the template, a step
which could induce a hard error.  Although it's necessary to perform
this substitution first in order to check arity perfectly (since the
substitution can e.g. expand a non-trailing parameter pack), in most
cases we can determine ahead of time whether there's an arity
disagreement without needing to perform deduction.

To that end, this patch implements an (approximate) arity check in
add_template_candidate_real that guards actual deduction.  It's enabled
only when there are explicit template arguments since that's when
deduction can force otherwise avoidable template instantiations.  (I
experimented with enabling it unconditionally as an optimization and
saw some compile-time improvements of about 5% but also some regressions
by about the same magnitude, so kept it conditional.)

In passing, this adds a least_p parameter to arity_rejection for sake
of consistent diagnostics with unify_arity.

A couple of testcases needed to be adjusted so that deduction continues
to occur as intended after this change.  Except in unify6.C, where we
were expecting foo to be ill-formed due to substitution
yielding a function type with an added 'const', but I think this is
permitted by [dcl.fct]/7, so I changed the test accordingly.

Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK for
trunk?

PR c++/12672

gcc/cp/ChangeLog:

* call.c (rejection_reason::call_varargs_p): Rename this
previously unused member to ...
(rejection_reason::least_p): ... this.
(arity_rejection): Add least_p parameter.
(add_template_candidate_real): When there are explicit
template arguments, check that the arity of the call agrees with
the arity of the function before attempting deduction.
(print_arity_information): Add least_p parameter.
(print_z_candidate): Adjust call to print_arity_information.

gcc/testsuite/ChangeLog:

* g++.dg/cpp0x/decltype29.C: Adjust.
* g++.dg/template/error56.C: Adjust.
* g++.old-deja/g++.pt/unify6.C: Adjust.
* g++.dg/template/explicit-args6.C: New test.
---
 gcc/cp/call.c | 67 ---
 gcc/testsuite/g++.dg/cpp0x/decltype29.C   |  4 +-
 gcc/testsuite/g++.dg/template/error56.C   |  4 +-
 .../g++.dg/template/explicit-args6.C  | 33 +
 gcc/testsuite/g++.old-deja/g++.pt/unify6.C|  4 +-
 5 files changed, 96 insertions(+), 16 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/template/explicit-args6.C

diff --git a/gcc/cp/call.c b/gcc/cp/call.c
index e4df72ec1a3..80e6121ce44 100644
--- a/gcc/cp/call.c
+++ b/gcc/cp/call.c
@@ -455,8 +455,8 @@ struct rejection_reason {
   int expected;
   /* The actual number of arguments in the call.  */
   int actual;
-  /* Whether the call was a varargs call.  */
-  bool call_varargs_p;
+  /* Whether EXPECTED should be treated as a lower bound.  */
+  bool least_p;
 } arity;
 /* Information about an argument conversion mismatch.  */
 struct conversion_info conversion;
@@ -628,12 +628,13 @@ alloc_rejection (enum rejection_reason_code code)
 }
 
 static struct rejection_reason *
-arity_rejection (tree first_arg, int expected, int actual)
+arity_rejection (tree first_arg, int expected, int actual, bool least_p = 
false)
 {
   struct rejection_reason *r = alloc_rejection (rr_arity);
   int adjust = first_arg != NULL_TREE;
   r->u.arity.expected = expected - adjust;
   r->u.arity.actual = actual - adjust;
+  r->u.arity.least_p = least_p;
   return r;
 }
 
@@ -3452,6 +3453,44 @@ add_template_candidate_real (struct z_candidate 
**candidates, tree tmpl,
 }
   gcc_assert (ia == nargs_without_in_chrg);
 
+  if (!obj && explicit_targs)
+{
+  /* Check that there's no obvious arity mismatch before proceeding with
+deduction.  This avoids substituting explicit template arguments
+into the template (which could result in an error outside the
+immediate context) when the resulting candidate would be unviable
+anyway.  */
+  int min_arity = 0, max_arity = 0;
+  tree parms = TYPE_ARG_TYPES (TREE_TYPE (tmpl));
+  parms = skip_artificial_parms_for (tmpl, parms);
+  for (; parms != void_list_node; parms = TREE_CHAIN (parms))
+   {
+ if (!parms || PACK_EXPANSION_P (TREE_VALUE (parms)))
+   {
+ max_arity = -1;
+ break;
+   }
+ if (TREE_PURPOSE (parms))
+   /* A parameter with a default argument.  */
+   ++max_arity;
+ else
+   ++min_arity, ++max_arity;
+   }
+  if (ia < (unsigne

[PATCH] c++: shortcut bad convs during overload resolution [PR101904]

2021-08-30 Thread Patrick Palka via Gcc-patches

In the context of overload resolution we have the notion of a "bad"
argument conversion, which is a conversion that "would be a permitted
with a bending of the language standards", and we handle such bad
conversions specially.  In particular, we rank a bad conversion as
better than no conversion but worse than a good conversion, and a bad
conversion doesn't necessarily make a candidate unviable.  With the
flag -fpermissive, we permit the situation where overload resolution
selects a candidate that contains a bad conversion (which we call a
non-strictly viable candidate).  And without the flag we issue a
distinct permerror in this situation instead.

One consequence of this defacto behavior is that in order to distinguish
a non-strictly viable candidate from an unviable candidate, if we
encounter a bad argument conversion during overload resolution we must
keep converting subsequent arguments because a subsequent conversion
could render the candidate unviable instead of just non-strictly viable.
But checking subsequent arguments can force template instantiations and
result in otherwise avoidable hard errors.  And in particular, all
'this' conversions are at worst bad, so this means the const/ref-qualifiers
of a member function can't be used to prune a candidate quickly, which
is the subject of the mentioned PR.

This patch tries to improve the situation without changing the defacto
output of add_candidates.  Specifically, when considering a candidate
during overload resolution this patch makes us shortcut argument
conversion checking upon encountering the first bad conversion
(tentatively marking the candidate as non-strictly viable, though it
could ultimately be unviable) under the assumption that we'll eventually
find a strictly viable candidate anyway (rendering the distinction
between non-strictly viable and unviable moot, since both are worse
than a strictly viable candidate).  If this assumption turns out to be
false, we'll fully reconsider the candidate under the defacto behavior
(without the shortcutting).

So in the best case (there's a strictly viable candidate), we avoid
some argument conversions and/or template argument deduction that may
cause a hard error.  In the worst case (there's no such candidate), we
have to redundantly consider some candidates twice.  (In a previous
version of the patch, to avoid this redundant checking I created a new
"deferred" conversion type that represents a conversion that is yet to
be performed, and instead of reconsidering a candidate I just realized
its deferred conversions.  But it doesn't seem this redundancy is a
significant performance issue to justify the added complexity of this
other approach.)

Lots of care was taken to preserve the defacto behavior w.r.t.
non-strictly viable candidates, but I wonder how important this behavior
is nowadays?  Can the notion of a non-strictly viable candidate be done
away with, or is it here to stay?

Bootstrapped and regtested on x86_64-pc-linux-gnu.

PR c++/101904

gcc/cp/ChangeLog:

* call.c (build_this_conversion): New function, split out from
add_function_candidate.
(add_function_candidate): New parameter shortcut_bad_convs.
Document it.  Use build_this_conversion.  Stop at the first bad
argument conversion when shortcut_bad_convs is true.
(add_template_candidate_real): New parameter shortcut_bad_convs.
Use build_this_conversion to check the 'this' conversion before
attempting deduction.  When the rejection reason code is
rr_bad_arg_conversion, pass -1 instead of 0 as the viable
parameter to add_candidate.  Pass 'convs' to add_candidate.
(add_template_candidate): New parameter shortcut_bad_convs.
(add_template_conv_candidate): Pass false as shortcut_bad_convs
to add_template_candidate_real.
(add_candidates): Prefer to shortcut bad conversions during
overload resolution under the assumption that we'll eventually
see a strictly viable candidate.  If this assumption turns out
to be false, re-process the non-strictly viable candidates
without shortcutting those bad conversions.

gcc/testsuite/ChangeLog:

* g++.dg/template/conv17.C: New test.
---
 gcc/cp/call.c  | 250 +
 gcc/testsuite/g++.dg/template/conv17.C |  56 ++
 2 files changed, 233 insertions(+), 73 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/template/conv17.C

diff --git a/gcc/cp/call.c b/gcc/cp/call.c
index e4df72ec1a3..2a0bf933a89 100644
--- a/gcc/cp/call.c
+++ b/gcc/cp/call.c
@@ -175,17 +175,17 @@ static struct z_candidate *splice_viable (struct 
z_candidate *, bool, bool *);
 static bool any_strictly_viable (struct z_candidate *);
 static struct z_candidate *add_template_candidate
(struct z_candidate **, tree, tree, tree, tree, const vec 
*,
-tree, tree, tree, int, unification_kind_t, tsubst_flags_t);
+tree, tree, tre

[PATCH] c++: parameter pack inside constexpr if [PR101764]

2021-08-30 Thread Patrick Palka via Gcc-patches

Here when partially substituting into the pack expansion, substitution
into the constexpr if yields a still-dependent tree, so tsubst_expr
returns an IF_STMT with an unsubstituted IF_COND and with
IF_STMT_EXTRA_ARGS added to.  Hence after partial substitution
the pack expansion pattern still refers to the parameter pack 'ts' of
level 2 (and it's thus represented in the new PACK_EXPANSION_PARAMETER_PACKS)
even though the partially instantiated generic lambda admits only one
level of template arguments.  This causes us to crash during the
subsequent instantiation with the lambda's template arguments because of
the level mismatch.  (Likewise when the constexpr if is replaced by a
requires-expr, which too uses the extra args mechanism for delaying
partial instantiation.)

So essentially, a pack expansion pattern that contains a parameter pack
inside an "extra args" tree doesn't play well with partial substitution
thereof.  This patch fixes this by forcing such pack expansions to use
the extra args mechanism as well.

Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK for
trunk?

PR c++/101764

gcc/cp/ChangeLog:

* cp-tree.h (PACK_EXPANSION_FORCE_EXTRA_ARGS_P): New accessor
macro.
* pt.c (uses_extra_args_mechanism_p): New function.
(find_parameter_pack_data::found_pack_within_extra_args_tree_p):
New data member.
(find_parameter_pack_data::inside_extra_args_tree_p): Likewise.
(find_parameter_packs_r): Detect parameter packs within "extra
args" trees and set found_pack_within_extra_args_tree_p
appropriately.
(make_pack_expansion): Set PACK_EXPANSION_FORCE_EXTRA_ARGS_P if
found_pack_within_extra_args_tree_p.
(use_pack_expansion_extra_args_p): Return true if there were
unsubstituted packs and PACK_EXPANSION_FORCE_EXTRA_ARGS_P.
(tsubst_pack_expansion): Pass the pack expansion to
use_pack_expansion_extra_args_p.

gcc/testsuite/ChangeLog:

* g++.dg/cpp1z/constexpr-if35.C: New test.
---
 gcc/cp/cp-tree.h|  5 ++
 gcc/cp/pt.c | 69 -
 gcc/testsuite/g++.dg/cpp1z/constexpr-if35.C | 18 ++
 3 files changed, 90 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/cpp1z/constexpr-if35.C

diff --git a/gcc/cp/cp-tree.h b/gcc/cp/cp-tree.h
index ce7ca53a113..06dec495428 100644
--- a/gcc/cp/cp-tree.h
+++ b/gcc/cp/cp-tree.h
@@ -493,6 +493,7 @@ extern GTY(()) tree cp_global_trees[CPTI_MAX];
   CONSTRUCTOR_C99_COMPOUND_LITERAL (in CONSTRUCTOR)
   OVL_NESTED_P (in OVERLOAD)
   DECL_MODULE_EXPORT_P (in _DECL)
+  PACK_EXPANSION_FORCE_EXTRA_ARGS_P (in *_PACK_EXPANSION)
4: IDENTIFIER_MARKED (IDENTIFIER_NODEs)
   TREE_HAS_CONSTRUCTOR (in INDIRECT_REF, SAVE_EXPR, CONSTRUCTOR,
  CALL_EXPR, or FIELD_DECL).
@@ -3902,6 +3903,10 @@ struct GTY(()) lang_decl {
 /* True iff this pack expansion is for auto... in lambda init-capture.  */
 #define PACK_EXPANSION_AUTO_P(NODE) TREE_LANG_FLAG_2 (NODE)
 
+/* True if we must use PACK_EXPANSION_EXTRA_ARGS to avoid partial
+   substitution into this pack expansion.  */
+#define PACK_EXPANSION_FORCE_EXTRA_ARGS_P(NODE) TREE_LANG_FLAG_3 (NODE)
+
 /* True iff the wildcard can match a template parameter pack.  */
 #define WILDCARD_PACK_P(NODE) TREE_LANG_FLAG_0 (NODE)
 
diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c
index fcf3ac31b25..a92dff88d9d 100644
--- a/gcc/cp/pt.c
+++ b/gcc/cp/pt.c
@@ -3855,6 +3855,20 @@ expand_builtin_pack_call (tree call, tree args, 
tsubst_flags_t complain,
   return NULL_TREE;
 }
 
+/* Return true if the tree T uses the extra args mechanism for
+   deferring partial substitution into it.  */
+
+static bool
+uses_extra_args_mechanism_p (tree t)
+{
+  return (PACK_EXPANSION_P (t)
+ || TREE_CODE (t) == REQUIRES_EXPR
+ || (TREE_CODE (t) == IF_STMT
+ && IF_STMT_CONSTEXPR_P (t)));
+}
+
+static tree find_parameter_packs_r (tree *, int *, void*);
+
 /* Structure used to track the progress of find_parameter_packs_r.  */
 struct find_parameter_pack_data
 {
@@ -3867,6 +3881,16 @@ struct find_parameter_pack_data
 
   /* True iff we're making a type pack expansion.  */
   bool type_pack_expansion_p;
+
+  /* True iff we found a pack inside a subtree that uses the extra
+ args mechanism.  */
+  bool found_pack_within_extra_args_tree_p = false;
+
+private:
+  /* True iff find_parameter_packs_r is currently visiting a tree
+ that uses the extra args mechanism or a subtree thereof.  */
+  bool inside_extra_args_tree_p = false;
+  friend tree find_parameter_packs_r (tree *, int *, void*);
 };
 
 /* Identifies all of the argument packs that occur in a template
@@ -3968,6 +3992,37 @@ find_parameter_packs_r (tree *tp, int *walk_subtrees, 
void* data)
   *ppd->parameter_packs = tree_cons (NULL_TREE, t, *ppd->parameter_packs);
 }
 
+  if (!ppd->inside_extra_args_tree_p
+  && uses_extra_args

[PATCH] Fix PR driver/79181 (and others), not deleting some /tmp/cc* files for LTO.

2021-08-30 Thread apinski--- via Gcc-patches

From: Andrew Pinski 

So the main issue here is that some signals are not setup unlike collect2.
So this merges the setting up of the signal handlers to one function in
collect-utils and has collect2 and lto-wrapper call that function.

OK? Bootstrapped and tested on x86_64-linux-gnu with no regressions.

gcc/ChangeLog:

PR driver/79181
* collect-utils.c (setup_signals): New declaration.
* collect-utils.h (setup_signals): New function.
* collect2.c (handler): Delete.
(main): Instead of manually setting up the signals,
just call setup_signals.
* lto-wrapper.c (main): Likewise.
---
 gcc/collect-utils.c | 37 +
 gcc/collect-utils.h |  1 +
 gcc/collect2.c  | 36 +---
 gcc/lto-wrapper.c   | 18 +-
 4 files changed, 40 insertions(+), 52 deletions(-)

diff --git a/gcc/collect-utils.c b/gcc/collect-utils.c
index 6b5d61d5162..19423d31885 100644
--- a/gcc/collect-utils.c
+++ b/gcc/collect-utils.c
@@ -57,6 +57,43 @@ fatal_signal (int signum)
  so its normal effect occurs.  */
   kill (getpid (), signum);
 }
+
+/* Setup the signal handlers for the utils. */
+void
+setup_signals (void)
+{
+#ifdef SIGQUIT
+  if (signal (SIGQUIT, SIG_IGN) != SIG_IGN)
+signal (SIGQUIT, fatal_signal);
+#endif
+  if (signal (SIGINT, SIG_IGN) != SIG_IGN)
+signal (SIGINT, fatal_signal);
+#ifdef SIGALRM
+  if (signal (SIGALRM, SIG_IGN) != SIG_IGN)
+signal (SIGALRM, fatal_signal);
+#endif
+#ifdef SIGHUP
+  if (signal (SIGHUP, SIG_IGN) != SIG_IGN)
+signal (SIGHUP, fatal_signal);
+#endif
+  if (signal (SIGSEGV, SIG_IGN) != SIG_IGN)
+signal (SIGSEGV, fatal_signal);
+  if (signal (SIGTERM, SIG_IGN) != SIG_IGN)
+signal (SIGTERM, fatal_signal);
+#ifdef SIGPIPE
+  if (signal (SIGPIPE, SIG_IGN) != SIG_IGN)
+signal (SIGPIPE, fatal_signal);
+#endif
+#ifdef SIGBUS
+  if (signal (SIGBUS, SIG_IGN) != SIG_IGN)
+signal (SIGBUS, fatal_signal);
+#endif
+#ifdef SIGCHLD
+  /* We *MUST* set SIGCHLD to SIG_DFL so that the wait4() call will
+ receive the signal.  A different setting is inheritable */
+  signal (SIGCHLD, SIG_DFL);
+#endif
+}
 
 /* Wait for a process to finish, and exit if a nonzero status is found.  */
 
diff --git a/gcc/collect-utils.h b/gcc/collect-utils.h
index 4f0e3ce9832..15f831d778a 100644
--- a/gcc/collect-utils.h
+++ b/gcc/collect-utils.h
@@ -24,6 +24,7 @@ along with GCC; see the file COPYING3.  If not see
 extern void notice (const char *, ...)
   __attribute__ ((format (printf, 1, 2)));
 extern void fatal_signal (int);
+extern void setup_signals (void);
 
 extern struct pex_obj *collect_execute (const char *, char **,
const char *, const char *,
diff --git a/gcc/collect2.c b/gcc/collect2.c
index 07092c2733a..cf04a58ba4d 100644
--- a/gcc/collect2.c
+++ b/gcc/collect2.c
@@ -301,7 +301,6 @@ const char tool_name[] = "collect2";
 
 static symkind is_ctor_dtor (const char *);
 
-static void handler (int);
 static void maybe_unlink_list (char **);
 static void add_to_list (struct head *, const char *);
 static int extract_init_priority (const char *);
@@ -408,14 +407,6 @@ collect_atexit (void)
   tool_cleanup (false);
 }
 
-static void
-handler (int signo)
-{
-  tool_cleanup (true);
-
-  signal (signo, SIG_DFL);
-  raise (signo);
-}
 /* Notify user of a non-error, without translating the format string.  */
 void
 notice_translated (const char *cmsgid, ...)
@@ -907,11 +898,7 @@ main (int argc, char **argv)
   COLLECT2_HOST_INITIALIZATION;
 #endif
 
-#ifdef SIGCHLD
-  /* We *MUST* set SIGCHLD to SIG_DFL so that the wait4() call will
- receive the signal.  A different setting is inheritable */
-  signal (SIGCHLD, SIG_DFL);
-#endif
+  setup_signals ();
 
   /* Unlock the stdio streams.  */
   unlock_std_streams ();
@@ -1051,27 +1038,6 @@ main (int argc, char **argv)
   if (argc < 2)
 fatal_error (input_location, "no arguments");
 
-#ifdef SIGQUIT
-  if (signal (SIGQUIT, SIG_IGN) != SIG_IGN)
-signal (SIGQUIT, handler);
-#endif
-  if (signal (SIGINT, SIG_IGN) != SIG_IGN)
-signal (SIGINT, handler);
-#ifdef SIGALRM
-  if (signal (SIGALRM, SIG_IGN) != SIG_IGN)
-signal (SIGALRM, handler);
-#endif
-#ifdef SIGHUP
-  if (signal (SIGHUP, SIG_IGN) != SIG_IGN)
-signal (SIGHUP, handler);
-#endif
-  if (signal (SIGSEGV, SIG_IGN) != SIG_IGN)
-signal (SIGSEGV, handler);
-#ifdef SIGBUS
-  if (signal (SIGBUS, SIG_IGN) != SIG_IGN)
-signal (SIGBUS, handler);
-#endif
-
   /* Extract COMPILER_PATH and PATH into our prefix list.  */
   prefix_from_env ("COMPILER_PATH", &cpath);
   prefix_from_env ("PATH", &path);
diff --git a/gcc/lto-wrapper.c b/gcc/lto-wrapper.c
index aae48aff100..903c258a03a 100644
--- a/gcc/lto-wrapper.c
+++ b/gcc/lto-wrapper.c
@@ -2125,23 +2125,7 @@ main (int argc, char *argv[])
   if (atexit (lto_wrapper_cleanup) != 0)
 fatal_error (input_location, "% failed");
 
-  if (signal (SIGINT, SIG_IGN) != S

Re: [PATCH] libffi: Fix MIPS r6 support

2021-08-30 Thread YunQiang Su

Jeff Law via Gcc-patches  于2021年8月30日周一 下午9:48写道：
>
>
>
> On 8/30/2021 2:47 AM, YunQiang Su wrote:
> > 在 2021/8/30 5:00, Jeff Law 写道:
> >>
> >>
> >> On 8/28/2021 1:23 AM, Xi Ruoyao wrote:
> >>> On Fri, 2021-08-27 at 15:28 -0600, Jeff Law via Gcc-patches wrote:
> 
>  On 8/26/2021 10:58 PM, YunQiang Su wrote:
> > for some instructions, MIPS r6 uses different encoding other than
> > the previous releases.
> >
> > 1. mips/n32.S disable .set mips4: since it casuses old insn encoding
> >  is used.
> >  https://github.com/libffi/libffi/pull/396
> > 2. mips/ffi.c: the encoding for JR is hardcoded: we need to use
> >  different value for r6 and pre-r6.
> >  https://github.com/libffi/libffi/pull/401
> >
> > libffi/
> >  PR libffi/83636
> >  * src/mips/n32.S: disable .set mips4
> >  * src/mips/ffi.c: use different JR encoding for r6.
>  These should go to the upstream libffi project.  Once accepted there
>  you
>  can add them to GCC.
> >>> Hi Jeff,
> >>>
> >>> The two PRs are already merged, and released since libffi-3.3.0 (now
> >>> the
> >>> upstream latest release is 3.4.2).
> >> ACK.  Thanks for confirming.  It's always OK to cherrypick/backport
> >> from libffi back to GCC.
> >>
> >>>
> >>> I don't have a MIPSr6 so I can't test though.
> >> Understood.   Me neither, but I really should get a tiny chroot for
> >> mipsr6 so that my tester can validate it regularly.
> >>
> >
> > We have port Debian to MIPS64r6el.
> > http://58.246.137.130:20180/tools/tarball/
> > You can use both buster(Debian 10) or bullseye (Debian 11).
> >
> > And both qemu-system and qemu-user can work.
> Understood.  I need the minimal chroot to compress down under 100M to

A minimal tarball is ready now with size about 87M.

> make gitlab happy (that's where my scripts & jenkins system expect to
> find them).  My scripts to build the chroots also ensure a consistent
> set of packages & version #s across the different systems being
> tested.   I've just never gotten around to building one for mips64r6.  I
> could probably take yours trim out unnecessary stuff and use it for
> now.  Getting a bootstrap test on the target weekly is definitely helpful.
>

That's great.
Since the minimal rootfs lacks of lots of tools, and
if you need some tools, you can just apt install them,then.

> jeff

Re: [PATCH] Set bound/cmp/control for until wrap loop.

2021-08-30 Thread guojiufu via Gcc-patches


On 2021-08-30 20:02, Richard Biener wrote:

On Mon, 30 Aug 2021, guojiufu wrote:


On 2021-08-30 14:15, Jiufu Guo wrote:
> Hi,
>
> In patch r12-3136, niter->control, niter->bound and niter->cmp are
> derived from number_of_iterations_lt.  While for 'until wrap condition',
> the calculation in number_of_iterations_lt is not align the requirements
> on the define of them and requirements in determine_exit_conditions.
>
> This patch calculate niter->control, niter->bound and niter->cmp in
> number_of_iterations_until_wrap.
>
> The ICEs in the PR are pass with this patch.
> Bootstrap and reg-tests pass on ppc64/ppc64le and x86.
> Is this ok for trunk?
>
> BR.
> Jiufu Guo
>
Add ChangeLog:
gcc/ChangeLog:

2021-08-30  Jiufu Guo  

PR tree-optimization/102087
* tree-ssa-loop-niter.c (number_of_iterations_until_wrap):
Set bound/cmp/control for niter.

gcc/testsuite/ChangeLog:

2021-08-30  Jiufu Guo  

PR tree-optimization/102087
* gcc.dg/vect/pr101145_3.c: Update tests.
* gcc.dg/pr102087.c: New test.

> ---
>  gcc/tree-ssa-loop-niter.c  | 14 +-
>  gcc/testsuite/gcc.dg/pr102087.c| 25 +
>  gcc/testsuite/gcc.dg/vect/pr101145_3.c |  4 +++-
>  3 files changed, 41 insertions(+), 2 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.dg/pr102087.c
>
> diff --git a/gcc/tree-ssa-loop-niter.c b/gcc/tree-ssa-loop-niter.c
> index 7af92d1c893..747f04d3ce0 100644
> --- a/gcc/tree-ssa-loop-niter.c
> +++ b/gcc/tree-ssa-loop-niter.c
> @@ -1482,7 +1482,7 @@ number_of_iterations_until_wrap (class loop *,
> tree type, affine_iv *iv0,
> affine_iv *iv1, class tree_niter_desc *niter)
>  {
>tree niter_type = unsigned_type_for (type);
> -  tree step, num, assumptions, may_be_zero;
> +  tree step, num, assumptions, may_be_zero, span;
>wide_int high, low, max, min;
>
>may_be_zero = fold_build2 (LE_EXPR, boolean_type_node, iv1->base,
> iv0->base);
> @@ -1513,6 +1513,8 @@ number_of_iterations_until_wrap (class loop *,
> tree type, affine_iv *iv0,
>   low = wi::to_wide (iv0->base);
> else
>low = min;
> +
> +  niter->control = *iv1;
>  }
>/* {base, -C} < n.  */
>else if (tree_int_cst_sign_bit (iv0->step) && integer_zerop
> (iv1->step))
> @@ -1533,6 +1535,8 @@ number_of_iterations_until_wrap (class loop *,
> tree type, affine_iv *iv0,
>   high = wi::to_wide (iv1->base);
> else
>high = max;
> +
> +  niter->control = *iv0;
>  }
>else
>  return false;


it looks like the above two should already be in effect from the
caller (guarding with integer_nozerop)?


I add them just because set these fields in one function.
Yes, they have been set in caller already,  I could remove them here.




> @@ -1556,6 +1560,14 @@ number_of_iterations_until_wrap (class loop *,
> tree type, affine_iv *iv0,
>niter->assumptions, assumptions);
>
>niter->control.no_overflow = false;
> +  niter->control.base = fold_build2 (MINUS_EXPR, niter_type,
> +   niter->control.base,
> niter->control.step);


how do we know IVn - STEP doesn't already wrap?


The last IV value is just cross the max/min value of the type
at the last iteration,  then IVn - STEP is the nearest value
to max(or min) and not wrap.


A comment might be
good to explain you're turning the simplified exit condition into

   { IVbase - STEP, +, STEP } != niter * STEP + (IVbase - STEP)

which, when mathematically looking at it makes me wonder why there's
the seemingly redundant '- STEP' term?  Also is NE_EXPR really
correct since STEP might be not 1?  Only for non equality compares
the '- STEP' should matter?


I need to add comments for this.  This is a little tricky.
The last value of the original IV just cross max/min at most one STEP,
at there wrapping already happen.
Using "{IVbase, +, STEP} != niter * STEP + IVbase" is not wrong
in the aspect of exit condition.

But this would not work well with existing code:
like determine_exit_conditions, which will convert NE_EXP to
LT_EXPR/GT_EXPR.  And so, the '- STEP' is added to adjust the
IV.base and bound, with '- STEP' the bound will be the last value
just before wrap.

Thanks again for your review!

BR.
Jiufu



Richard.


> +  span = fold_build2 (MULT_EXPR, niter_type, niter->niter,
> +fold_convert (niter_type, niter->control.step));
> +  niter->bound = fold_build2 (PLUS_EXPR, niter_type, span,
> +fold_convert (niter_type, niter->control.base));
> +  niter->bound = fold_convert (type, niter->bound);
> +  niter->cmp = NE_EXPR;
>
>return true;
> }
> diff --git a/gcc/testsuite/gcc.dg/pr102087.c
> b/gcc/testsuite/gcc.dg/pr102087.c
> new file mode 100644
> index 000..ef1f9f5cba9
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/pr102087.c
> @@ -0,0 +1,25 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O3" } */
> +
> +unsigned __attribute__ ((noinline))
> +foo (int *__restrict__

Re: [PATCH 19/34] rs6000: Handle overloads during program parsing

2021-08-30 Thread Bill Schmidt via Gcc-patches


Hi Segher,

On 8/27/21 6:07 PM, Segher Boessenkool wrote:

Hi!

On Thu, Jul 29, 2021 at 08:31:06AM -0500, Bill Schmidt wrote:

Although this patch looks quite large, the changes are fairly minimal.
Most of it is duplicating the large function that does the overload
resolution using the automatically generated data structures instead of
the old hand-generated ones.  This doesn't make the patch terribly easy to
review, unfortunately.

Yeah, and it pretty important that it does the same as the old code did.


Just be aware that generally we aren't changing
the logic and functionality of overload handling.

Good :-)


* config/rs6000/rs6000-c.c (rs6000-builtins.h): New include.
(altivec_resolve_new_overloaded_builtin): New forward decl.
(rs6000_new_builtin_type_compatible): New function.
(altivec_resolve_overloaded_builtin): Call
altivec_resolve_new_overloaded_builtin.
(altivec_build_new_resolved_builtin): New function.
(altivec_resolve_new_overloaded_builtin): Likewise.
* config/rs6000/rs6000-call.c (rs6000_new_builtin_is_supported_p):
Likewise.

No "_p" please (it already has a verb in the name, and explicit ones are
much clearer anyway).


OK.  This requires a small change to the gen program to match.


Does everything else belong in the C frontend file but this last
function not?  Maybe this could be split up better.  Maybe there should
be a separate file for just the builtin support, it probably is big
enough?

(This is all for later of course, but please think about it.  Code
rearrangement (or even refactoring) can be done at any time, there is
no time pressure on it).


Yes, it's a fair point to figure out whether this should be refactored 
better.  This particular function is used both by the overloading 
support and in other places, so it really belongs elsewhere, I think.  
In general, we might consider simplifying the target hooks that land in 
the C frontend file and move the guts into a separate builtin support 
file.  Certainly rs6000-call.c is still too large, and factoring out the 
builtins parts would be an improvement.  I'll put this on the list for 
follow-up after the main series lands.



+static tree
+altivec_resolve_new_overloaded_builtin (location_t, tree, void *);

This fits on one line, please do so (this is a declaration, not a
function definition; those are put with the name in the first column,
to make searching for them a lot easier).


+static bool
+rs6000_new_builtin_type_compatible (tree t, tree u)
+{
+  if (t == error_mark_node)
+return false;
+
+  if (INTEGRAL_TYPE_P (t) && INTEGRAL_TYPE_P (u))
+return true;
+
+  if (TARGET_IEEEQUAD && TARGET_LONG_DOUBLE_128
+  && is_float128_p (t) && is_float128_p (u))

Indent is wrong here?


Yeah, how'd that happen...



+  if (POINTER_TYPE_P (t) && POINTER_TYPE_P (u))
+{
+  t = TREE_TYPE (t);
+  u = TREE_TYPE (u);
+  if (TYPE_READONLY (u))
+   t = build_qualified_type (t, TYPE_QUAL_CONST);
+}

Just use TYPE_MAIN_VARIANT, don't make garbage here?


I don't understand how TYPE_MAIN_VARIANT is appropriate.  Please explain?



+  /* The AltiVec overloading implementation is overall gross, but this
+ is particularly disgusting.  The vec_{all,any}_{ge,le} builtins
+ are completely different for floating-point vs. integer vector
+ types, because the former has vcmpgefp, but the latter should use
+ vcmpgtXX.

Yes.  The integer comparisons were reduced to just two in original VMX
(eq and gt, well signed and unsigned versions of the latter), but that
cannot be done for floating point (all of ab can be false at
the same time (for a or b NaN), so this cannot be compressed to just two
functions, we really need three at least).

We have the same thing in xscmp* btw, it's not just VMX.  Having three
ops for the majority of comparisons (and doing the rest with two or so)
is nicer than having 14 :-)

There aren't builtins for most of that, thankfully.


+  if (TARGET_DEBUG_BUILTIN)
+fprintf (stderr, "altivec_resolve_overloaded_builtin, code = %4d, %s\n",
+(int)fcode, IDENTIFIER_POINTER (DECL_NAME (fndecl)));

(space after cast, also in debug code)


+ const char *name
+   = fcode == RS6000_OVLD_VEC_ADDE ? "vec_adde": "vec_sube";

(space before and after colon)


Yeah, there is a lot of bad formatting in the copied code.  I probably 
should have tried to clean it all up, but generally just did where I was 
making specific changes.  I'll see what I can do about these things.



+ const char *name = fcode == RS6000_OVLD_VEC_ADDEC ?
+   "vec_addec": "vec_subec";

(ditto.  also, ? cannot end a line.  maybe just
  const char *name;
  name = fcode == RS6000_OVLD_VEC_ADDEC ? "vec_addec" : "vec_subec";
)


+  const char *name
+   = fcode == RS6000_OVLD_VEC_SPLATS ? "vec_splats": "vec_promote";

(more)


+ case E_SFmode: type = V4SF_type_node; size = 4; break;

[PATCH] Fix gcc.dg/ipa/inline-8.c for -fPIC

2021-08-30 Thread apinski--- via Gcc-patches

From: Andrew Pinski 

The problem here is with -fPIC, both cmp and move
don't bind locally so they are not even tried to be
inlined.  This fixes the issue by marking both
functions as static and now the testcase passes
for both -fPIC and -fno-PIC cases.

OK? Tested on x86_64-linux-gnu.

gcc/testsuite/ChangeLog:

* gcc.dg/ipa/inline-8.c: Mark cmp and move as
static so they both bind local and available for
inlinine.
---
 gcc/testsuite/gcc.dg/ipa/inline-8.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/gcc/testsuite/gcc.dg/ipa/inline-8.c 
b/gcc/testsuite/gcc.dg/ipa/inline-8.c
index 388283ca213..c51eec20fc8 100644
--- a/gcc/testsuite/gcc.dg/ipa/inline-8.c
+++ b/gcc/testsuite/gcc.dg/ipa/inline-8.c
@@ -6,13 +6,13 @@
 #include 
 extern int isnanf (float);
 /* Can't be inlined because isnanf will be optimized out.  */
-int
+static int
 cmp (float a)
 {
   return isnanf (a);
 }
 /* Can be inlined.  */
-int
+static int
 move (int a)
 {
   return a;
-- 
2.17.1

Re: [PING #2][PATCH] enable ranger and caching in pass_waccess

2021-08-30 Thread Richard Biener via Gcc-patches

On Mon, Aug 30, 2021 at 4:27 PM Martin Sebor  wrote:
>
> Ping: https://gcc.gnu.org/pipermail/gcc-patches/2021-August/578135.html
>
> Are there any further comments on the patch?
>
> Richard, you were kind enough to review the first two patches in this
> series.  Would you mind doing the same for this one?  It continues in
> the same vein.

OK.

Richard.

> Martin
>
> On 8/25/21 3:26 PM, Martin Sebor wrote:
> > On 8/25/21 10:14 AM, Andrew MacLeod wrote:
> >> On 8/25/21 11:20 AM, Martin Sebor wrote:
> >>> Ping: Andrew, did I answer your questions?  Do you (or anyone else)
> >>> have any other comments on the latest patch below?
> >>>
> >>> https://gcc.gnu.org/pipermail/gcc-patches/2021-August/577865.html
> >>
> >>
> >> I wasn't attempting to block it, its outside my review purview..
> >>
> >> I was merely commenting that you should not need a pointer to a
> >> range_query at all anymore
> >>
> >>
> >>>
> 
> >
> > Are you planning to transition to using the get_range_query()
> > interface instead of keeping a range_query pointer in the
> > pointer_query class?
> 
>  This pass and to a smaller extent the pointer_query class that's
>  used by it and the strlen pass are still a work in progress.
>  I also still need to convert the strlen pass to use Ranger and
>  I expect it will take some changes to pointer_query.  So at that
>  point, if going through get_range_query (cfun) everywhere is what
>  you recommend, I'm happy to do it.
> 
> >> absolutely. you should not need to even know whether you have a ranger
> >> instance running or not. get_range_query will give you whichever is
> >> active, and there is ALWAYS something active.. defaulting to the
> >> global version.
> >>
> >> the code in get_range() seems to be the end of the call chain which
> >> uses the pointer and should be consolidated to something much simpler
> >>
> >>   if (rvals && stmt)
> >>  {
> >>if (!rvals->range_of_expr (vr, val, stmt))
> >>  return NULL_TREE;
> >>
> >>  <...>
> >>
> >>// ?? This entire function should use get_range_query or
> >> get_global_range_query (),
> >>// instead of doing something different for RVALS and global
> >> ranges.
> >>
> >>if (!get_global_range_query ()->range_of_expr (vr, val) ||
> >> vr.undefined_p ())
> >>  return NULL_TREE;
> >>
> >>
> >> This entire section can boil down to something like
> >>
> >> if (!get_range_query (cfun)->range_of_expr (vr, val, stmt))
> >>return NULL;
> >
> > So get_range_query (cfun) will never be null?  And no special handling
> > of INTEGER_CST is needed, or checking for SSA_NAME.  Nice.  Attached
> > is another revision of the same patch with this function simplified.
> >
> > (FYI: I expect the whole function to eventually go away, and to make
> > other similar simplifications, but I'm not there yet.)
> >
>  PS There has been an effort to get rid of global variables from GCC,
>  or, as the first step, to avoid accessing them directly(*).  If and
>  when that happens, it seems like each pass will have to store either
>  the ranger instance as a member (directly or indirectly, via a member
>  of a class that stores it) or the function passed to pass::execute()
>  if it wants to access either.
> 
>  [*] https://gcc.gnu.org/pipermail/gcc-patches/2021-June/573975.html
>  The patch at the link above wasn't approved but IIUC removing globals
>  from GCC is still a goal.
> >>>
> >> I have no idea what direction that is going, but the net effect will
> >> be the same in the end.  You'll be calling get_range_query() with
> >> either the function pointer, or the pass pointer, or something.. but
> >> you should never need to pass around  a pointer to either a ranger or
> >> range-query. As I said earlier, this will likely be a pass property
> >> and accessed thru the pass eventually... but until then, use what we
> >> have.. cfun.  It'll be trivial to transition down the road to whatever
> >> the ultimate solution is.
> >
> > Okay, I'll do that in my future changes.
> >
> > Martin
>

Re: [PATCH] gdb: Add a dependency between gdb and libbacktrace

2021-08-30 Thread Richard Biener via Gcc-patches

On Mon, Aug 30, 2021 at 10:22 PM Andrew Burgess
 wrote:
>
> I plan to make use of libbacktrace within GDB.  I believe that the
> patch below needs to be merged into GCCs toplevel directory and then
> back-ported to the binutils-gdb repository.
>
> Is this OK to merge?

OK.

> Thanks,
> Andrew
>
> ---
>
> GDB is going to start using libbacktrace, so add a build dependency
> between the two projects.  This change needs to be added into the GCC
> toplevel files, and then back-ported to the binutils-gdb repository.
>
> ChangeLog:
>
> * Makefile.def: Add all-gdb dependency on all-libbacktrace.
> * Makefile.in: Regenerate.
> ---
>  ChangeLog| 5 +
>  Makefile.def | 1 +
>  Makefile.in  | 1 +
>  3 files changed, 7 insertions(+)
>
> diff --git a/Makefile.def b/Makefile.def
> index fbfdb6fee08..de3e0052106 100644
> --- a/Makefile.def
> +++ b/Makefile.def
> @@ -423,6 +423,7 @@ dependencies = { module=all-gdb; on=all-sim; };
>  dependencies = { module=all-gdb; on=all-libdecnumber; };
>  dependencies = { module=all-gdb; on=all-libtermcap; };
>  dependencies = { module=all-gdb; on=all-libctf; };
> +dependencies = { module=all-gdb; on=all-libbacktrace; };
>
>  // Host modules specific to gdbserver.
>  dependencies = { module=configure-gdbserver; on=all-gnulib; };
> diff --git a/Makefile.in b/Makefile.in
> index 5c85fcc5dd0..61af99dc75a 100644
> --- a/Makefile.in
> +++ b/Makefile.in
> @@ -61237,6 +61237,7 @@ all-gdb: maybe-all-libiconv
>  all-gdb: maybe-all-opcodes
>  all-gdb: maybe-all-libdecnumber
>  all-gdb: maybe-all-libctf
> +all-gdb: maybe-all-libbacktrace
>  all-gdbserver: maybe-all-libiberty
>  configure-gdbsupport: maybe-configure-intl
>  all-gdbsupport: maybe-all-intl
> --
> 2.25.4
>

Re: [PATCH] Fix PR 90142: contrib/download_prerequisites uses test ==

2021-08-30 Thread Richard Biener via Gcc-patches

On Mon, Aug 30, 2021 at 10:49 PM apinski--- via Gcc-patches
 wrote:
>
> From: Andrew Pinski 
>
> Since == is not portable, it is better to use = in contrib/
> download_prerequisites.  The only place == was used is inside
> the function md5_check which is used only on Mac OS X.
>
> Tested on Mac OS X as:
> ./contrib/download_prerequisites --md5
> Both with all files having the correct checksum and one with a broken one.

Ok.

> contrib/ChangeLog:
>
> * download_prerequisites (md5_check): Replace == inside
> test with = to be more portable.
> ---
>  contrib/download_prerequisites | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/contrib/download_prerequisites b/contrib/download_prerequisites
> index 8f69b61f5a9..11c283ecb1a 100755
> --- a/contrib/download_prerequisites
> +++ b/contrib/download_prerequisites
> @@ -113,7 +113,7 @@ md5_check() {
>md5_checksum_output=$(md5 -r "${file_to_check}")
># Grab the text before the first space
>md5_checksum_detected="${md5_checksum_output%% *}"
> -  [ "${md5_checksum_expected}" == "${md5_checksum_detected}" ] \
> +  [ "${md5_checksum_expected}" = "${md5_checksum_detected}" ] \
>  || die "Cannot verify integrity of possibly corrupted file 
> ${file_to_check}"
>echo "${file_to_check}: OK"
>  }
> --
> 2.17.1
>

Re: *PING**2 – Re: [Patch] Fortran: Fix Bind(C) char-len check, add ptr-contiguous check

2021-08-30 Thread Paul Richard Thomas via Gcc-patches

Hi Tobias,

s/However, as argument they are also iteroperable/However, as an argument
they are also interoperable/

s/ /* else: valid only sind F2018 - and an assumed-shape/rank
array; however, gfc_notify_std is already called when
those array type are used. Thus, silently accept F200x. */ /
  /* else: valid only since F2018 - and an assumed-shape/rank
array; however, gfc_notify_std is already called when
those array types are used. Thus, silently accept F200x. */

Apart from those nits, it looks good to me. It even regtests OK :-)

Thanks for sorting out the standard-ese. OK for mainline and, I would
suggest 11-branch.

Cheers

Paul


On Sun, 29 Aug 2021 at 08:35, Tobias Burnus  wrote:

> PING**2
>
> On 25.08.21 20:58, Tobias Burnus wrote:
> > Early *PING*.
> > (I also should still review several Fortan patches... There are lots of
> > patches waiting for review :-/)
> >
> > On 20.08.21 19:24, Tobias Burnus wrote:
> >> The following is about interoperability (BIND(C)) only.
> >>
> >>
> >> * The patch adds a missing check for pointer + contiguous.
> >> (Rejected to avoid copy-in issues? Or checking issues?)
> >>
> >>
> >> * And it corrects an issue regarding len > 1 characters. While
> >>
> >>  subroutine foo(x)
> >> character(len=2) :: x(*)
> >>
> >> is valid Fortran code (the argument can be "abce" or ['a','b','c','d']
> >> or ...)
> >> – and would work also with bind(C) as the len=2 does not need to be
> >> passed
> >> as hidden argument as len is a constant.
> >> However, it is not valid nonetheless.
> >>
> >>
> >> OK? Comments?
> >>
> >> Tobias
> >>
> >>
> >> PS: Referenced locations in the standard (F2018):
> >>
> >> C1554 If proc-language-binding-spec is specified for a procedure,
> >> each of its dummy arguments shall be an interoperable procedure (18.3.6)
> >> or a variable that is interoperable (18.3.4, 18.3.5), assumed-shape,
> >> assumed-rank, assumed-type, of type CHARACTER with assumed length,
> >> or that has the ALLOCATABLE or POINTER attribute.
> >>
> >> 18.3.1: "... If the type is character, the length type parameter is
> >> interoperable if and only if its value is one. ..."
> >>
> >> "18.3.4 Interoperability of scalar variables":
> >> "... A named scalar Fortran variable is interoperable ... if it
> >> is of type character12its length is not assumed or declared by
> >> an expression that is not a constant expression."
> >>
> >> 18.3.5: Likewise but for arrays.
> >>
> >> 18.3.6 "... Fortran procedure interface is interoperable with a C
> >> function prototype ..."
> >> "(5) any dummy argument without the VALUE attribute corresponds
> >>  to a formal parameter of the prototype that is of a pointer type,
> >> and either
> >>  • the dummy argument is interoperable with an entity of the
> >> referenced type ..."
> >> (Remark: those are passed as byte stream)
> >>  "• the dummy argument is a nonallocatable nonpointer variable of
> >> type
> >> CHARACTER with assumed character length and the formal
> >> parameter is
> >> a pointer to CFI_cdesc_t,
> >>   • the dummy argument is allocatable, assumed-shape,
> >> assumed-rank, or
> >> a pointer without the CONTIGUOUS attribute, and the formal
> >> parameter
> >> is a pointer to CFI_cdesc_t, or
> >> (Remark: those two use an array descriptor, also for
> >> explicit-size/assumed-size
> >> arrays or for scalars.)
> >>   • the dummy argument is assumed-type ..."
> >>
> > -
> > Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße
> > 201, 80634 München; Gesellschaft mit beschränkter Haftung;
> > Geschäftsführer: Thomas Heurung, Frank Thürauf; Sitz der Gesellschaft:
> > München; Registergericht München, HRB 106955
>


-- 
"If you can't explain it simply, you don't understand it well enough" -
Albert Einstein

Re: [PATCH] Make sure we're playing with integral modes before call extract_integral_bit_field.

2021-08-30 Thread Richard Biener via Gcc-patches

On Fri, Aug 27, 2021 at 6:50 AM Hongtao Liu  wrote:
>
> On Thu, Aug 26, 2021 at 7:09 PM Richard Biener via Gcc-patches
>  wrote:
> >
> > On Thu, Aug 26, 2021 at 12:50 PM Richard Sandiford
> >  wrote:
> > >
> > > Richard Biener via Gcc-patches  writes:
> > > > On Thu, Aug 26, 2021 at 11:06 AM Richard Sandiford
> > > >  wrote:
> > > >>
> > > >> Richard Biener via Gcc-patches  writes:
> > > >> > One thought I had is whether we can "fix" validate_subreg to have 
> > > >> > less
> > > >> > "weird" allowed float-int
> > > >> > special cases.  As said upthread I think that we either should allow
> > > >> > all of those, implying that
> > > >> > subregs work semantically as if there's subregs to same-sized integer
> > > >> > modes inbetween or
> > > >> > disallow them all and make sure we're actually doing that 
> > > >> > explicitely.
> > > >> >
> > > >> > For example
> > > >> >
> > > >> >   /* ??? Similarly, e.g. with (subreg:DF (reg:TI)).  Though 
> > > >> > store_bit_field
> > > >> >  is the culprit here, and not the backends.  */
> > > >> >   else if (known_ge (osize, regsize) && known_ge (isize, osize))
> > > >> > ;
> > > >> >
> > > >> > I can't decipther rtl.text as to what the semantics of such a subreg 
> > > >> > is
> > > >> > given the docs hand-wave about WORDS_BIG_ENDIAN vs.
> > > >> > FLOAT_WORDS_BIG_ENDIAN but don't actually say what happens
> > > >> > when you mix those in a subreg.  So maybe the above should
> > > >> > have explicitely have WORDS_BIG_ENDIAN == FLOAT_WORDS_BIG_ENDIAN.
> > > >> >
> > > >> > But then the world would be much simpler if subregs of non-same size
> > > >> > modes have explicit documentation for the mode kinds we have.
> > > >>
> > > >> Yeah.  Although validate_subreg was a good idea, some of the mode 
> > > >> checks
> > > >> are IMO a failed experiment.  The hope was that eventually we'd remove
> > > >> all those special exceptions once the culprit has been fixed.  However,
> > > >> the code is over 16 years old at this point and those changes never
> > > >> happened.
> > > >>
> > > >> Nested subregs aren't a thing (thankfully) and one of the big 
> > > >> disadvantages
> > > >> of the current validate_subreg mode-changing rules is that they aren't
> > > >> transitive.  This can artificially require temporary pseudos for things
> > > >> that could be expressed directly as a single subreg.
> > > >
> > > > And that's what the proposed patch does (add same-mode size integer mode
> > > > punning intermediate subregs).
> > > >
> > > > So if that's not supposed to be necessary then why restrict subregs at 
> > > > all?
> > >
> > > I was trying to say: I'm not sure we should.
> > >
> > > > I mean you seem to imply that the semantics would be clear and 
> > > > well-defined
> > > > (to you - not to me).  The only thing is that of course not all subregs 
> > > > are
> > > > "implemented" by a target (or can be, w/o spilling).
> > >
> > > Yeah.  That's for TARGET_CAN_CHANGE_MODE_CLASS to decide.
> > > But it only comes in to play during RA or when trying to take
> > > the subreg of a particular hard register.  Transitivity doesn't
> > > matter so much for the hard register case since the result of
> > > simplify_gen_subreg should then be another hard register.
> > >
> > > > Which means - we should adjust validate_subreg with another special-case
> > > > or rather generalize the existing ones to an overall set that makes more
> > > > sense?
> > >
> > > Maybe it's too radical, but I would whether we should just get rid of:
> > >
> > >   /* ??? This should not be here.  Temporarily continue to allow word_mode
> > >  subregs of anything.  The most common offender is (subreg:SI 
> > > (reg:DF)).
> > >  Generally, backends are doing something sketchy but it'll take time 
> > > to
> > >  fix them all.  */
> > >   if (omode == word_mode)
> > > ;
> > >   /* ??? Similarly, e.g. with (subreg:DF (reg:TI)).  Though 
> > > store_bit_field
> > >  is the culprit here, and not the backends.  */
> > >   else if (known_ge (osize, regsize) && known_ge (isize, osize))
> > > ;
> > >   /* Allow component subregs of complex and vector.  Though given the 
> > > below
> > >  extraction rules, it's not always clear what that means.  */
> > >   else if ((COMPLEX_MODE_P (imode) || VECTOR_MODE_P (imode))
> > >&& GET_MODE_INNER (imode) == omode)
> > > ;
> > >   /* ??? x86 sse code makes heavy use of *paradoxical* vector subregs,
> > >  i.e. (subreg:V4SF (reg:SF) 0) or (subreg:V4SF (reg:V2SF) 0).  This
> > >  surely isn't the cleanest way to represent this.  It's questionable
> > >  if this ought to be represented at all -- why can't this all be 
> > > hidden
> > >  in post-reload splitters that make arbitrarily mode changes to the
> > >  registers themselves.  */
> > >   else if (VECTOR_MODE_P (omode)
> > >&& GET_MODE_INNER (omode) == GET_MODE_INNER (imode))
> > > ;
> > >   /* Subregs involving floating point modes are not

Re: [PATCH] Fix x86/56337 : 1<<28 alignment is broken

2021-08-30 Thread Richard Biener via Gcc-patches

On Mon, Aug 30, 2021 at 11:38 PM Andrew Pinski via Gcc-patches
 wrote:
>
> On Fri, Jul 23, 2021 at 1:33 PM apinski--- via Gcc-patches
>  wrote:
> >
> > From: Andrew Pinski 
> >
> > The problem here is the x86_64 back-end uses a signed integer
> > for alignment and then divides by BITS_PER_UNIT so if we had
> > INT_MIN (which is what 1<<28*8 is), we would get the wrong result.
> >
> > This fixes the problem by using unsigned for the argument to
> > x86_output_aligned_bss and x86_output_aligned_bss.
> >
> > OK? Bootstrapped and tested on x86_64-linux-gnu.
>
> Ping?

OK.

Thanks,
Richard.

> >
> > gcc/ChangeLog:
> >
> > PR target/56337
> > * config/i386/i386-protos.h (x86_output_aligned_bss):
> > Change align argument to unsigned type.
> > (x86_elf_aligned_decl_common): Likewise.
> > * config/i386/i386.c (x86_elf_aligned_decl_common): Likewise.
> > (x86_output_aligned_bss): Likewise.
> > ---
> >  gcc/config/i386/i386-protos.h | 4 ++--
> >  gcc/config/i386/i386.c| 4 ++--
> >  2 files changed, 4 insertions(+), 4 deletions(-)
> >
> > diff --git a/gcc/config/i386/i386-protos.h b/gcc/config/i386/i386-protos.h
> > index 07ac02aff69..fa1b0d0d787 100644
> > --- a/gcc/config/i386/i386-protos.h
> > +++ b/gcc/config/i386/i386-protos.h
> > @@ -325,9 +325,9 @@ struct ix86_address
> >  extern int ix86_decompose_address (rtx, struct ix86_address *);
> >  extern int memory_address_length (rtx, bool);
> >  extern void x86_output_aligned_bss (FILE *, tree, const char *,
> > -   unsigned HOST_WIDE_INT, int);
> > +   unsigned HOST_WIDE_INT, unsigned);
> >  extern void x86_elf_aligned_decl_common (FILE *, tree, const char *,
> > -unsigned HOST_WIDE_INT, int);
> > +unsigned HOST_WIDE_INT, unsigned);
> >
> >  #ifdef RTX_CODE
> >  extern void ix86_fp_comparison_codes (enum rtx_code code, enum rtx_code *,
> > diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
> > index 876a19f4c1f..f86d11dfb11 100644
> > --- a/gcc/config/i386/i386.c
> > +++ b/gcc/config/i386/i386.c
> > @@ -837,7 +837,7 @@ x86_64_elf_unique_section (tree decl, int reloc)
> >  void
> >  x86_elf_aligned_decl_common (FILE *file, tree decl,
> > const char *name, unsigned HOST_WIDE_INT size,
> > -   int align)
> > +   unsigned align)
> >  {
> >if ((ix86_cmodel == CM_MEDIUM || ix86_cmodel == CM_MEDIUM_PIC)
> >&& size > (unsigned int)ix86_section_threshold)
> > @@ -858,7 +858,7 @@ x86_elf_aligned_decl_common (FILE *file, tree decl,
> >
> >  void
> >  x86_output_aligned_bss (FILE *file, tree decl, const char *name,
> > -   unsigned HOST_WIDE_INT size, int align)
> > +   unsigned HOST_WIDE_INT size, unsigned align)
> >  {
> >if ((ix86_cmodel == CM_MEDIUM || ix86_cmodel == CM_MEDIUM_PIC)
> >&& size > (unsigned int)ix86_section_threshold)
> > --
> > 2.17.1
> >

Re: [PATCH] Fix PR driver/79181 (and others), not deleting some /tmp/cc* files for LTO.

2021-08-30 Thread Richard Biener via Gcc-patches

On Tue, Aug 31, 2021 at 4:17 AM apinski--- via Gcc-patches
 wrote:
>
> From: Andrew Pinski 
>
> So the main issue here is that some signals are not setup unlike collect2.
> So this merges the setting up of the signal handlers to one function in
> collect-utils and has collect2 and lto-wrapper call that function.
>
> OK? Bootstrapped and tested on x86_64-linux-gnu with no regressions.

OK.

> gcc/ChangeLog:
>
> PR driver/79181
> * collect-utils.c (setup_signals): New declaration.
> * collect-utils.h (setup_signals): New function.
> * collect2.c (handler): Delete.
> (main): Instead of manually setting up the signals,
> just call setup_signals.
> * lto-wrapper.c (main): Likewise.
> ---
>  gcc/collect-utils.c | 37 +
>  gcc/collect-utils.h |  1 +
>  gcc/collect2.c  | 36 +---
>  gcc/lto-wrapper.c   | 18 +-
>  4 files changed, 40 insertions(+), 52 deletions(-)
>
> diff --git a/gcc/collect-utils.c b/gcc/collect-utils.c
> index 6b5d61d5162..19423d31885 100644
> --- a/gcc/collect-utils.c
> +++ b/gcc/collect-utils.c
> @@ -57,6 +57,43 @@ fatal_signal (int signum)
>   so its normal effect occurs.  */
>kill (getpid (), signum);
>  }
> +
> +/* Setup the signal handlers for the utils. */
> +void
> +setup_signals (void)
> +{
> +#ifdef SIGQUIT
> +  if (signal (SIGQUIT, SIG_IGN) != SIG_IGN)
> +signal (SIGQUIT, fatal_signal);
> +#endif
> +  if (signal (SIGINT, SIG_IGN) != SIG_IGN)
> +signal (SIGINT, fatal_signal);
> +#ifdef SIGALRM
> +  if (signal (SIGALRM, SIG_IGN) != SIG_IGN)
> +signal (SIGALRM, fatal_signal);
> +#endif
> +#ifdef SIGHUP
> +  if (signal (SIGHUP, SIG_IGN) != SIG_IGN)
> +signal (SIGHUP, fatal_signal);
> +#endif
> +  if (signal (SIGSEGV, SIG_IGN) != SIG_IGN)
> +signal (SIGSEGV, fatal_signal);
> +  if (signal (SIGTERM, SIG_IGN) != SIG_IGN)
> +signal (SIGTERM, fatal_signal);
> +#ifdef SIGPIPE
> +  if (signal (SIGPIPE, SIG_IGN) != SIG_IGN)
> +signal (SIGPIPE, fatal_signal);
> +#endif
> +#ifdef SIGBUS
> +  if (signal (SIGBUS, SIG_IGN) != SIG_IGN)
> +signal (SIGBUS, fatal_signal);
> +#endif
> +#ifdef SIGCHLD
> +  /* We *MUST* set SIGCHLD to SIG_DFL so that the wait4() call will
> + receive the signal.  A different setting is inheritable */
> +  signal (SIGCHLD, SIG_DFL);
> +#endif
> +}
>
>  /* Wait for a process to finish, and exit if a nonzero status is found.  */
>
> diff --git a/gcc/collect-utils.h b/gcc/collect-utils.h
> index 4f0e3ce9832..15f831d778a 100644
> --- a/gcc/collect-utils.h
> +++ b/gcc/collect-utils.h
> @@ -24,6 +24,7 @@ along with GCC; see the file COPYING3.  If not see
>  extern void notice (const char *, ...)
>__attribute__ ((format (printf, 1, 2)));
>  extern void fatal_signal (int);
> +extern void setup_signals (void);
>
>  extern struct pex_obj *collect_execute (const char *, char **,
> const char *, const char *,
> diff --git a/gcc/collect2.c b/gcc/collect2.c
> index 07092c2733a..cf04a58ba4d 100644
> --- a/gcc/collect2.c
> +++ b/gcc/collect2.c
> @@ -301,7 +301,6 @@ const char tool_name[] = "collect2";
>
>  static symkind is_ctor_dtor (const char *);
>
> -static void handler (int);
>  static void maybe_unlink_list (char **);
>  static void add_to_list (struct head *, const char *);
>  static int extract_init_priority (const char *);
> @@ -408,14 +407,6 @@ collect_atexit (void)
>tool_cleanup (false);
>  }
>
> -static void
> -handler (int signo)
> -{
> -  tool_cleanup (true);
> -
> -  signal (signo, SIG_DFL);
> -  raise (signo);
> -}
>  /* Notify user of a non-error, without translating the format string.  */
>  void
>  notice_translated (const char *cmsgid, ...)
> @@ -907,11 +898,7 @@ main (int argc, char **argv)
>COLLECT2_HOST_INITIALIZATION;
>  #endif
>
> -#ifdef SIGCHLD
> -  /* We *MUST* set SIGCHLD to SIG_DFL so that the wait4() call will
> - receive the signal.  A different setting is inheritable */
> -  signal (SIGCHLD, SIG_DFL);
> -#endif
> +  setup_signals ();
>
>/* Unlock the stdio streams.  */
>unlock_std_streams ();
> @@ -1051,27 +1038,6 @@ main (int argc, char **argv)
>if (argc < 2)
>  fatal_error (input_location, "no arguments");
>
> -#ifdef SIGQUIT
> -  if (signal (SIGQUIT, SIG_IGN) != SIG_IGN)
> -signal (SIGQUIT, handler);
> -#endif
> -  if (signal (SIGINT, SIG_IGN) != SIG_IGN)
> -signal (SIGINT, handler);
> -#ifdef SIGALRM
> -  if (signal (SIGALRM, SIG_IGN) != SIG_IGN)
> -signal (SIGALRM, handler);
> -#endif
> -#ifdef SIGHUP
> -  if (signal (SIGHUP, SIG_IGN) != SIG_IGN)
> -signal (SIGHUP, handler);
> -#endif
> -  if (signal (SIGSEGV, SIG_IGN) != SIG_IGN)
> -signal (SIGSEGV, handler);
> -#ifdef SIGBUS
> -  if (signal (SIGBUS, SIG_IGN) != SIG_IGN)
> -signal (SIGBUS, handler);
> -#endif
> -
>/* Extract COMPILER_PATH and PATH into our prefix list.  */
>prefix_from_env ("COMPILER_PATH", &cpath);
>

Re: [PATCH] Fix gcc.dg/ipa/inline-8.c for -fPIC

2021-08-30 Thread Richard Biener via Gcc-patches

On Tue, Aug 31, 2021 at 7:41 AM apinski--- via Gcc-patches
 wrote:
>
> From: Andrew Pinski 
>
> The problem here is with -fPIC, both cmp and move
> don't bind locally so they are not even tried to be
> inlined.  This fixes the issue by marking both
> functions as static and now the testcase passes
> for both -fPIC and -fno-PIC cases.
>
> OK? Tested on x86_64-linux-gnu.

OK

> gcc/testsuite/ChangeLog:
>
> * gcc.dg/ipa/inline-8.c: Mark cmp and move as
> static so they both bind local and available for
> inlinine.
> ---
>  gcc/testsuite/gcc.dg/ipa/inline-8.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/gcc/testsuite/gcc.dg/ipa/inline-8.c 
> b/gcc/testsuite/gcc.dg/ipa/inline-8.c
> index 388283ca213..c51eec20fc8 100644
> --- a/gcc/testsuite/gcc.dg/ipa/inline-8.c
> +++ b/gcc/testsuite/gcc.dg/ipa/inline-8.c
> @@ -6,13 +6,13 @@
>  #include 
>  extern int isnanf (float);
>  /* Can't be inlined because isnanf will be optimized out.  */
> -int
> +static int
>  cmp (float a)
>  {
>return isnanf (a);
>  }
>  /* Can be inlined.  */
> -int
> +static int
>  move (int a)
>  {
>return a;
> --
> 2.17.1
>

[PATCH] PR middle-end/100810: Penalize IV candidates with undefined value bases

2021-08-30 Thread Roger Sayle


Time to hopefully earn some goodwill from the team; this patch fixes
a P1 wrong-code-on-valid regression in ivopts.  Many thanks to Andrew
Pinski for help with the analysis.

Consider the code fragment below:

int i;
for (j=0; j<10; j++)
  i++;

This results in a loop containing two induction variables, i and j,
where j is initialized, but i isn't (typically indicated in tree dumps
by qualified ssa names like i(D).  In PR 100810, the loop optimizers
end up selecting i as the "best" candidate (perhaps because having no
initialization it's cheaper) which leads to problems in later passes
when (the equivalent of) j is considered not to have the value 10 after
the loop, as its definition is now computed from an undefined/uninitialized
value.

The fix below is to add a field to track whether any "iv_group" contains
a use of an iv based on a undefined value, and then prohibit IVs that are
based on undefined values from being candidates for groups that don't use
undefined values.  This may seem lenient, but it allows an IV with an
undefined base to be a candidate for itself, and is a sufficient condition
to avoid the above bug/regression.  A stricter condition might be to only
allow "undefined_value iv"s as candidates for iv_groups where *all*
uses are to "undefined_value ivs"?  My concern was that this might lead
to cases/loops that no longer have suitable candidates (i.e. a possible
performance regression).

Hopefully, the tree-loop optimization experts agree with my analysis/fix.

This patch has been tested on x86_64-pc-linux-gnu with a "make bootstrap"
and "make -k check" with no new failures.  Ok for mainline?

2021-08-31  Roger Sayle  
Andrew Pinski  

gcc/ChangeLog
PR middle-end/100810
* tree-ssa-loop-ivopts.c (struct iv_group): Add a new
uses_undefined_value_p field, indicating this group has
uses of iv's whose base is ssa_undefined_value_p.
(record_use): Update uses_undefined_value_p as required.
(record_group): Initialize uses_undefined_value_p to false.
(determine_group_iv_cost_generic): Consider a candidate with
a ssa_undefined_value_p base to have infinite_cost for a
group where uses_undefined_value_p is false.

gcc/testsuite/ChangeLog
PR middle-end/100810
* gcc.dg/pr100810.c: New test case.

Roger
--

diff --git a/gcc/tree-ssa-loop-ivopts.c b/gcc/tree-ssa-loop-ivopts.c
index 4a498ab..ca8f526 100644
--- a/gcc/tree-ssa-loop-ivopts.c
+++ b/gcc/tree-ssa-loop-ivopts.c
@@ -432,6 +432,8 @@ struct iv_group
   struct iv_cand *selected;
   /* To indicate this is a doloop use group.  */
   bool doloop_p;
+  /* This group uses undefined values.  */
+  bool uses_undefined_value_p;
   /* Uses in the group.  */
   vec vuses;
 };
@@ -1540,6 +1542,12 @@ record_use (struct iv_group *group, tree *use_p, struct 
iv *iv,
   use->addr_offset = addr_offset;
 
   group->vuses.safe_push (use);
+
+  /* Record uses of undefined values.  */
+  if (TREE_CODE (iv->base) == SSA_NAME
+  && ssa_undefined_value_p (iv->base))
+group->uses_undefined_value_p = true;
+
   return use;
 }
 
@@ -1582,6 +1590,7 @@ record_group (struct ivopts_data *data, enum use_type 
type)
   group->related_cands = BITMAP_ALLOC (NULL);
   group->vuses.create (1);
   group->doloop_p = false;
+  group->uses_undefined_value_p = false;
 
   data->vgroups.safe_push (group);
   return group;
@@ -4960,6 +4969,12 @@ determine_group_iv_cost_generic (struct ivopts_data 
*data,
  the candidate.  */
   if (cand->pos == IP_ORIGINAL && cand->incremented_at == use->stmt)
 cost = no_cost;
+  /* Disallow using an iv based on an undefined value as a candidate
+ replacement for a group that uses only defined values.  */
+  else if (!group->uses_undefined_value_p
+  && TREE_CODE (cand->iv->base) == SSA_NAME
+  && ssa_undefined_value_p (cand->iv->base))
+cost = infinite_cost;
   else
 cost = get_computation_cost (data, use, cand, false,
 &inv_vars, NULL, &inv_expr);
/* { dg-do run } */
/* { dg-options "-O2" } */

int a, b = 1, c = 1, e, f = 1, g, h, j;
volatile int d;
static void k() {
  int i;
  h = b;
  if (c && a >= 0) {
while (a) {
  i++;
  h--;
}
if (g)
  for (h = 0; h < 2; h++)
;
if (!b)
  i &&d;
  }
}
static void l() {
  for (; j < 1; j++)
if (!e && c && f)
  k();
}
int main() {
  if (f)
l();
  if (h != 1)
__builtin_abort();
  return 0;
}

Re: [PATCH] Make sure we're playing with integral modes before call extract_integral_bit_field.

2021-08-30 Thread Hongtao Liu via Gcc-patches

On Tue, Aug 31, 2021 at 2:11 PM Richard Biener
 wrote:
>
> On Fri, Aug 27, 2021 at 6:50 AM Hongtao Liu  wrote:
> >
> > On Thu, Aug 26, 2021 at 7:09 PM Richard Biener via Gcc-patches
> >  wrote:
> > >
> > > On Thu, Aug 26, 2021 at 12:50 PM Richard Sandiford
> > >  wrote:
> > > >
> > > > Richard Biener via Gcc-patches  writes:
> > > > > On Thu, Aug 26, 2021 at 11:06 AM Richard Sandiford
> > > > >  wrote:
> > > > >>
> > > > >> Richard Biener via Gcc-patches  writes:
> > > > >> > One thought I had is whether we can "fix" validate_subreg to have 
> > > > >> > less
> > > > >> > "weird" allowed float-int
> > > > >> > special cases.  As said upthread I think that we either should 
> > > > >> > allow
> > > > >> > all of those, implying that
> > > > >> > subregs work semantically as if there's subregs to same-sized 
> > > > >> > integer
> > > > >> > modes inbetween or
> > > > >> > disallow them all and make sure we're actually doing that 
> > > > >> > explicitely.
> > > > >> >
> > > > >> > For example
> > > > >> >
> > > > >> >   /* ??? Similarly, e.g. with (subreg:DF (reg:TI)).  Though 
> > > > >> > store_bit_field
> > > > >> >  is the culprit here, and not the backends.  */
> > > > >> >   else if (known_ge (osize, regsize) && known_ge (isize, osize))
> > > > >> > ;
> > > > >> >
> > > > >> > I can't decipther rtl.text as to what the semantics of such a 
> > > > >> > subreg is
> > > > >> > given the docs hand-wave about WORDS_BIG_ENDIAN vs.
> > > > >> > FLOAT_WORDS_BIG_ENDIAN but don't actually say what happens
> > > > >> > when you mix those in a subreg.  So maybe the above should
> > > > >> > have explicitely have WORDS_BIG_ENDIAN == FLOAT_WORDS_BIG_ENDIAN.
> > > > >> >
> > > > >> > But then the world would be much simpler if subregs of non-same 
> > > > >> > size
> > > > >> > modes have explicit documentation for the mode kinds we have.
> > > > >>
> > > > >> Yeah.  Although validate_subreg was a good idea, some of the mode 
> > > > >> checks
> > > > >> are IMO a failed experiment.  The hope was that eventually we'd 
> > > > >> remove
> > > > >> all those special exceptions once the culprit has been fixed.  
> > > > >> However,
> > > > >> the code is over 16 years old at this point and those changes never
> > > > >> happened.
> > > > >>
> > > > >> Nested subregs aren't a thing (thankfully) and one of the big 
> > > > >> disadvantages
> > > > >> of the current validate_subreg mode-changing rules is that they 
> > > > >> aren't
> > > > >> transitive.  This can artificially require temporary pseudos for 
> > > > >> things
> > > > >> that could be expressed directly as a single subreg.
> > > > >
> > > > > And that's what the proposed patch does (add same-mode size integer 
> > > > > mode
> > > > > punning intermediate subregs).
> > > > >
> > > > > So if that's not supposed to be necessary then why restrict subregs 
> > > > > at all?
> > > >
> > > > I was trying to say: I'm not sure we should.
> > > >
> > > > > I mean you seem to imply that the semantics would be clear and 
> > > > > well-defined
> > > > > (to you - not to me).  The only thing is that of course not all 
> > > > > subregs are
> > > > > "implemented" by a target (or can be, w/o spilling).
> > > >
> > > > Yeah.  That's for TARGET_CAN_CHANGE_MODE_CLASS to decide.
> > > > But it only comes in to play during RA or when trying to take
> > > > the subreg of a particular hard register.  Transitivity doesn't
> > > > matter so much for the hard register case since the result of
> > > > simplify_gen_subreg should then be another hard register.
> > > >
> > > > > Which means - we should adjust validate_subreg with another 
> > > > > special-case
> > > > > or rather generalize the existing ones to an overall set that makes 
> > > > > more
> > > > > sense?
> > > >
> > > > Maybe it's too radical, but I would whether we should just get rid of:
> > > >
> > > >   /* ??? This should not be here.  Temporarily continue to allow 
> > > > word_mode
> > > >  subregs of anything.  The most common offender is (subreg:SI 
> > > > (reg:DF)).
> > > >  Generally, backends are doing something sketchy but it'll take 
> > > > time to
> > > >  fix them all.  */
> > > >   if (omode == word_mode)
> > > > ;
> > > >   /* ??? Similarly, e.g. with (subreg:DF (reg:TI)).  Though 
> > > > store_bit_field
> > > >  is the culprit here, and not the backends.  */
> > > >   else if (known_ge (osize, regsize) && known_ge (isize, osize))
> > > > ;
> > > >   /* Allow component subregs of complex and vector.  Though given the 
> > > > below
> > > >  extraction rules, it's not always clear what that means.  */
> > > >   else if ((COMPLEX_MODE_P (imode) || VECTOR_MODE_P (imode))
> > > >&& GET_MODE_INNER (imode) == omode)
> > > > ;
> > > >   /* ??? x86 sse code makes heavy use of *paradoxical* vector subregs,
> > > >  i.e. (subreg:V4SF (reg:SF) 0) or (subreg:V4SF (reg:V2SF) 0).  This
> > > >  surely isn't the cleanest way to represent this.  It's que

Re: [PATCH] Make sure we're playing with integral modes before call extract_integral_bit_field.

2021-08-30 Thread Hongtao Liu via Gcc-patches

On Tue, Aug 31, 2021 at 2:30 PM Hongtao Liu  wrote:
>
> On Tue, Aug 31, 2021 at 2:11 PM Richard Biener
>  wrote:
> >
> > On Fri, Aug 27, 2021 at 6:50 AM Hongtao Liu  wrote:
> > >
> > > On Thu, Aug 26, 2021 at 7:09 PM Richard Biener via Gcc-patches
> > >  wrote:
> > > >
> > > > On Thu, Aug 26, 2021 at 12:50 PM Richard Sandiford
> > > >  wrote:
> > > > >
> > > > > Richard Biener via Gcc-patches  writes:
> > > > > > On Thu, Aug 26, 2021 at 11:06 AM Richard Sandiford
> > > > > >  wrote:
> > > > > >>
> > > > > >> Richard Biener via Gcc-patches  writes:
> > > > > >> > One thought I had is whether we can "fix" validate_subreg to 
> > > > > >> > have less
> > > > > >> > "weird" allowed float-int
> > > > > >> > special cases.  As said upthread I think that we either should 
> > > > > >> > allow
> > > > > >> > all of those, implying that
> > > > > >> > subregs work semantically as if there's subregs to same-sized 
> > > > > >> > integer
> > > > > >> > modes inbetween or
> > > > > >> > disallow them all and make sure we're actually doing that 
> > > > > >> > explicitely.
> > > > > >> >
> > > > > >> > For example
> > > > > >> >
> > > > > >> >   /* ??? Similarly, e.g. with (subreg:DF (reg:TI)).  Though 
> > > > > >> > store_bit_field
> > > > > >> >  is the culprit here, and not the backends.  */
> > > > > >> >   else if (known_ge (osize, regsize) && known_ge (isize, osize))
> > > > > >> > ;
> > > > > >> >
> > > > > >> > I can't decipther rtl.text as to what the semantics of such a 
> > > > > >> > subreg is
> > > > > >> > given the docs hand-wave about WORDS_BIG_ENDIAN vs.
> > > > > >> > FLOAT_WORDS_BIG_ENDIAN but don't actually say what happens
> > > > > >> > when you mix those in a subreg.  So maybe the above should
> > > > > >> > have explicitely have WORDS_BIG_ENDIAN == FLOAT_WORDS_BIG_ENDIAN.
> > > > > >> >
> > > > > >> > But then the world would be much simpler if subregs of non-same 
> > > > > >> > size
> > > > > >> > modes have explicit documentation for the mode kinds we have.
> > > > > >>
> > > > > >> Yeah.  Although validate_subreg was a good idea, some of the mode 
> > > > > >> checks
> > > > > >> are IMO a failed experiment.  The hope was that eventually we'd 
> > > > > >> remove
> > > > > >> all those special exceptions once the culprit has been fixed.  
> > > > > >> However,
> > > > > >> the code is over 16 years old at this point and those changes never
> > > > > >> happened.
> > > > > >>
> > > > > >> Nested subregs aren't a thing (thankfully) and one of the big 
> > > > > >> disadvantages
> > > > > >> of the current validate_subreg mode-changing rules is that they 
> > > > > >> aren't
> > > > > >> transitive.  This can artificially require temporary pseudos for 
> > > > > >> things
> > > > > >> that could be expressed directly as a single subreg.
> > > > > >
> > > > > > And that's what the proposed patch does (add same-mode size integer 
> > > > > > mode
> > > > > > punning intermediate subregs).
> > > > > >
> > > > > > So if that's not supposed to be necessary then why restrict subregs 
> > > > > > at all?
> > > > >
> > > > > I was trying to say: I'm not sure we should.
> > > > >
> > > > > > I mean you seem to imply that the semantics would be clear and 
> > > > > > well-defined
> > > > > > (to you - not to me).  The only thing is that of course not all 
> > > > > > subregs are
> > > > > > "implemented" by a target (or can be, w/o spilling).
> > > > >
> > > > > Yeah.  That's for TARGET_CAN_CHANGE_MODE_CLASS to decide.
> > > > > But it only comes in to play during RA or when trying to take
> > > > > the subreg of a particular hard register.  Transitivity doesn't
> > > > > matter so much for the hard register case since the result of
> > > > > simplify_gen_subreg should then be another hard register.
> > > > >
> > > > > > Which means - we should adjust validate_subreg with another 
> > > > > > special-case
> > > > > > or rather generalize the existing ones to an overall set that makes 
> > > > > > more
> > > > > > sense?
> > > > >
> > > > > Maybe it's too radical, but I would whether we should just get rid of:
> > > > >
> > > > >   /* ??? This should not be here.  Temporarily continue to allow 
> > > > > word_mode
> > > > >  subregs of anything.  The most common offender is (subreg:SI 
> > > > > (reg:DF)).
> > > > >  Generally, backends are doing something sketchy but it'll take 
> > > > > time to
> > > > >  fix them all.  */
> > > > >   if (omode == word_mode)
> > > > > ;
> > > > >   /* ??? Similarly, e.g. with (subreg:DF (reg:TI)).  Though 
> > > > > store_bit_field
> > > > >  is the culprit here, and not the backends.  */
> > > > >   else if (known_ge (osize, regsize) && known_ge (isize, osize))
> > > > > ;
> > > > >   /* Allow component subregs of complex and vector.  Though given the 
> > > > > below
> > > > >  extraction rules, it's not always clear what that means.  */
> > > > >   else if ((COMPLEX_MODE_P (imode) || VECTOR_MODE_P (imode))
> > > > >&&

Re: [PATCH][pushed] Add -fprofile-reproducible=parallel-runs to STAGEfeedback_CFLAGS to Makefile.tpl.

2021-08-30 Thread Martin Liška


On 8/31/21 01:03, Gleb Fotengauer-Malinovskiy wrote:

On Mon, Aug 30, 2021 at 10:13:21AM +0200, Martin Liška wrote:

On 8/27/21 19:45, Gleb Fotengauer-Malinovskiy wrote:

Hi,

On Thu, Mar 11, 2021 at 04:19:51PM +0100, Martin Liška wrote:

Pushed as obvious, the original change was done
in g:e05a117dc4b98f3ac60851608f532ba7cee7343a.

Martin

ChangeLog:

* Makefile.tpl: The change was done Makefile.in which
is generated file.
---
Makefile.tpl | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/Makefile.tpl b/Makefile.tpl
index 3b88f351d5b..6e0337fb48f 100644
--- a/Makefile.tpl
+++ b/Makefile.tpl
@@ -488,7 +488,7 @@ STAGEprofile_TFLAGS = $(STAGE2_TFLAGS)
STAGEtrain_CFLAGS = $(filter-out -fchecking=1,$(STAGE3_CFLAGS))
STAGEtrain_TFLAGS = $(filter-out -fchecking=1,$(STAGE3_TFLAGS))

-STAGEfeedback_CFLAGS = $(STAGE4_CFLAGS) -fprofile-use

+STAGEfeedback_CFLAGS = $(STAGE4_CFLAGS) -fprofile-use 
-fprofile-reproducible=parallel-runs


Did you mean to add -fprofile-reproducible flag to STAGEprofile_CFLAGS
instead?


No.


I suppose this flag doesn't mean anything without
-fprofile-generate, which is passed through STAGEprofile_CFLAGS.


The flag -fprofile-reproducible is used when we use a collected profile, so it 
works
*only* with -fprofile-use flag.


Thank you for the answer, now I see how it works.


Good.



I think that it means that the documentation is rather misleading about
this flag:

  '-fprofile-reproducible=[multithreaded|parallel-runs|serial]'
   Control level of reproducibility of profile gathered by
   '-fprofile-generate'. [...]


Well, feel free to send a patch that can improve the wording.

Cheers,
Martin



It convinced me that this flag changes the way the code is instrumented
for profile generation.  The rest of it convinced me even more, actually.


Thanks for heads up!
Cheers,
Martin



Thanks,

76 matches

Mail list logo