Re: [RFC v2] RISCV: Combine Pass Clobber Ops

2022-03-11 Thread Kito Cheng via Gcc-patches
Hi Patrick:

There are a few directions I have in mind:

1. Model the C extension right in riscv.md
2. Write peephole2 pattern.
3. Implement a RISC-V specific register renaming pass.

1. Model the C extension right in riscv.md

Currently we rely on GNU as to compress instructions into their C-extension
forms, but we can actually model that in our md file.

Using addsi3 as an example (I didn't handle the rv64 case carefully since
it doesn't matter for demonstrating the idea):

(define_insn "addsi3"
  [(set (match_operand:SI 0 "register_operand" "=r,r,l,r")
        (plus:SI (match_operand:SI 1 "register_operand" " r,0,sp,r")
                 (match_operand:SI 2 "arith_operand" " r,r,sp_imm,I")))]
  ""
  "@
   add %0, %1, %2
   c.add %0, %2
   c.addi4spn %0, %2
   addi %0, %1, %2
  "
  [(set_attr "type" "arith")
   (set_attr "mode" "SI")])

That could give GCC more knowledge about the C-extension, I've done
that before, but not upstream, and I guess I lost those changes...
The reason why I didn't upstream before is because there seems to be
no code size reduction by this work.
I guess we should have more tweaks to make it more useful, but I never
found enough time to do that.

2. Write peephole2/split pattern.

OK, that's pretty straightforward: just write down what you did in the
combine pass (this patch) as a pattern, and let the peephole2 pass do it.


(define_peephole2
  [(set (match_operand:SI 0 "register_operand" "")
        (match_operator:SI 1 "any_operator_has_compressed_form"
          [(match_operand:SI 2 "register_operand" "")
           (match_operand:SI 3 "register_operand" "")]))
   (set (match_operand:SI 4 "register_operand" "")
        (match_operator:SI 5 "any_binary_operator"
          [(match_dup 0)
           (match_operand:SI 6 "arith_operand" "")]))]
  "TARGET_RVC && peep2_reg_dead_p (1, operands[2])
   && /* More checks */"
  [(set (match_dup 2) (match_op_dup 1 [(match_dup 2) (match_dup 3)]))
   (set (match_dup 4) (match_op_dup 5 [(match_dup 2) (match_dup 6)]))]
)

3. Implement a RISC-V specific register renaming pass.

This last approach is one I implemented at my previous job but never
contributed back (and I no longer have access to it).
It gave some code size reduction, but I don't remember the exact
numbers.
This approach gives you a better global view of the whole register usage,
so I believe it can give better results than the other approaches.

You can find lots of useful util functions in regrename.cc that could
be used to build a RISC-V specific register renaming pass.

On Fri, Mar 11, 2022 at 1:56 AM Patrick O'Neill  wrote:
>
> RISC-V's C-extension describes 2-byte instructions with special
> constraints. One of those constraints is that the destination register
> must equal one of the source registers (the op clobbers one of its
> operands). This patch
> adds support for combining simple sequences:
>
> r1 = r2 + r3 (4 bytes)
> r2 DEAD
> r4 = r1 + r5 (4 bytes)
> r1 DEAD
>
> Combine pass now generates:
>
> r2 = r2 + r3 (2 bytes)
> r4 = r2 + r5 (4 bytes)
> r2 DEAD
>
> This change results in a ~150 Byte decrease in the linux kernel's
> compiled size (text: 5327254 Bytes -> 5327102 Bytes).
>
> I added this enforcement during the combine pass since it looks at the
> cost of certain expressions and can rely on the target to tell the
> pass that clobber-ops are cheaper than regular ops.
>
> The main thing holding this RFC back is the combine pass's behavior for
> sequences like this:
> b = a << 5;
> c = b + 2;
>
> Normally the combine pass modifies the RTL to be:
> c = (a << 5) + 2
> before expanding it back to the original statement.
>
> With my changes, the RTL is prevented from being combined like that and
> instead results in RTL like this:
> c = 2
> which is clearly wrong.
>
> I think that the next step would be to figure out where this
> re-expansion takes place and implement the same-register constraint
> there. However, I'm opening the RFC for any input:
> 1. Are there better ways to enforce same-register constraints during the
>    combine pass other than declaring the source/dest register to be the
>    same in RTL? Specifically, I'm concerned that this addition may
>    restrict subsequent RTL pass optimizations.
> 2. Are there other concerns with implementing source-dest constraints
>    within the combine pass?
> 3. Any other thoughts/input you have is welcome!
>
> 2022-03-10 Patrick O'Neill 
>
> * combine.cc: Add register equality replacement.
> * riscv.cc (riscv_insn_cost): Add in order to tell combine pass
>   that clobber-ops are cheaper.
> * riscv.h: Add c extension argument macros.
>
> Signed-off-by: Patrick O'Neill 
> ---
> Changelog:
> v2:
>  - Fix whitespace
>  - Rearrange conditionals to break long lines
> ---
>  gcc/combine.cc| 78 +++
>  gcc/config/riscv/riscv.cc | 42 +
>  gcc/config/riscv/riscv.h  |  7 
>  3 files changed, 127 insertions(+)
>
> diff --git a/g
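
As a rough illustration of the size argument in the quoted proposal, here is a minimal C sketch. It is not taken from the patch: the struct, the function names, and the simplification that only the c.add form (rd = rd + rs2, no x0 operands) is compressible are all assumptions made for the example.

```c
#include <assert.h>
#include <stdbool.h>

/* A three-operand register add: rd = rs1 + rs2.  Registers are plain
   integers here and register 0 stands for x0 (assumption).  */
struct add_insn { int rd, rs1, rs2; };

/* Assumption: only c.add is modelled.  It needs the destination to be
   one of the sources and no operand to be x0.  */
static bool
add_is_compressible (struct add_insn insn)
{
  /* Addition commutes, so the destination may match either source.  */
  bool dest_is_source = insn.rd == insn.rs1 || insn.rd == insn.rs2;
  return dest_is_source && insn.rd != 0 && insn.rs1 != 0 && insn.rs2 != 0;
}

/* 2 bytes for a compressed add, 4 bytes otherwise.  */
static int
add_size_bytes (struct add_insn insn)
{
  return add_is_compressible (insn) ? 2 : 4;
}

/* Total encoded size of a straight-line sequence of adds.  */
static int
sequence_size_bytes (const struct add_insn *insns, int count)
{
  assert (count >= 0);
  int total = 0;
  for (int i = 0; i < count; i++)
    total += add_size_bytes (insns[i]);
  return total;
}
```

With the sequence from the proposal, {r1 = r2 + r3; r4 = r1 + r5} costs 8 bytes, while the rewritten {r2 = r2 + r3; r4 = r2 + r5} costs 6 bytes under this model.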

[PATCH] c : Changed warning message for -Wstrict-prototypes [PR92209]

2022-03-11 Thread Krishna Narayanan via Gcc-patches
Hello,
The following is a patch for PR92209, which concerns the warning given
when a function prototype does not specify its argument types. This
patch changes the warning message displayed for -Wstrict-prototypes so
that it asks for the argument types to be specified. I have also added
a testcase for it.
Regtested on x86_64. OK to commit? Please review.

2022-03-11  Krishna Narayanan  

PR c/92209
gcc/c/
* c-decl.cc (start_function): Fixed the warning message for -Wstrict-prototypes.

gcc/testsuite/Changelog:
* gcc.dg/pr92209.c: New test
* gcc.dg/pr20368-1.c: Updated warning message

---
gcc/c/c-decl.cc | 4 ++--
gcc/testsuite/gcc.dg/pr20368-1.c | 2 +-
gcc/testsuite/gcc.dg/pr92209.c | 6 ++
3 files changed, 9 insertions(+), 3 deletions(-)
create mode 100644 gcc/testsuite/gcc.dg/pr92209.c

diff --git a/gcc/c/c-decl.cc b/gcc/c/c-decl.cc
index c701f07be..1983ffb23 100644
--- a/gcc/c/c-decl.cc
+++ b/gcc/c/c-decl.cc
@@ -7858,7 +7858,7 @@ grokparms (struct c_arg_info *arg_info, bool funcdef_flag)
if (arg_types == NULL_TREE && !funcdef_flag
&& !in_system_header_at (input_location))
warning (OPT_Wstrict_prototypes,
- "function declaration isn%'t a prototype");
+ "a function prototype must specify the argument types");

if (arg_types == error_mark_node)
/* Don't set TYPE_ARG_TYPES in this case. */
@@ -9625,7 +9625,7 @@ start_function (struct c_declspecs *declspecs,
struct c_declarator *declarator,
&& !prototype_p (TREE_TYPE (decl1))
&& C_DECL_ISNT_PROTOTYPE (old_decl))
warning_at (loc, OPT_Wstrict_prototypes,
- "function declaration isn%'t a prototype");
+ "a function prototype must specify the argument types");
/* Optionally warn of any global def with no previous prototype. */
else if (warn_missing_prototypes
&& old_decl != error_mark_node
diff --git a/gcc/testsuite/gcc.dg/pr20368-1.c b/gcc/testsuite/gcc.dg/pr20368-1.c
index 4140397c1..4b4914aa6 100644
--- a/gcc/testsuite/gcc.dg/pr20368-1.c
+++ b/gcc/testsuite/gcc.dg/pr20368-1.c
@@ -6,7 +6,7 @@
extern __typeof (f) g; /* { dg-error "'f' undeclared here \\(not in a function\\)" } */

int
-f (x) /* { dg-warning "function declaration isn't a prototype" } */
+f (x) /* { dg-warning "a function prototype must specify the argument types" } */
float x;
{
}
diff --git a/gcc/testsuite/gcc.dg/pr92209.c b/gcc/testsuite/gcc.dg/pr92209.c
new file mode 100644
index 0..3fae57b49
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr92209.c
@@ -0,0 +1,6 @@
+/*pr92209*/
+/* { dg-do compile } */
+/* { dg-options "-Wstrict-prototypes" } */
+static int func_1(); /* { dg-warning " a function prototype must specify the argument types" } */
+int func_1(int a)
+ { return a; }
\ No newline at end of file
--
2.25.1


Re: [PATCH v2] cse: avoid signed overflow in compute_const_anchors [PR 104843]

2022-03-11 Thread Richard Biener via Gcc-patches
On Thu, Mar 10, 2022 at 12:32 PM Xi Ruoyao  wrote:
>
> On Thu, 2022-03-10 at 09:01 +0100, Richard Biener wrote:
> > On Wed, Mar 9, 2022 at 5:12 PM Xi Ruoyao 
> > wrote:
> > >
> > > On Wed, 2022-03-09 at 15:55 +0100, Richard Biener wrote:
> > >
> > > > isn't it better to make targetm.const_anchor unsigned?
> > > > The & and ~ are not subject to overflow rules.
> > >
> > > It's not enough: if n is the minimum value of HOST_WIDE_INT and
> > > const_anchor = 0x8000 (the value for MIPS), we'll have a signed
> > > 0x7fff
> > > in *upper_base.  Then the next line, "*upper_offs = n -
> > > *upper_base;"
> > > will be a signed overflow again.
> > >
> > > How about the following?
> >
> > Hmm, so all this seems to be to round CST up and down to a multiple of
> > CONST_ANCHOR.
> > It works on CONST_INT only which is sign-extended, so if there is
> > overflow the resulting
> > anchor is broken as far as I can see.
>
> On MIPS addiu/daddiu do 2-complement addition, so the overflowed result
> is still usable.

The issue is that what the CONST_INT actually means depends on the
mode, an "overflow" to a positive number will eventually change what
is lower and what is the upper bound(?)

> > So instead of papering over this issue
> > the function should return false when n is negative since then
> > n & ~(targetm.const_anchor - 1) is also not n rounded down to a
> > multiple of const_anchor.
>
> This function does work for negative n, like:
>
> void g (int, int);
> void
> f (void)
> {
>   g(0x8123, 0x81240001);
> }
>
> It should produce:
>
> li  $4,-2128347136  # 0x8124
> daddiu  $5,$4,1
> daddiu  $4,$4,-1
> jal g
>
> But returning false for negative n will cause a regression for this case,
> producing:
>
> li  $5,-2128347136  # 0x8124
> li  $4,-2128412672  # 0x8123
> ori $5,$5,0x1
> ori $4,$4,0x
> jal g
>
> That being said, it indeed does not work for:
>
> void g (int, int);
> void f ()
> {
>   g (0x7fff, 0x8001);
> }
>
> It produces:
>
> li  $5,-2147483648  # 0x8000
> li  $4,2147418112   # 0x7fff
> daddiu  $5,$5,1
> ori $4,$4,0x
> jal g
>
> Should be:
>
> li  $5,-2147483648  # 0x8000
> daddiu  $5,$5,1
> addiu   $4,$5,-1

So maybe you can figure out a fix that makes it work for this case as well.

> > > -- >8 --
> > >
> > > With a non-zero const_anchor, the behavior of this function relied on
> > > signed overflow.
> > >
> > > gcc/
> > >
> > > PR rtl-optimization/104843
> > > * cse.cc (compute_const_anchors): Use unsigned HOST_WIDE_INT for
> > > n to perform overflow arithmetics safely.
> > > ---
> > >  gcc/cse.cc | 8 
> > >  1 file changed, 4 insertions(+), 4 deletions(-)
> > >
> > > diff --git a/gcc/cse.cc b/gcc/cse.cc
> > > index a18b599d324..052fa0c3490 100644
> > > --- a/gcc/cse.cc
> > > +++ b/gcc/cse.cc
> > > @@ -1169,12 +1169,12 @@ compute_const_anchors (rtx cst,
> > >HOST_WIDE_INT *lower_base, HOST_WIDE_INT 
> > > *lower_offs,
> > >HOST_WIDE_INT *upper_base, HOST_WIDE_INT 
> > > *upper_offs)
> > >  {
> > > -  HOST_WIDE_INT n = INTVAL (cst);
> > > -
> > > -  *lower_base = n & ~(targetm.const_anchor - 1);
> > > -  if (*lower_base == n)
> > > +  unsigned HOST_WIDE_INT n = UINTVAL (cst);
> > > +  unsigned HOST_WIDE_INT lb = n & ~(targetm.const_anchor - 1);
> > > +  if (lb == n)
> > >  return false;
> > >
> > > +  *lower_base = lb;
> > >*upper_base =
> > >  (n + (targetm.const_anchor - 1)) & ~(targetm.const_anchor - 1);
> > >*upper_offs = n - *upper_base;
> > > --
> > > 2.35.1
> > >
> > >
> > > >
>
> --
> Xi Ruoyao 
> School of Aerospace Science and Technology, Xidian University


Re: [PATCH] PR middle-end/98420: Don't fold x - x to 0.0 with -frounding-math

2022-03-11 Thread Richard Biener via Gcc-patches
On Fri, Mar 11, 2022 at 12:31 AM Roger Sayle  wrote:
>
>
> This patch addresses PR middle-end/98420, which is inappropriate constant
> folding of x - x to 0.0 (in match.pd) when -frounding-math is specified.
> Specifically, x - x may be -0.0 with FE_DOWNWARD as the rounding mode.
>
> To summarize, the desired IEEE behaviour, x - x for floating point x,
> (1) can't be folded to 0.0 by default, due to the possibility of NaN or Inf
> (2) can be folded to 0.0 with -ffinite-math-only
> (3) can't be folded to 0.0 with -ffinite-math-only -frounding-math
> (4) can be folded with -ffinite-math-only -frounding-math -fno-signed-zeros
>
> Technically, this is a regression from GCC 4.1 (according to godbolt.org)
> so hopefully this patch is suitable during stage4.
>
> This patch has been tested on x86_64-pc-linux-gnu with make bootstrap
> and make -k check with no new failures.  Ok for mainline?

+ && !tree_expr_maybe_infinite_p (@0)
+ && (!flag_rounding_math || !HONOR_SIGNED_ZEROS (type
   { build_zero_cst (type); }))

HONOR_SIGN_DEPENDENT_ROUNDING (type) instead of flag_rounding_math?

OK with that change.

Richard.
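
The four cases in the quoted summary can be written down as a small truth function. This is only a sketch: the struct and its boolean fields are hypothetical simplifications of the real match.pd conditions (which query the expression and type rather than global flags).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified view of the relevant floating-point options.  */
struct fp_flags
{
  bool finite_math_only;  /* -ffinite-math-only */
  bool rounding_math;     /* -frounding-math */
  bool signed_zeros;      /* true unless -fno-signed-zeros */
};

/* May x - x be folded to 0.0?  NaN and Inf must be excluded, and the
   sign of the zero result must not depend on the rounding mode.  */
static bool
can_fold_x_minus_x (struct fp_flags f)
{
  if (!f.finite_math_only)
    return false;
  return !f.rounding_math || !f.signed_zeros;
}

/* Usage: the four cases from the summary, in order.  */
void
check_summary_cases (void)
{
  assert (!can_fold_x_minus_x ((struct fp_flags) { false, false, true }));
  assert (can_fold_x_minus_x ((struct fp_flags) { true, false, true }));
  assert (!can_fold_x_minus_x ((struct fp_flags) { true, true, true }));
  assert (can_fold_x_minus_x ((struct fp_flags) { true, true, false }));
}
```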

>
> 2022-03-10  Roger Sayle  
>
> gcc/ChangeLog
> PR middle-end/98420
> * match.pd (minus @0 @0): Additional checks for -fno-rounding-math
> (the default) or -fno-signed-zeros.
>
> gcc/testsuite/ChangeLog
> PR middle-end/98420
> * gcc.dg/pr98420.c: New test case.
>
>
> Thanks in advance,
> Roger
> --
>


Re: [PATCH v2] PR tree-optimization/98335: Improvements to DSE's compute_trims.

2022-03-11 Thread Richard Biener via Gcc-patches
On Wed, Mar 9, 2022 at 6:43 PM Roger Sayle  wrote:
>
>
> Hi Richard,
> Many thanks.  Yes, your proposed ao_ref_alignment is exactly what I was
> looking for.  Here's the second revision of my patch for
> PR tree-optimization/98335 that both uses/introduces ao_ref_alignment and
> more intelligently aligns/trims both head and tail, for example handling
> the case discussed by Richard and Jeff Law, of a 16 byte-aligned object
> where we wish to avoid trimming (just) the last three bytes.  It uses the
> useful property that writing N consecutive bytes typically requires
> popcount(N) store instructions, so we wish to align (if we can) so that we
> begin/end with a store of N' bytes where popcount(N') is one, if that
> isn't already the case.
>
> This patch has been tested on x86_64-pc-linux-gnu with make bootstrap and
> make -k check with no new failures.  Is this revised version ok for mainline?

OK.

I wonder if we can also handle bitpos != 0: if we have
an access 3 bytes into an 8 byte aligned object and
want to trim 2 bytes at the start, then it would be good to trim only
1 byte so the access is then 4 byte aligned (4 bytes into the
8 byte aligned object).  Similarly, trimming 6 bytes should be reduced
to trimming 5 bytes.

Richard.
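
To make the two heuristics in this exchange concrete, here is a small C sketch. Both helpers are hypothetical and not taken from tree-ssa-dse.cc: the first counts stores under the assumption that each power-of-two sized chunk takes one store, the second reduces a head trim so that the remaining access starts on an aligned boundary, as in the bitpos != 0 suggestion.

```c
#include <assert.h>

/* popcount (nbytes): the number of store insns needed, assuming one
   store per power-of-two sized chunk.  */
static unsigned
store_insns_for (unsigned nbytes)
{
  unsigned count = 0;
  for (; nbytes != 0; nbytes &= nbytes - 1)  /* clear lowest set bit */
    count++;
  return count;
}

/* OFFSET is where the access starts inside an aligned object, TRIM the
   number of dead leading bytes.  Return a possibly smaller trim such
   that OFFSET + trim is a multiple of ALIGN, when such a boundary lies
   inside the dead bytes; otherwise return TRIM unchanged.  */
static unsigned
aligned_head_trim (unsigned offset, unsigned trim, unsigned align)
{
  assert (align != 0 && (align & (align - 1)) == 0);  /* power of two */
  unsigned start = offset + trim;
  unsigned aligned_start = start & ~(align - 1);
  if (aligned_start < offset)
    return trim;
  return aligned_start - offset;
}
```

Under this model a 7-byte store needs three instructions and an 8-byte store one, which is why trimming a single dead byte can make the code worse. For the access 3 bytes into the object, a trim of 2 becomes 1 and a trim of 6 becomes 5 when aligning to 4 bytes.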

> 2022-03-09  Roger Sayle  
> Richard Biener  
>
> gcc/ChangeLog
> PR tree-optimization/98335
> * builtins.cc (get_object_alignment_2): Export.
> * builtins.h (get_object_alignment_2): Likewise.
> * tree-ssa-alias.cc (ao_ref_alignment): New.
> * tree-ssa-alias.h (ao_ref_alignment): Declare.
>
> * tree-ssa-dse.cc (compute_trims): Improve logic deciding whether
> to align head/tail, writing more bytes but using fewer store insns.
> (maybe_trim_memstar_call): Silence compiler warnings by using
> memset to initialize lendata.
>
> gcc/testsuite/ChangeLog
> PR tree-optimization/98335
> * g++.dg/pr98335.C: New test case.
> * gcc.dg/pr86010.c: New test case.
> * gcc.dg/pr86010-2.c: New test case.
>
> Thanks again for your help.
> Roger
> --
>
> > -Original Message-
> > From: Richard Biener 
> > Sent: 08 March 2022 10:44
> > To: Roger Sayle 
> > Cc: GCC Patches 
> > Subject: Re: [PATCH] PR tree-optimization/98335: Improvements to DSE's
> > compute_trims.
> >
> > On Tue, Mar 8, 2022 at 11:10 AM Richard Biener
> >  wrote:
> > >
> > > On Mon, Mar 7, 2022 at 11:04 AM Roger Sayle
> >  wrote:
> > > >
> > > >
> > > > This patch is the main middle-end piece of a fix for PR
> > > > tree-opt/98335, which is a code-quality regression affecting
> > > > mainline.  The issue occurs in DSE's (dead store elimination's)
> > > > compute_trims function that determines where a store to memory can
> > > > be trimmed.  In the testcase given in the PR, this function notices
> > > > that the first byte of a DImode store is dead, and replaces the
> > > > 8-byte store at (aligned) offset zero, with a 7-byte store at
> > > > (unaligned) offset one.  Most architectures can store a power-of-two
> > > > bytes (up to a maximum) in single instruction, so writing 7 bytes
> > > > requires more instructions than writing 8 bytes.  This patch follows
> > > > Jakub Jelinek's suggestion in comment 5, that compute_trims needs
> > > > improved heuristics.
> > > >
> > > > In this patch, decision of whether and how to align trim_head is
> > > > based on the number of bytes being written, the alignment of the
> > > > start of the object and where within the object the first byte is
> > > > written.  The first tests check whether we're already writing to the
> > > > start of the object, and that we're writing three or more bytes.  If
> > > > we're only writing one or two bytes, there's no benefit from providing
> > > > additional alignment.
> > > > Then we determine the alignment of the object, which is either 1, 2,
> > > > 4, 8 or 16 byte aligned (capping at 16 guarantees that we never
> > > > write more than 7 bytes beyond the minimum required).  If the buffer
> > > > is only
> > > > 1 or 2 byte aligned there's no benefit from additional alignment.
> > > > For the remaining cases, alignment of trim_head is based upon where
> > > > within each aligned block (word) the first byte is written.  For
> > > > example, storing the last byte (or last half-word) of a word can be
> > > > performed with a single insn.
> > > >
> > > > On x86_64-pc-linux-gnu with -O2 the new test case in the PR goes from:
> > > >
> > > > movl    $0, -24(%rsp)
> > > > movabsq $72057594037927935, %rdx
> > > > movl    $0, -21(%rsp)
> > > > andq    -24(%rsp), %rdx
> > > > movq    %rdx, %rax
> > > > salq    $8, %rax
> > > > movb    c(%rip), %al
> > > > ret
> > > >
> > > > to
> > > >
> > > > xorl    %eax, %eax
> > > > movb    c(%rip), %al
> > > > ret
> > > >
> > > > This patch has been tested on x86_64-pc-linux-gnu with make
> > > > bootstrap a

[Patch] lto-plugin: Honor link_output_name for -foffload-objects file name

2022-03-11 Thread Tobias Burnus

This patch removes the last(?) -save-temps file that is still written to /tmp.

Thus, instead of

.../12.0.1/lto-wrapper -fresolution=a.res -flinker-output=exec 
-foffload-objects=/tmp/ccyXiCap.ofldlist
.../12.0.1/lto-wrapper -fresolution=a.res -flinker-output=exec 
-foffload-objects=/tmp/ccyXiCap.ofldlist
[Leaving LTRANS /tmp/ccyXiCap.ofldlist]

the result is now

.../12.0.1/lto-wrapper -fresolution=a.res -flinker-output=exec 
-foffload-objects=a.ofldlist
.../12.0.1/lto-wrapper -fresolution=a.res -flinker-output=exec 
-foffload-objects=a.ofldlist
[Leaving LTRANS a.ofldlist]


OK for mainline? (Stage1?)

Tobias
-
Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 
München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas 
Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht 
München, HRB 106955
lto-plugin: Honor link_output_name for -foffload-objects file name

lto-plugin/ChangeLog:

	* lto-plugin.c (all_symbols_read_handler): With -save-temps, use
	link_output_name for -foffload-objects's file name, if available.

 lto-plugin/lto-plugin.c | 9 -
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/lto-plugin/lto-plugin.c b/lto-plugin/lto-plugin.c
index 593fbc91383..33d49571d0e 100644
--- a/lto-plugin/lto-plugin.c
+++ b/lto-plugin/lto-plugin.c
@@ -799,8 +799,15 @@ all_symbols_read_handler (void)
   char *arg;
   char *offload_objects_file_name;
   struct plugin_offload_file *ofld;
+  const char *suffix = ".ofldlist";
 
-  offload_objects_file_name = make_temp_file (".ofldlist");
+  if (save_temps && link_output_name)
+	{
+	  suffix += skip_in_suffix;
+	  offload_objects_file_name = concat (link_output_name, suffix, NULL);
+	}
+  else
+	offload_objects_file_name = make_temp_file (suffix);
   check (offload_objects_file_name, LDPL_FATAL,
 	 "Failed to generate a temporary file name");
   f = fopen (offload_objects_file_name, "w");
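
The naming decision in this hunk can be sketched in standalone C as follows. This is an approximation under stated assumptions: the function name is invented, malloc and strcat replace libiberty's concat, and the skip parameter is a hypothetical stand-in for skip_in_suffix, whose exact meaning is not shown in the excerpt.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Return a malloc'd "<link_output_name><suffix>" when temporaries are
   being saved and an output name is known.  Return NULL to tell the
   caller to fall back to a temporary file name (NULL is also returned
   if allocation fails).  SKIP drops that many leading characters of
   the suffix (assumption about what skip_in_suffix does).  */
static char *
offload_list_name (int save_temps, const char *link_output_name, size_t skip)
{
  const char *suffix = ".ofldlist";

  if (!save_temps || link_output_name == NULL)
    return NULL;

  assert (skip <= strlen (suffix));
  suffix += skip;

  size_t len = strlen (link_output_name) + strlen (suffix) + 1;
  char *name = malloc (len);
  if (name == NULL)
    return NULL;
  strcpy (name, link_output_name);
  strcat (name, suffix);
  return name;
}
```

With save_temps set and an output name of "a", the sketch yields "a.ofldlist", matching the file name shown in the description of the patch.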


Re: [Patch] lto-plugin: Honor link_output_name for -foffload-objects file name

2022-03-11 Thread Richard Biener via Gcc-patches
On Fri, Mar 11, 2022 at 12:44 PM Tobias Burnus  wrote:
>
> This patch removes the last(?) -save-temps file that is still written to /tmp.
>
> Thus, instead of
>
> .../12.0.1/lto-wrapper -fresolution=a.res -flinker-output=exec 
> -foffload-objects=/tmp/ccyXiCap.ofldlist
> .../12.0.1/lto-wrapper -fresolution=a.res -flinker-output=exec 
> -foffload-objects=/tmp/ccyXiCap.ofldlist
> [Leaving LTRANS /tmp/ccyXiCap.ofldlist]
>
> the result is now
>
> .../12.0.1/lto-wrapper -fresolution=a.res -flinker-output=exec 
> -foffload-objects=a.ofldlist
> .../12.0.1/lto-wrapper -fresolution=a.res -flinker-output=exec 
> -foffload-objects=a.ofldlist
> [Leaving LTRANS a.ofldlist]
>
>
> OK for mainline? (Stage1?)

Ok for trunk.

>
> Tobias


[Patch] OpenMP, libgomp: Add new runtime routine omp_target_is_accessible.

2022-03-11 Thread Marcel Vollweiler

Hi,

This patch adds the OpenMP runtime routine "omp_target_is_accessible" which was
introduced in OpenMP 5.1 (specification section 3.8.4):

"The omp_target_is_accessible routine tests whether host memory is accessible
from a given device."

"This routine returns true if the storage of size bytes starting at the address
given by ptr is accessible from device device_num. Otherwise, it returns false."

"The value of ptr must be a valid host pointer or NULL (or C_NULL_PTR, for
Fortran). The device_num argument must be greater than or equal to zero and less
than or equal to the result of omp_get_num_devices()."

"When called from within a target region the effect is unspecified."

Currently, the only way of accessing host memory on a non-host device is via
shared memory. This will change with unified shared memory (usm) that was
recently submitted but not yet approved/committed. A follow-up patch for
omp_target_is_accessible is planned considering usm when available. The current
patch handles the basic implementation for C/C++ and Fortran and includes
comments pointing to usm.

Although not explicitly specified in the OpenMP 5.1 standard, the implemented
function returns "true" if the given device_num is equal to
"omp_get_num_devices" (i.e. the host) as it is expected that host memory can be
accessed from the host device.

The patch was tested on x86_64-linux and PowerPC, both with nvptx offloading.
All with no regressions.

Marcel
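
The behaviour described above can be summarised in a short C sketch. It is not the libgomp implementation: the function name and the two extra parameters (the device count and whether the device shares memory with the host) are hypothetical, and returning false for an out-of-range device number is an assumption, since the specification only says the argument must be in range.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Sketch of the decision described in the mail.  NUM_DEVICES plays the
   role of omp_get_num_devices (), whose value also denotes the host.  */
static bool
target_is_accessible_sketch (const void *ptr, size_t size, int device_num,
                             int num_devices, bool device_shares_memory)
{
  assert (num_devices >= 0);

  /* Without unified shared memory the answer does not depend on the
     address range, so PTR and SIZE are unused in this sketch.  */
  (void) ptr;
  (void) size;

  if (device_num < 0 || device_num > num_devices)
    return false;  /* assumption: treat invalid devices as inaccessible */
  if (device_num == num_devices)
    return true;   /* the host can access host memory */
  return device_shares_memory;
}
```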
OpenMP, libgomp: Add new runtime routine omp_target_is_accessible.

gcc/ChangeLog:

* omp-low.cc (omp_runtime_api_call): Added target_is_accessible to
omp_runtime_apis array.

libgomp/ChangeLog:

* libgomp.map: Added omp_target_is_accessible.
* libgomp.texi: Tagged omp_target_is_accessible as supported.
* omp.h.in: Added omp_target_is_accessible.
* omp_lib.f90.in: Added interface for omp_target_is_accessible.
* omp_lib.h.in: Likewise.
* target.c (omp_target_is_accessible): Added implementation of
omp_target_is_accessible.
* testsuite/libgomp.c-c++-common/target-is-accessible-1.c: New test.
* testsuite/libgomp.fortran/target-is-accessible-1.f90: New test.

diff --git a/gcc/omp-low.cc b/gcc/omp-low.cc
index 77176ef..bf38fad 100644
--- a/gcc/omp-low.cc
+++ b/gcc/omp-low.cc
@@ -3959,6 +3959,7 @@ omp_runtime_api_call (const_tree fndecl)
   "target_associate_ptr",
   "target_disassociate_ptr",
   "target_free",
+  "target_is_accessible",
   "target_is_present",
   "target_memcpy",
   "target_memcpy_rect",
diff --git a/libgomp/libgomp.map b/libgomp/libgomp.map
index 2ac5809..1764380 100644
--- a/libgomp/libgomp.map
+++ b/libgomp/libgomp.map
@@ -226,6 +226,11 @@ OMP_5.1 {
omp_get_teams_thread_limit_;
 } OMP_5.0.2;
 
+OMP_5.1.1 {
+  global:
+   omp_target_is_accessible;
+} OMP_5.1;
+
 GOMP_1.0 {
   global:
GOMP_atomic_end;
diff --git a/libgomp/libgomp.texi b/libgomp/libgomp.texi
index 161a423..58e432c 100644
--- a/libgomp/libgomp.texi
+++ b/libgomp/libgomp.texi
@@ -311,7 +311,7 @@ The OpenMP 4.5 specification is fully supported.
 @item @code{omp_set_num_teams}, @code{omp_set_teams_thread_limit},
   @code{omp_get_max_teams}, @code{omp_get_teams_thread_limit} runtime
   routines @tab Y @tab
-@item @code{omp_target_is_accessible} runtime routine @tab N @tab
+@item @code{omp_target_is_accessible} runtime routine @tab Y @tab
 @item @code{omp_target_memcpy_async} and @code{omp_target_memcpy_rect_async}
   runtime routines @tab N @tab
 @item @code{omp_get_mapped_ptr} runtime routine @tab N @tab
diff --git a/libgomp/omp.h.in b/libgomp/omp.h.in
index 89c5d65..1ec7415 100644
--- a/libgomp/omp.h.in
+++ b/libgomp/omp.h.in
@@ -282,6 +282,8 @@ extern int omp_target_memcpy_rect (void *, const void *, 
__SIZE_TYPE__, int,
 extern int omp_target_associate_ptr (const void *, const void *, __SIZE_TYPE__,
 __SIZE_TYPE__, int) __GOMP_NOTHROW;
 extern int omp_target_disassociate_ptr (const void *, int) __GOMP_NOTHROW;
+extern int omp_target_is_accessible (const void *, __SIZE_TYPE__, int)
+  __GOMP_NOTHROW;
 
 extern void omp_set_affinity_format (const char *) __GOMP_NOTHROW;
 extern __SIZE_TYPE__ omp_get_affinity_format (char *, __SIZE_TYPE__)
diff --git a/libgomp/omp_lib.f90.in b/libgomp/omp_lib.f90.in
index daf40dc..f369507 100644
--- a/libgomp/omp_lib.f90.in
+++ b/libgomp/omp_lib.f90.in
@@ -835,6 +835,16 @@
   end function omp_target_disassociate_ptr
 end interface
 
+interface
+  function omp_target_is_accessible (ptr, size, device_num) bind(c)
+use, intrinsic :: iso_c_binding, only : c_ptr, c_size_t, c_int
+inte

[PATCH] target/104762 - vectorization costs of CONSTRUCTORs

2022-03-11 Thread Richard Biener via Gcc-patches
After accounting for GPR -> XMM move cost for vec_construct the
base cost needs adjustments to not double-cost those.  This also
lowers the cost when such move is not necessary.

This fixes the observed 538.imagick_r and 525.x264_r regressions
for me on Zen2 with -Ofast -march=native.

Bootstrapped and tested on x86_64-unknown-linux-gnu.

OK for trunk?

Thanks,
Richard.

2022-03-11  Richard Biener  

PR target/104762
* config/i386/i386.cc (ix86_builtin_vectorization_cost): Do not
cost the first lane of SSE pieces as inserts for vec_construct.
---
 gcc/config/i386/i386.cc | 17 +++--
 1 file changed, 11 insertions(+), 6 deletions(-)

diff --git a/gcc/config/i386/i386.cc b/gcc/config/i386/i386.cc
index 4121f986221..23bedea92bd 100644
--- a/gcc/config/i386/i386.cc
+++ b/gcc/config/i386/i386.cc
@@ -22597,16 +22597,21 @@ ix86_builtin_vectorization_cost (enum 
vect_cost_for_stmt type_of_cost,
 
   case vec_construct:
{
- /* N element inserts into SSE vectors.  */
- int cost = TYPE_VECTOR_SUBPARTS (vectype) * ix86_cost->sse_op;
+ int n = TYPE_VECTOR_SUBPARTS (vectype);
+ /* N - 1 element inserts into an SSE vector, the possible
+GPR -> XMM move is accounted for in add_stmt_cost.  */
+ if (GET_MODE_BITSIZE (mode) <= 128)
+   return (n - 1) * ix86_cost->sse_op;
  /* One vinserti128 for combining two SSE vectors for AVX256.  */
- if (GET_MODE_BITSIZE (mode) == 256)
-   cost += ix86_vec_cost (mode, ix86_cost->addss);
+ else if (GET_MODE_BITSIZE (mode) == 256)
+   return ((n - 2) * ix86_cost->sse_op
+   + ix86_vec_cost (mode, ix86_cost->addss));
  /* One vinserti64x4 and two vinserti128 for combining SSE
 and AVX256 vectors to AVX512.  */
  else if (GET_MODE_BITSIZE (mode) == 512)
-   cost += 3 * ix86_vec_cost (mode, ix86_cost->addss);
- return cost;
+   return ((n - 4) * ix86_cost->sse_op
+   + 3 * ix86_vec_cost (mode, ix86_cost->addss));
+ gcc_unreachable ();
}
 
   default:
-- 
2.34.1
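
The new cost formula in this hunk can be restated as a small standalone function. This is a sketch only: the function name is invented, sse_op and combine_cost are plain integers standing in for ix86_cost->sse_op and the result of ix86_vec_cost (mode, ix86_cost->addss), and the pieces/combines naming is mine.

```c
#include <assert.h>

/* Cost of building a vector of NELTS elements in a MODE_BITS wide mode.
   The first lane of each 128-bit SSE piece is not costed as an insert;
   wider vectors add the cost of combining the pieces.  */
static int
vec_construct_cost_sketch (int nelts, int mode_bits,
                           int sse_op, int combine_cost)
{
  int pieces;    /* number of 128-bit SSE pieces */
  int combines;  /* number of vinsert-style combining steps */

  if (mode_bits <= 128)
    {
      pieces = 1;
      combines = 0;
    }
  else if (mode_bits == 256)
    {
      pieces = 2;
      combines = 1;
    }
  else
    {
      assert (mode_bits == 512);
      pieces = 4;
      combines = 3;
    }

  assert (nelts >= pieces);
  return (nelts - pieces) * sse_op + combines * combine_cost;
}
```

With illustrative costs sse_op = 2 and combine_cost = 3, a 4-element 128-bit vector costs 6 where the old formula (4 inserts) would have charged 8.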


Re: [PATCH] target/104762 - vectorization costs of CONSTRUCTORs

2022-03-11 Thread Hongtao Liu via Gcc-patches
On Fri, Mar 11, 2022 at 8:43 PM Richard Biener via Gcc-patches
 wrote:
>
> After accounting for GPR -> XMM move cost for vec_construct the
> base cost needs adjustments to not double-cost those.  This also
> lowers the cost when such move is not necessary.
>
> This fixes the observed 538.imagick_r and 525.x264_r regressions
> for me on Zen2 with -Ofast -march=native.
>
> Bootstrapped and tested on x86_64-unknown-linux-gnu.
>
> OK for trunk?
LGTM.
>
> Thanks,
> Richard.
>
> 2022-03-11  Richard Biener  
>
> PR target/104762
> * config/i386/i386.cc (ix86_builtin_vectorization_cost): Do not
> cost the first lane of SSE pieces as inserts for vec_construct.
> ---
>  gcc/config/i386/i386.cc | 17 +++--
>  1 file changed, 11 insertions(+), 6 deletions(-)
>
> diff --git a/gcc/config/i386/i386.cc b/gcc/config/i386/i386.cc
> index 4121f986221..23bedea92bd 100644
> --- a/gcc/config/i386/i386.cc
> +++ b/gcc/config/i386/i386.cc
> @@ -22597,16 +22597,21 @@ ix86_builtin_vectorization_cost (enum 
> vect_cost_for_stmt type_of_cost,
>
>case vec_construct:
> {
> - /* N element inserts into SSE vectors.  */
> - int cost = TYPE_VECTOR_SUBPARTS (vectype) * ix86_cost->sse_op;
> + int n = TYPE_VECTOR_SUBPARTS (vectype);
> + /* N - 1 element inserts into an SSE vector, the possible
> +GPR -> XMM move is accounted for in add_stmt_cost.  */
> + if (GET_MODE_BITSIZE (mode) <= 128)
> +   return (n - 1) * ix86_cost->sse_op;
>   /* One vinserti128 for combining two SSE vectors for AVX256.  */
> - if (GET_MODE_BITSIZE (mode) == 256)
> -   cost += ix86_vec_cost (mode, ix86_cost->addss);
> + else if (GET_MODE_BITSIZE (mode) == 256)
> +   return ((n - 2) * ix86_cost->sse_op
> +   + ix86_vec_cost (mode, ix86_cost->addss));
>   /* One vinserti64x4 and two vinserti128 for combining SSE
>  and AVX256 vectors to AVX512.  */
>   else if (GET_MODE_BITSIZE (mode) == 512)
> -   cost += 3 * ix86_vec_cost (mode, ix86_cost->addss);
> - return cost;
> +   return ((n - 4) * ix86_cost->sse_op
> +   + 3 * ix86_vec_cost (mode, ix86_cost->addss));
> + gcc_unreachable ();
> }
>
>default:
> --
> 2.34.1



-- 
BR,
Hongtao


[PATCH] tree-optimization/104880 - update-address-taken and cmpxchg

2022-03-11 Thread Richard Biener via Gcc-patches
The following addresses optimistic non-addressable marking of
an argument of __atomic_compare_exchange_n which broke when
I added DECL_NOT_GIMPLE_REG_P since we cannot guarantee we can
rewrite it when TREE_ADDRESSABLE is unset.  Instead we have to
restore TREE_ADDRESSABLE in that case.

Bootstrapped on x86_64-unknown-linux-gnu, testing in progress.

2022-03-11  Richard Biener  

PR tree-optimization/104880
* tree-ssa.cc (execute_update_addresses_taken): Remember if we
optimistically made something not addressable and
prepare to undo it.

* g++.dg/opt/pr104880.cc: New testcase.
---
 gcc/testsuite/g++.dg/opt/pr104880.cc | 43 
 gcc/tree-ssa.cc  | 16 +--
 2 files changed, 56 insertions(+), 3 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/opt/pr104880.cc

diff --git a/gcc/testsuite/g++.dg/opt/pr104880.cc 
b/gcc/testsuite/g++.dg/opt/pr104880.cc
new file mode 100644
index 000..de56a5acfd4
--- /dev/null
+++ b/gcc/testsuite/g++.dg/opt/pr104880.cc
@@ -0,0 +1,43 @@
+// { dg-do compile }
+// { dg-options "-O2 -Wno-pmf-conversions -fno-checking" }
+
+class c {
+  long b;
+};
+class B {
+public:
+  typedef void *d;
+};
+class aa {
+public:
+  aa(B::d);
+};
+class e : public B {
+public:
+  e();
+};
+unsigned int f;
+struct g {
+  struct h : c {
+h(unsigned int &i) : c(reinterpret_cast(i)) {}
+unsigned int ad();
+  };
+};
+class n : g {
+public:
+  n(int);
+  void j() {
+unsigned int a;
+h k(a);
+__atomic_compare_exchange_n(&f, &a, k.ad(), true, 3, 0);
+  }
+};
+int l;
+class m : e {
+  void ar() {
+n b(l);
+b.j();
+  }
+  virtual void bd() { aa(d(&m::ar)); }
+};
+void o() { new m; }
diff --git a/gcc/tree-ssa.cc b/gcc/tree-ssa.cc
index 423dd871d9e..6dcb3142869 100644
--- a/gcc/tree-ssa.cc
+++ b/gcc/tree-ssa.cc
@@ -1742,6 +1742,7 @@ execute_update_addresses_taken (void)
   auto_bitmap addresses_taken;
   auto_bitmap not_reg_needs;
   auto_bitmap suitable_for_renaming;
+  bool optimistic_not_addressable = false;
   tree var;
   unsigned i;
 
@@ -1770,6 +1771,8 @@ execute_update_addresses_taken (void)
  gimple_call_set_arg (stmt, 1, null_pointer_node);
  gimple_ior_addresses_taken (addresses_taken, stmt);
  gimple_call_set_arg (stmt, 1, arg);
+ /* Remember we have to check again below.  */
+ optimistic_not_addressable = true;
}
  else if (is_asan_mark_p (stmt)
   || gimple_call_internal_p (stmt, IFN_GOMP_SIMT_ENTER))
@@ -1873,7 +1876,8 @@ execute_update_addresses_taken (void)
 
   /* Operand caches need to be recomputed for operands referencing the updated
  variables and operands need to be rewritten to expose bare symbols.  */
-  if (!bitmap_empty_p (suitable_for_renaming))
+  if (!bitmap_empty_p (suitable_for_renaming)
+  || optimistic_not_addressable)
 {
   FOR_EACH_BB_FN (bb, cfun)
for (gimple_stmt_iterator gsi = gsi_start_bb (bb); !gsi_end_p (gsi);)
@@ -2064,12 +2068,18 @@ execute_update_addresses_taken (void)
if (optimize_atomic_compare_exchange_p (stmt))
  {
tree expected = gimple_call_arg (stmt, 1);
-   if (bitmap_bit_p (suitable_for_renaming,
- DECL_UID (TREE_OPERAND (expected, 0
+   tree decl = TREE_OPERAND (expected, 0);
+   if (bitmap_bit_p (suitable_for_renaming, DECL_UID (decl)))
  {
fold_builtin_atomic_compare_exchange (&gsi);
continue;
  }
+   else if (!TREE_ADDRESSABLE (decl))
+ /* If there are partial defs of the decl we may
+have cleared the addressable bit but set
+DECL_NOT_GIMPLE_REG_P.  We have to restore
+TREE_ADDRESSABLE here.  */
+ TREE_ADDRESSABLE (decl) = 1;
  }
else if (is_asan_mark_p (stmt))
  {
-- 
2.34.1


Re: [Patch] OpenMP, libgomp: Add new runtime routine omp_target_is_accessible.

2022-03-11 Thread Tobias Burnus

Minor remark to the test:

On 11.03.22 13:30, Marcel Vollweiler wrote:

+  int d = omp_get_default_device ();

...

+  int shared_mem = 0;
+  #pragma omp target map (alloc: shared_mem) device (d)
+shared_mem = 1;
+  if (omp_target_is_accessible (p, sizeof (int), d) != shared_mem)
+__builtin_abort ();


I wonder whether it makes sense to use
  for (d = 0; d <= omp_get_num_devices(); ++d)
instead of just
  d = omp_get_default_device();
given that we have occasionally found bugs when testing more
than just the default device, be it because devices differed or because
'0' was special.
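
A minimal sketch of the suggested loop shape, kept free of OpenMP so that it compiles anywhere. Everything here is a hypothetical stand-in: `num_devices` plays the role of `omp_get_num_devices ()`, and the `accessible` / `shared_mem` fields model what `omp_target_is_accessible` and the `shared_mem` probe would report per device. The bound is inclusive, as in the loop above, so the device numbered `num_devices` (the host device in recent OpenMP versions, as far as I understand) is checked too.

```c
#include <assert.h>

/* Hypothetical stand-in for the per-device facts the test probes.  */
struct device_info
{
  int shared_mem;   /* what the "target map (alloc:)" probe would report */
  int accessible;   /* what omp_target_is_accessible would return */
};

/* Return the number of devices in [0, num_devices] whose accessibility
   answer disagrees with the shared-memory probe.  The bound is
   inclusive, mirroring "d <= omp_get_num_devices ()".  */
int
count_mismatches (const struct device_info *devs, int num_devices)
{
  int mismatches = 0;
  for (int d = 0; d <= num_devices; ++d)
    if (devs[d].accessible != devs[d].shared_mem)
      ++mismatches;
  return mismatches;
}
```

In the real test the body would call `__builtin_abort ()` on the first mismatch rather than count them.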

In particular, I could imagine having two or three devices available at
the same time, of type intelmic + gcn + nvptx, possibly mixing shared memory,
nonshared memory and semi-shared memory*

Tobias

(* semi-shared: I am especially thinking of nvptx with %dynamic_smem_size,
which requires some special handling. By contrast with HMM and Pascal GPUs,
real USM is possible.)



Re: [RFC v2] RISCV: Combine Pass Clobber Ops

2022-03-11 Thread Segher Boessenkool
Hi!

On Thu, Mar 10, 2022 at 09:54:55AM -0800, Patrick O'Neill wrote:
> I added this enforcement during the combine pass since it looks at the
> cost of certain expressions and can rely on the target to tell the
> pass that clobber-ops are cheaper than regular ops.

That is not a reason to put target stuff in generic code.  If you need
a new hook, you should create one.

>   * combine.cc: Add register equality replacement.

Please Cc: the maintainers of any code you want to be looked at.

This is not suitable for stage 4.  Please try again in stage 1.

>   * riscv.cc (riscv_insn_cost): Add in order to tell combine pass
> that clobber-ops are cheaper.

(formatting)

> +/* Attempt to replace ops with clobber-ops.
> + If the target implements clobber ops (set r1 (plus (r1)(r2))) as 
> cheaper,

(formatting)

> +  if (!i0 && !i1 && i2 && i3 && GET_CODE(PATTERN(i2)) == SET

Space before opening paren (in essentially all cases).

> +  && GET_CODE(SET_DEST(PATTERN(i2))) == REG

REG_P

> + // Now we have a dead operand register, and we know where the dest dies.

Don't mix comment styles.

> + // Remove the note declaring the register as dead

Sentences end with a full stop.

> + // Overwrite i2 dest with operand1
> + rtx i2_dest = copy_rtx(operand1);
> + SUBST (SET_DEST (PATTERN (i2)), i2_dest);

That comment only confuses matters?

> + // Move the dest dead note to the new register
> + note = find_reg_note (i3, REG_DEAD, prior_reg);
> + if (note) {
> +   remove_note (i3, note);
> +   //add_reg_note (i3, REG_DEAD, op1_copy);
> + }

Please don't submit unfinished code.  If there is any reason to comment
out code, it needs a comment itself.

> +static int
> +riscv_insn_cost (rtx_insn *insn, bool speed)
> +{
> +  rtx pat = PATTERN (insn);
> +
> +  if (TARGET_RVC && !speed) {

Opening curlies are on a new line, indented:
  if (a)
    {
      blablabla ();
      b0rk ();
    }

> +if (GET_CODE(pat) == SET && GET_CODE(SET_DEST(pat)) == REG) {
> +  rtx src = SET_SRC(pat);
> +  rtx dest = SET_DEST(pat);
> +  if (GET_CODE(src) == PLUS && GET_CODE(XEXP(src, 0)) == REG && 
> REGNO(XEXP(src, 0)) == REGNO(dest)) {

Line length is 80 chars maximum.

Comparing REGNOs isn't likely correct.  There can be pseudos here, but
also hard registers.  You need to consider both cases.  The code may
well be correct, but as written it isn't obvious at all.

> + if (GET_CODE(XEXP(src, 1)) == REG)
> +   return 2;
> + else if (GET_CODE(XEXP(src, 1)) == CONST_INT && 
> CMPRESD_OPERAND(INTVAL(XEXP(src, 1
> +   return 2;

REG_P, CONST_INT_P

> +#define CMPRESD_OPERAND(VALUE) \
> +  (VALUE < 32 && VALUE >= -32)

IN_RANGE ((VALUE), -32, 31)

Using a predicate might be better?  Note you need parens around macro
params, btw.
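
To illustrate the point about parenthesizing macro parameters, here is a small self-contained sketch. `DEMO_IN_RANGE` is a hypothetical stand-in written with plain comparisons, not GCC's actual IN_RANGE definition, and the `_RAW` / `_FIXED` suffixes are made up for the comparison.

```c
#include <assert.h>

/* Unparenthesized, as in the quoted patch.  */
#define CMPRESD_OPERAND_RAW(VALUE) \
  (VALUE < 32 && VALUE >= -32)

/* Hypothetical stand-in for a range-check macro; every use of a
   parameter is parenthesized.  Note it evaluates VALUE twice.  */
#define DEMO_IN_RANGE(VALUE, LOWER, UPPER) \
  ((VALUE) >= (LOWER) && (VALUE) <= (UPPER))

#define CMPRESD_OPERAND_FIXED(VALUE) DEMO_IN_RANGE ((VALUE), -32, 31)

/* With an argument like "x & 0x3f" the raw macro expands to
   "x & 0x3f < 32 && ...", which parses as "x & (0x3f < 32)",
   i.e. "x & 0", so the check is always false.  */
int
raw_check (int x)
{
  return CMPRESD_OPERAND_RAW (x & 0x3f);
}

/* The parenthesized form tests the masked value as intended.  */
int
fixed_check (int x)
{
  return CMPRESD_OPERAND_FIXED (x & 0x3f);
}
```

A compiler may warn about the raw form (for example with -Wparentheses), which is a useful hint that the expansion is not what was meant.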

> +/* True if VALUE is an unsigned 5-bit number. */
> +#define UNSIGNED_CMPRESD_OPERAND(VALUE) \
> +  (VALUE < 64 && VALUE >= 0)

IN_RANGE ((VALUE), 0, 63)


Segher


Re: [committed] libstdc++: Move closing brace outside #endif [PR104866]

2022-03-11 Thread Detlef Vollmann

Hi Jonathan,

On 3/10/22 22:11, Jonathan Wakely wrote:


> Tested x86_64-linux, pushed to trunk.


Thanks.
With this and the other fix I was able to build the complete
libstdc++ for AVR based on AVR Libc 2.0 (plus some ad hoc
AVR header fixes) from git master 5e28be89.

And a small example with pmr::string and pmr::vector worked :-)

Thanks again,
  Detlef


Re: [committed] libstdc++: Move closing brace outside #endif [PR104866]

2022-03-11 Thread Jonathan Wakely via Gcc-patches
On Fri, 11 Mar 2022 at 14:28, Detlef Vollmann wrote:
>
> Hi Jonathan,
>
> On 3/10/22 22:11, Jonathan Wakely wrote:
>
> > Tested x86_64-linux, pushed to trunk.
>
> Thanks.
> With this and the other fix I was able to build the complete
> libstdc++ for AVR based on AVR Libc 2.0 (plus some ad hoc
> AVR header fixes) from git master 5e28be89.

Nice. I finally figured out that I need to use --enable-libstdcxx *not*
--enable-libstdc++-v3 to build for AVR, and now I get errors due to
EOVERFLOW being undefined. Is that what you fixed?

We should make that work. Arguably, all values of std::errc should
exist, even if the OS  doesn't provide a constant. We could
define the missing ones ourselves, choosing numbers > 1000 (and hope
the OS uses small numbers for its own errno values).
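
A minimal sketch of that idea, under stated assumptions: the fallback base and the concrete value are made up here (the only constraint given is "numbers > 1000"), and the helper names are hypothetical. EOVERFLOW is used because it is the constant reported missing above.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical base: values above it are reserved for constants the
   OS header does not supply.  */
#define DEMO_ERRC_FALLBACK_BASE 1000

/* Only define the constant when the OS header did not; the value
   chosen is an assumption of this sketch.  */
#ifndef EOVERFLOW
# define EOVERFLOW (DEMO_ERRC_FALLBACK_BASE + 1)
#endif

/* Nonzero if E lies in the range reserved for fallback definitions.  */
int
is_fallback_errno (int e)
{
  return e > DEMO_ERRC_FALLBACK_BASE;
}

/* The constant is usable whether or not the OS header defined it.  */
int
overflow_errno (void)
{
  return EOVERFLOW;
}
```

On a host whose header already defines EOVERFLOW the `#ifndef` branch is skipped, so the sketch does not change existing values.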

>
> And a small example with pmr::string and pmr::vector worked :-)

Great!



Re: [committed] libstdc++: Move closing brace outside #endif [PR104866]

2022-03-11 Thread Jonathan Wakely via Gcc-patches
On Fri, 11 Mar 2022 at 15:12, Jonathan Wakely wrote:
>
> On Fri, 11 Mar 2022 at 14:28, Detlef Vollmann wrote:
> >
> > Hi Jonathan,
> >
> > On 3/10/22 22:11, Jonathan Wakely wrote:
> >
> > > Tested x86_64-linux, pushed to trunk.
> >
> > Thanks.
> > With this and the other fix I was able to build the complete
> > libstdc++ for AVR based on AVR Libc 2.0 (plus some ad hoc
> > AVR header fixes) from git master 5e28be89.
>
> Nice. I finally figured out that I need to use --enable-libstdcxx *not*
> --enable-libstdc++-v3 to build for AVR, and now I get errors due to
> EOVERFLOW being undefined. Is that what you fixed?
>
> We should make that work. Arguably, all values of std::errc should
> exist, even if the OS  doesn't provide a constant. We could
> define the missing ones ourselves, choosing numbers > 1000 (and hope
> the OS uses small numbers for its own errno values).

I opened https://gcc.gnu.org/PR104883 for this.


>
> >
> > And a small example with pmr::string and pmr::vector worked :-)
>
> Great!



Re: [committed] libstdc++: Move closing brace outside #endif [PR104866]

2022-03-11 Thread Detlef Vollmann

On 3/11/22 16:12, Jonathan Wakely wrote:

> On Fri, 11 Mar 2022 at 14:28, Detlef Vollmann wrote:
>> With this and the other fix I was able to build the complete
>> libstdc++ for AVR based on AVR Libc 2.0 (plus some ad hoc
>> AVR header fixes) from git master 5e28be89.
>
> Nice. I finally figured out that I need to use --enable-libstdcxx *not*
> --enable-libstdc++-v3 to build for AVR,

Yes, I had the same problem.
A comment in the 'configure' script still says libstdc++-v3
and in the configure docs at

or

there's neither :-(


> and now I get errors due to
> EOVERFLOW being undefined. Is that what you fixed?

One of them.


> We should make that work. Arguably, all values of std::errc should
> exist, even if the OS  doesn't provide a constant. We could
> define the missing ones ourselves, choosing numbers > 1000 (and hope
> the OS uses small numbers for its own errno values).

I simply defined all that were required.

I've attached a tarball with all my header fixes.
To get them picked up while compiling libstdc++ I had to
put them into ${prefix}/avr/lib/include.

But these are really ad hoc, some of the problems I think should
be fixed in the libstdc++ sources.
E.g. I think it's wrong to expect that specific functions are
available if a respective header is available (e.g. close()
in unistd.h).

  Detlef

avr-fixups.tar.bz2
Description: Binary data


Re: [PATCH] c++: Fix ICE with non-constant satisfaction [PR98644]

2022-03-11 Thread Patrick Palka via Gcc-patches
On Thu, 10 Mar 2022, Jason Merrill wrote:

> On 3/1/22 00:10, Patrick Palka wrote:
> > On Tue, 19 Jan 2021, Jason Merrill wrote:
> > 
> > > On 1/13/21 12:05 PM, Patrick Palka wrote:
> > > > In the below testcase, the expression of the atomic constraint after
> > > > substitution is (int *) NON_LVALUE_EXPR <1> != 0B which is not a C++
> > > > constant expression, but its TREE_CONSTANT flag is set (from build2),
> > > > so satisfy_atom fails to notice that it's non-constant (and we end
> > > > up tripping over the assert in satisfaction_value).
> > > > 
> > > > Since TREE_CONSTANT doesn't necessarily correspond to C++ constantness,
> > > > this patch makes satisfy_atom instead check
> > > > is_rvalue_constant_expression.
> > > > 
> > > > Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK for
> > > > trunk/10?
> > > > 
> > > > gcc/cp/ChangeLog:
> > > > 
> > > > PR c++/98644
> > > > * constraint.cc (satisfy_atom): Check 
> > > > is_rvalue_constant_expression
> > > > instead of TREE_CONSTANT.
> > > > 
> > > > gcc/testsuite/ChangeLog:
> > > > 
> > > > PR c++/98644
> > > > * g++.dg/cpp2a/concepts-pr98644.C: New test.
> > > > ---
> > > >gcc/cp/constraint.cc  | 2 +-
> > > >gcc/testsuite/g++.dg/cpp2a/concepts-pr98644.C | 7 +++
> > > >2 files changed, 8 insertions(+), 1 deletion(-)
> > > >create mode 100644 gcc/testsuite/g++.dg/cpp2a/concepts-pr98644.C
> > > > 
> > > > diff --git a/gcc/cp/constraint.cc b/gcc/cp/constraint.cc
> > > > index 9049d087859..f99a25dc8a4 100644
> > > > --- a/gcc/cp/constraint.cc
> > > > +++ b/gcc/cp/constraint.cc
> > > > @@ -2969,7 +2969,7 @@ satisfy_atom (tree t, tree args, sat_info info)
> > > >{
> > > >  result = maybe_constant_value (result, NULL_TREE,
> > > >  /*manifestly_const_eval=*/true);
> > > > -  if (!TREE_CONSTANT (result))
> > > 
> > > This should be sufficient.  If the result isn't constant,
> > > maybe_constant_value
> > > shouldn't return it with TREE_CONSTANT set.  See
> > > 
> > > >/* This isn't actually constant, so unset TREE_CONSTANT.
> > > 
> > > in cxx_eval_outermost_constant_expr.
> > 
> > I see, so the problem seems to be that the fail-fast path of
> > maybe_constant_value isn't clearing TREE_CONSTANT sufficiently.  Would
> > it make sense to fix this like so?
> > 
> > -- >8 --
> > 
> > Subject: [PATCH] c++: ICE with non-constant satisfaction value [PR98644]
> > 
> > Here during satisfaction the expression of the atomic constraint after
> > substitution is (int *) NON_LVALUE_EXPR <1> != 0B, which is not a C++
> > constant expression due to the reinterpret_cast, but TREE_CONSTANT is
> > set since its value is otherwise effectively constant.  We then call
> > maybe_constant_value on it, which proceeds via its fail-fast path to
> > exit early without clearing TREE_CONSTANT.  But satisfy_atom relies
> > on checking TREE_CONSTANT of the result of maybe_constant_value in order
> > to detect non-constant satisfaction.
> > 
> > This patch fixes this by making the fail-fast path of maybe_constant_value
> > clear TREE_CONSTANT in this case, like cxx_eval_outermost_constant_expr
> > in the normal path would have done.
> > 
> > Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK for
> > trunk?
> > 
> > PR c++/98644
> > 
> > gcc/cp/ChangeLog:
> > 
> > * constexpr.cc (maybe_constant_value): In the fail-fast path,
> > clear TREE_CONSTANT on the result if it's set on the input.
> > 
> > gcc/testsuite/ChangeLog:
> > 
> > * g++.dg/cpp2a/concepts-pr98644.C: New test.
> > * g++.dg/parse/array-size2.C: Remove expected diagnostic about a
> > narrowing conversion.
> > ---
> >   gcc/cp/constexpr.cc   | 4 +++-
> >   gcc/testsuite/g++.dg/cpp2a/concepts-pr98644.C | 7 +++
> >   gcc/testsuite/g++.dg/parse/array-size2.C  | 2 --
> >   3 files changed, 10 insertions(+), 3 deletions(-)
> >   create mode 100644 gcc/testsuite/g++.dg/cpp2a/concepts-pr98644.C
> > 
> > diff --git a/gcc/cp/constexpr.cc b/gcc/cp/constexpr.cc
> > index 4716694cb71..234cf0acc26 100644
> > --- a/gcc/cp/constexpr.cc
> > +++ b/gcc/cp/constexpr.cc
> > @@ -7965,8 +7965,10 @@ maybe_constant_value (tree t, tree decl, bool
> > manifestly_const_eval)
> >   if (!is_nondependent_constant_expression (t))
> >   {
> > -  if (TREE_OVERFLOW_P (t))
> > +  if (TREE_OVERFLOW_P (t)
> > + || (!processing_template_decl && TREE_CONSTANT (t)))
> > {
> > + /* This isn't actually constant, so unset TREE_CONSTANT.  */
> >   t = build_nop (TREE_TYPE (t), t);
> 
> build_nop isn't appropriate for arbitrary expressions (classes, in
> particular).  We probably want to factor out the code in
> cxx_eval_outermost_constant_expr under the "this isn't actually constant"
> comment.

Gotcha, makes sense.  Like the following?  Bootstrapped and regtested on
x86_64-pc-linux-gnu.

-- >8 --

Subject: 

Re: [PATCH RFC] mips: add TARGET_ZERO_CALL_USED_REGS hook [PR104817, PR104820]

2022-03-11 Thread Qing Zhao via Gcc-patches


> On Mar 10, 2022, at 8:54 PM, Xi Ruoyao  wrote:
> 
> On Thu, 2022-03-10 at 20:31 +, Qing Zhao wrote:
> 
>>>> +  SET_HARD_REG_BIT (zeroed_hardregs, HI_REGNUM);
>>>> +  if (TEST_HARD_REG_BIT (need_zeroed_hardregs, LO_REGNUM))
>>>> +   SET_HARD_REG_BIT (zeroed_hardregs, LO_REGNUM);
>>>> +  else
>>>> +   emit_clobber (gen_rtx_REG (word_mode, LO_REGNUM));
>>> 
>>> …I don't think this conditional LO_REGNUM code is worth it.
>>> We might as well just add both registers to zeroed_hardregs.
>> 
>> If the LO_REGNUM is NOT in “need_zeroed_hardregs”, adding it to 
>> “zeroed_hardregs” seems not right to me.
>> What do you mean by “not worth it”?
> 
> It's because the MIPS port almost always treats HI as "a subreg of dword
> HI-LO register".  A direct "mthi $0" is possible but MIPS backend does
> not recognize "emit_move_insn (HI, CONST_0)”.

Why is there an “mthi $0” instruction, but NO emit_move_insn (HI, CONST_0)?
Is such a mismatch a bug? If not, why?

>  In theory it's possible
> to emit the mthi instruction explicitly here though, but we'll need to
> clear something NOT in need_zeroed_hardregs for MIPS anyway (see below).

One question here: is there a situation where only HI is cleared but LO is not
cleared?
> 
>>> Here too I think we should just do:
>>> 
>>>  zeroed_hardregs |= reg_class_contents[ST_REGS] & accessible_reg_set;
>>> 
>>> to include all available FCC registers.
>> 
>> What’s the relationship between “ST_REGs” and FCC? (sorry for the stupid 
>> question since I am not familiar with the MIPS register set).
> 
> MIPS instruction manual names the 8 one-bit floating condition codes
> FCC0, ..., FCC7, but GCC MIPS backend code names the condition codes
> ST_REG0, ..., ST_REG7.  Maybe it's better to always use the name
> "ST_REG" instead of "FCC" then.
Okay, I see.  So, each ST_REGi register is a 1-bit pseudo register? But 
physically each of them is 1-bit in a physical register?
> 
>> From the above code, looks like that when any  “ST_REGs” is in 
>> “need_zeroed_hardregs”,FCC need to be cleared? 
> 
> Because there is no elegant way to clear one specific FCC bit in MIPS. 
> A "ctc1 $0, $25" instruction will zero them altogether.  If we really
> need to clear only one of them (let's say ST_REG3), we'll have to emit
> something like
> 
> mtc1  $0, $0   # zero FPR0 to ensure it won't contain sNaN
> c.f.s $3, $0, $0
> 
> Then we'll still need to clobber FPR0 with zero.  So anyway we'll have
> to clear some registers not specified in need_zeroed_hardregs.

So, the “c.f.s” instruction can be used to clear ONLY one specific FCC bit?
But you have to clear one FPR (floating-point register?) first to avoid
raising an exception?
My question here is: is there a case where only an FCC needs to be cleared but no
FPR needs to be cleared?

If NOT, then we can always pick one FPRi before c.f.s to avoid the issue you
mentioned (we’ll have to clear some registers not specified in
need_zeroed_hardregs).
> 
> And the question is: is it really allowed to return something other than
> a subset of need_zeroed_hardregs for a TARGET_ZERO_CALL_USED_REGS hook?

Although currently there is no assertion added to enforce this requirement, I
still think that we should keep it.

The “need_zeroed_hardregs” set is computed based on:

1. the user’s request from the command-line option;
2. data flow info of the routine;
3. ABI info of the target.

If the zero_call_used_regs target hook returns registers outside the
“need_zeroed_hardregs” set, then it might go beyond the user’s expectation, and
it should be considered a bug, I think.
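
The requirement above amounts to a subset check, which can be sketched as follows. This is only a model: `demo_reg_set` is a hypothetical one-bit-per-register mask, not GCC's HARD_REG_SET, and the function names are made up.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model: one bit per hard register.  */
typedef uint64_t demo_reg_set;

/* Nonzero if every register in ZEROED was also requested in NEED.  */
int
zeroed_is_subset (demo_reg_set zeroed, demo_reg_set need)
{
  return (zeroed & ~need) == 0;
}

/* Registers that were cleared beyond what was requested.  */
demo_reg_set
extra_zeroed (demo_reg_set zeroed, demo_reg_set need)
{
  return zeroed & ~need;
}
```

An assertion of this shape after the hook returns would catch a target that reports registers nobody asked to zero.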

Qing
> If yes then we'll happily to do so (like how the v2 of the patch does),
> otherwise we'd need to clobber those registers NOT in
> need_zeroed_hardregs explicitly.
> -- 
> Xi Ruoyao 
> School of Aerospace Science and Technology, Xidian University



[PATCH][Middle-end][Backport to GCC11][PR100775]Updating the reg use in exit block for -fzero-call-used-regs

2022-03-11 Thread Qing Zhao via Gcc-patches


Hi, 

I plan to backport the patch to fix PR100775:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100775

to GCC 11, since this is a general bug in -fzero-call-used-regs and should be
fixed in GCC 11 as well.

I have tested the patch on the gcc11 release branch on both x86 and aarch64, with
no regressions.

Okay for commit?

thanks.

Qing.

==


From 737a0b0824111f46da44c5578b9769070c52 Mon Sep 17 00:00:00 2001
From: Qing Zhao 
Date: Thu, 10 Mar 2022 23:22:29 +
Subject: [PATCH] middle-end: updating the reg use in exit block for
 -fzero-call-used-regs [PR100775] GCC11 backport.

In the pass_zero_call_used_regs, when updating dataflow info after adding
the register zeroing sequence in the epilogue of the function, we should
call "df_update_exit_block_uses" to update the register use information in
the exit block to include all the registers that have been zeroed.

gcc/ChangeLog:

PR middle-end/100775
* function.c (gen_call_used_regs_seq): Call
df_update_exit_block_uses when updating df.

gcc/testsuite/ChangeLog:

PR middle-end/100775
* gcc.target/arm/pr100775.c: New test.
---
 gcc/function.c  | 2 +-
 gcc/testsuite/gcc.target/arm/pr100775.c | 9 +
 2 files changed, 10 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.target/arm/pr100775.c

diff --git a/gcc/function.c b/gcc/function.c
index fc7b147b5f1..0495e9f1b81 100644
--- a/gcc/function.c
+++ b/gcc/function.c
@@ -5922,7 +5922,7 @@ gen_call_used_regs_seq (rtx_insn *ret, unsigned int 
zero_regs_type)
 
   /* Update the data flow information.  */
   crtl->must_be_zero_on_return |= zeroed_hardregs;
-  df_set_bb_dirty (EXIT_BLOCK_PTR_FOR_FN (cfun));
+  df_update_exit_block_uses ();
 }
 }
 
diff --git a/gcc/testsuite/gcc.target/arm/pr100775.c 
b/gcc/testsuite/gcc.target/arm/pr100775.c
new file mode 100644
index 000..c648cd5e8f7
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/pr100775.c
@@ -0,0 +1,9 @@
+/* { dg-do compile } */
+/* { dg-skip-if "" { ! { arm_thumb1_ok || arm_thumb2_ok } } } */
+/* { dg-options "-mthumb -fzero-call-used-regs=used" } */
+
+int
+foo (int x)
+{
+  return x;
+}
-- 
2.27.0




Re: [PATCH] gcc: pass-manager: Fix memory leak. [PR jit/63854]

2022-03-11 Thread Marc Nieper-Wißkirchen
Hi Jeff and David,

any news on this fix?

Thanks,

Marc

On Mon, 31 Jan 2022 at 12:42, Marc Nieper-Wißkirchen wrote:
>
> Attached to this email is the patch updated to the recent renaming from *.c 
> to *.cc.
>
>
> On Sun, 23 Jan 2022 at 14:18, Marc Nieper-Wißkirchen wrote:
>>
>> On Sat, 15 Jan 2022 at 14:56, Marc Nieper-Wißkirchen wrote:
>> >
>> > Jeff, David, do you need any more input from my side?
>> >
>> > -- Marc
>> >
>> > On Sat, 8 Jan 2022 at 17:32, Jeff Law wrote:
>> > >
>> > >
>> > >
>> > > On 1/6/2022 6:53 AM, David Malcolm via Gcc-patches wrote:
>> > > > On Sun, 2021-12-19 at 22:30 +0100, Marc Nieper-Wißkirchen wrote:
>> > > >> This patch fixes a memory leak in the pass manager. In the existing
>> > > >> code,
>> > > >> the m_name_to_pass_map is allocated in
>> > > >> pass_manager::register_pass_name, but
>> > > >> never deallocated.  This is fixed by adding a deletion in
>> > > >> pass_manager::~pass_manager.  Moreover the string keys in
>> > > >> m_name_to_pass_map are
>> > > >> all dynamically allocated.  To free them, this patch adds a new hash
>> > > >> trait for
>> > > >> string hashes that are to be freed when the corresponding hash entry
>> > > >> is removed.
>> > > >>
>> > > >> This fix is particularly relevant for using GCC as a library through
>> > > >> libgccjit.
>> > > >> The memory leak also occurs when libgccjit is instructed to use an
>> > > >> external
>> > > >> driver.
>> > > >>
>> > > >> Before the patch, compiling the hello world example of libgccjit with
>> > > >> the external driver under Valgrind shows a loss of 12,611 (48 direct)
>> > > >> bytes.  After the patch, no memory leaks are reported anymore.
>> > > >> (Memory leaks occurring when using the internal driver are mostly in
>> > > >> the driver code in gcc/gcc.c and have to be fixed separately.)
>> > > >>
>> > > >> The patch has been tested by fully bootstrapping the compiler with
>> > > >> the
>> > > >> frontends C, C++, Fortran, LTO, ObjC, JIT and running the test suite
>> > > >> under a x86_64-pc-linux-gnu host.
>> > > > Thanks for the patch.
>> > > >
>> > > > It looks correct to me, given that pass_manager::register_pass_name
>> > > > does an xstrdup and puts the result in the map.
>> > > >
>> > > > That said:
>> > > > - I'm not officially a reviewer for this part of gcc (though I probably
>> > > > touched this code last)
>> > > > - is it cleaner to instead change m_name_to_pass_map's key type from
>> > > > const char * to char *, to convey that the map "owns" the name?  That
>> > > > way we probably wouldn't need struct typed_const_free_remove, and (I
>> > > > hope) works better with the type system.
>> > > >
>> > > > Dave
>> > > >
>> > > >> gcc/ChangeLog:
>> > > >>
>> > > >>  PR jit/63854
>> > > >>  * hash-traits.h (struct typed_const_free_remove): New.
>> > > >>  (struct free_string_hash): New.
>> > > >>  * pass_manager.h: Use free_string_hash.
>> > > >>  * passes.c (pass_manager::register_pass_name): Use
>> > > >> free_string_hash.
>> > > >>  (pass_manager::~pass_manager): Delete allocated
>> > > >> m_name_to_pass_map.
>> > > My concern (and what I hadn't had time to dig into) was we initially
>> > > used nofree_string_hash -- I wanted to make sure there wasn't any path
>> > > where the name came from the stack (can't be free'd), was saved
>> > > elsewhere (dangling pointer) and the like.  ie, why were we using
>> > > nofree_string_hash to begin with?  I've never really mucked around with
>> > > these bits, so the analysis side kept falling off the daily todo list.
>>
>> The only occurrences of m_name_to_pass_map are in pass_manager.h
>> (where it is defined as a private field of the class pass_manager) and
>> in passes.cc. There is just one instance where a name is added to the
>> map in passes.cc, namely through the put method. There, the name has
>> been xstrdup'ed.
>>
>> The name (as a const char *) escapes the pass map in
>> pass_manager::create_pass_tab through the call to
>> m_name_to_pass_map->traverse. This inserts the name into the pass_tab,
>> which is a static vec of const char *s. The pass_tab does not escape
>> the translation unit of passes.cc. It is used in dump_one_pass where
>> the name is used as an argument to fprintf. The important point is
>> that it is not freed and not further copied.
>>
>> > >
>> > > If/once you're comfortable with the patch David, then go ahead and apply
>> > > it on Marc's behalf.
>> > >
>> > > jeff
>> > >
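
The ownership question discussed in this thread (names are copied into the map, so the map has to free them when it is destroyed) can be sketched in plain C. This is a hypothetical fixed-size map, not GCC's hash_map or its traits; `dup_name` plays the role that xstrdup plays in the discussion.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define DEMO_MAP_CAPACITY 8

/* Hypothetical map that owns its keys.  */
struct owned_name_map
{
  char *keys[DEMO_MAP_CAPACITY];
  int values[DEMO_MAP_CAPACITY];
  size_t count;
};

/* Copy NAME so that the map owns its own storage for the key.  */
static char *
dup_name (const char *name)
{
  size_t len = strlen (name) + 1;
  char *copy = malloc (len);
  if (copy)
    memcpy (copy, name, len);
  return copy;
}

/* Insert a copy of NAME; return 0 on success, -1 if the map is full
   or memory is exhausted.  */
int
map_put (struct owned_name_map *map, const char *name, int value)
{
  if (map->count == DEMO_MAP_CAPACITY)
    return -1;
  char *copy = dup_name (name);
  if (!copy)
    return -1;
  map->keys[map->count] = copy;
  map->values[map->count] = value;
  map->count++;
  return 0;
}

/* Free every owned key; return how many keys were released.  */
size_t
map_destroy (struct owned_name_map *map)
{
  size_t freed = map->count;
  for (size_t i = 0; i < map->count; i++)
    free (map->keys[i]);
  map->count = 0;
  return freed;
}
```

Because every key is a private copy, a caller may pass a stack buffer safely, and the only remaining hazard is a pointer to a key that outlives `map_destroy`, which is the case the analysis above rules out for pass_tab.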


Re: [PATCH] Fix DImode to TImode sign extend issue, PR target/104868

2022-03-11 Thread Michael Meissner via Gcc-patches
Matheus Castanho reports that the patch I posted fixes the problem in the
104868 bug report.

-- 
Michael Meissner, IBM
PO Box 98, Ayer, Massachusetts, USA, 01432
email: meiss...@linux.ibm.com


Re: [PATCH RFC] mips: add TARGET_ZERO_CALL_USED_REGS hook [PR104817, PR104820]

2022-03-11 Thread Xi Ruoyao via Gcc-patches
On Fri, 2022-03-11 at 16:08 +, Qing Zhao wrote:

> Why is there an “mthi $0” instruction, but NO emit_move_insn (HI, CONST_0)?
> Is such a mismatch a bug? If not, why?
> 
> >  In theory it's possible
> > to emit the mthi instruction explicitly here though, but we'll need to
> > clear something NOT in need_zeroed_hardregs for MIPS anyway (see below).
> 
> One question here: is there a situation where only HI is cleared but LO is not
> cleared?

No, if I interpret the documentation of -fzero-call-used-regs and
attribute((zero_call_used_regs(...))) correctly.  A 2-reg multiplication
(or division) always sets the value of both HI and LO.  Richard has added
a comment for this in mips.cc:

> /* After a multiplication or division, clobbering HI makes
>    the value of LO unpredictable, and vice versa.  This means
>    that, for all interesting cases, HI and LO are effectively
>    a single register.
>
>    We model this by requiring that any value that uses HI
>    also uses LO.  */

This is also why the handling of emit_move_insn(HI, CONST_0) was
removed, I guess (the removal happened in the same commit adding this
comment).


> > > 
> Okay, I see.  So, each ST_REGi register is a 1-bit pseudo register?
> But physically each of them is 1-bit in a physical register?

Yes.

> > 
> > Because there is no elegant way to clear one specific FCC bit in MIPS. 
> > A "ctc1 $0, $25" instruction will zero them altogether.  If we really
> > need to clear only one of them (let's say ST_REG3), we'll have to emit
> > something like
> > 
> > mtc1  $0, $0   # zero FPR0 to ensure it won't contain sNaN
> > c.f.s $3, $0, $0
> > 
> > Then we'll still need to clobber FPR0 with zero.  So anyway we'll have
> > to clear some registers not specified in need_zeroed_hardregs.
> 
> So, the “c.f.s” instruction can be used to clear ONLY one specific FCC bit?
> But you have to clear one FPR (floating-point register?) first to avoid
> raising an exception?
> My question here is: is there a case where only an FCC needs to be cleared but no
> FPR needs to be cleared?

Yes, for example:

double a, b;

struct x
{
  double a, b;
};

struct x
f(void)
{
  struct x x =
{
  .a = a,
  .b = b
};
  if (a < b)
x.a = x.b;
  return x;
}

It does not need to zero the two FPRs, as they contain the return value.
But a FCC bit needs to be cleared.

> If NOT, then we can always pick one FPRi  before c.f.s to avoid the
> issue you mentioned (We’ll have to clear some registers not specified
> in need_zeroed_hardregs).

I'm now thinking: is there always at least one *GPR* which needs to be
cleared?  If so, let's say it is GPR $12 and fcc0 & fcc2 need to be
cleared; we can then use something like:

cfc1 $12, $25
andi $25, 5
ctc1 $12, $25
move $12, $0
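
The read-modify-write above can be modelled in C. This sketch assumes the FCC bits sit in bits 0..7 of the value that cfc1 reads from control register 25, and the function name is hypothetical. Note that clearing fcc0 and fcc2 means ANDing with the complement of the mask 5; ANDing with 5 itself would keep exactly those two bits.

```c
#include <assert.h>
#include <stdint.h>

/* Return FCCR with the condition-code bits selected by CLEAR_MASK
   cleared and the other condition-code bits preserved.  Only the low
   8 bits are modelled (one bit per FCC, an assumption of this
   sketch).  */
uint32_t
clear_fcc_bits (uint32_t fccr, uint32_t clear_mask)
{
  return fccr & ~clear_mask & 0xffu;
}
```

For fcc0 and fcc2 the mask is 5 (bits 0 and 2), so the value to AND with is 0xfa within the low byte.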

> > And the question is: is it really allowed to return something other than
> > a subset of need_zeroed_hardregs for a TARGET_ZERO_CALL_USED_REGS hook?
> 
> Although currently there is no assertion added to force this
> requirement, I still think that we should keep it.
> 
> The “need_zeroed_hardregs” is computed based on 
> 
> 1. User’s request from command line option;
> 2. Data flow info of the routine;
> 3. Abi info of the target;
> 
> If the zero_call_used_regs target hook returns registers outside the
> “need_zeroed_hardregs” set, then it might go beyond the user’s expectation,
> and it should be considered a bug, I think.

I have the same concern.  But now I'm too sleepy... Will try to improve
this tomorrow.
-- 
Xi Ruoyao 
School of Aerospace Science and Technology, Xidian University


Re: [PATCH RFC] mips: add TARGET_ZERO_CALL_USED_REGS hook [PR104817, PR104820]

2022-03-11 Thread Xi Ruoyao via Gcc-patches
On Sat, 2022-03-12 at 01:29 +0800, Xi Ruoyao via Gcc-patches wrote:

> I'm now thinking: is there always at least one *GPR* which need to be
> cleared?  If it's true, let's say GPR $12, and fcc0 & fcc2 needs to be
> cleared, we can use something like:
> 
> cfc1 $12, $25
> andi $25, 5

$12, 5.

I can't type.

> ctc1 $12, $25
> move $12, $0

-- 
Xi Ruoyao 
School of Aerospace Science and Technology, Xidian University


[Committed] Update g++.dg/other/pr84964.C for ia32 (and similar) targets.

2022-03-11 Thread Roger Sayle

The "sorry, unimplemented" message in the new g++.dg/other/pr84964.C is
apparently dependent upon whether the target passes multi-gigabyte
arguments on the stack.  This tweaks the testcase to just confirm that
it no longer ICEs, not the specific set of warnings/errors triggered.

Committed as obvious.


2022-03-11  Roger Sayle  

gcc/testsuite/ChangeLog
PR c++/84964
* g++.dg/other/pr84964.C: Tweak test to check for the ICE, not for
the (target-dependent) sorry.

Sorry for the noise.
Roger
--

diff --git a/gcc/testsuite/g++.dg/other/pr84964.C 
b/gcc/testsuite/g++.dg/other/pr84964.C
index 0f2f6f3..48cbefb 100644
--- a/gcc/testsuite/g++.dg/other/pr84964.C
+++ b/gcc/testsuite/g++.dg/other/pr84964.C
@@ -1,7 +1,7 @@
 /* { dg-do compile } */
 
 struct a {
-  short b : -1ULL;  // { dg-warning "exceeds its type" }
+  short b : -1ULL;
 };
-void c(...) { c(a()); }  // { dg-message "sorry, unimplemented" }
-
+void c(...) { c(a()); }
+// { dg-excess-errors "" }


Re: [committed] libstdc++: Move closing brace outside #endif [PR104866]

2022-03-11 Thread Jonathan Wakely via Gcc-patches
On Fri, 11 Mar 2022 at 15:35, Detlef Vollmann wrote:
>
> On 3/11/22 16:12, Jonathan Wakely wrote:
> > On Fri, 11 Mar 2022 at 14:28, Detlef Vollmann wrote:
> >> With this and the other fix I was able to build the complete
> >> libstdc++ for AVR based on AVR Libc 2.0 (plus some ad hoc
> >> AVR header fixes) from git master 5e28be89.
> >
> > Nice. I finally figured out that I need to use --enable-libstdcxx *not*
> > --enable-libstdc++-v3 to build for AVR,
> Yes, I had the same problem.
> A comment in the 'configure' script still says libstdc++-v3

Yes, I have a patch to fix that.


> and in the configure docs at
> 
> or
> 
> there's neither :-(
>
> > and now I get errors due to
> > EOVERFLOW being undefined. Is that what you fixed?
> One of them.
>
> > We should make that work. Arguably, all values of std::errc should
> > exist, even if the OS  doesn't provide a constant. We could
> > define the missing ones ourselves, choosing numbers > 1000 (and hope
> > the OS uses small numbers for its own errno values).
> I simply defined all that were required.
>
> I've attached a tarball with all my header fixes.

Thanks. Now I'm getting a build failure because libtool wasn't created
in the avr/libstdc++-v3 directory of the build tree, but I'll have to
look into that next week.

/bin/sh: ../libtool: No such file or directory



> To get them picked up while compiling libstdc++ I had to
> put them into ${prefix}/avr/lib/include.
>
> But these are really ad hoc, some of the problems I think should
> be fixed in the libstdc++ sources.
> E.g. I think it's wrong to expect that specific functions are
> available if a respective header is available (e.g. close()
> in unistd.h).

Yes, that was me being lazy.
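
The fallback idea quoted above (define the missing errno constants ourselves, using numbers > 1000) could look roughly like the sketch below. This is only an illustration, not what libstdc++ does: the value 1001 and the names sketch_errc and to_errno are made up for the example.

```cpp
#include <cerrno>

// Hypothetical fallback: if the C library does not provide EOVERFLOW,
// pick a value above 1000 and hope it does not collide with any errno
// value the OS already uses.
#ifndef EOVERFLOW
# define EOVERFLOW 1001
#endif

// Hypothetical stand-in for std::errc, so the enumerator always exists
// whether or not the OS header supplied the constant.
enum class sketch_errc { value_too_large = EOVERFLOW };

constexpr int to_errno (sketch_errc e) noexcept
{ return static_cast<int> (e); }
```

With this in place, code that refers to sketch_errc::value_too_large compiles on targets with or without a native EOVERFLOW.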



[PATCH] libstdc++: Ensure that std::from_chars is declared when supported

2022-03-11 Thread Jonathan Wakely via Gcc-patches
Patrick, I think this is right, but please take a look to double check.

I think we should fix the feature-test macro conditions for gcc-11 too,
although it's a bit more complicated there. It should depend on IEEE
float and double *and* uselocale. We don't need the other changes on the
branch.


-- >8 --

This adjusts the declarations in <charconv> to match when the definition
is present. This solves the issue that std::from_chars is present on
Solaris 11.3 (using fast_float) but was not declared in the header
(because the declarations were guarded by _GLIBCXX_HAVE_USELOCALE).

Additionally, do not define __cpp_lib_to_chars unless both from_chars
and to_chars are supported (which is only true for IEEE float and
double). We might still provide from_chars (via strtold) but if to_chars
isn't provided, we shouldn't define the feature test macro.

Finally, this simplifies some of the preprocessor checks in the bodies
of std::from_chars in src/c++17/floating_from_chars.cc and hoists the
repeated code for the strtod version into a new function template.
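
For readers of the header, a rough sketch of how user code might rely on the feature-test macro described above. The helper name parse_double and the strtod fallback are assumptions for this example only; they are not part of the patch.

```cpp
#include <charconv>
#include <cstdlib>
#include <string>
#include <string_view>
#include <system_error>

// Hypothetical helper: parse the whole of S as a double.  Uses the
// floating-point std::from_chars only when the library advertises
// __cpp_lib_to_chars, otherwise falls back to strtod.
inline bool parse_double (std::string_view s, double& out)
{
#if defined(__cpp_lib_to_chars)
  auto [ptr, ec] = std::from_chars (s.data (), s.data () + s.size (), out);
  return ec == std::errc{} && ptr == s.data () + s.size ();
#else
  // strtod needs a NUL-terminated buffer and honours the C locale.
  std::string tmp (s);
  char* end = nullptr;
  out = std::strtod (tmp.c_str (), &end);
  return !tmp.empty () && end == tmp.c_str () + tmp.size ();
#endif
}
```

Either branch rejects empty input and trailing garbage, so callers do not need to know which one was compiled in.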

libstdc++-v3/ChangeLog:

* include/std/charconv (__cpp_lib_to_chars): Only define when
both from_chars and to_chars are supported for floating-point
types.
(from_chars, to_chars): Adjust preprocessor conditions guarding
declarations.
* include/std/version (__cpp_lib_to_chars): Adjust condition to
match <charconv> definition.
* src/c++17/floating_from_chars.cc (from_chars_strtod): New
function template.
(from_chars): Simplify preprocessor checks and use
from_chars_strtod when appropriate.
---
 libstdc++-v3/include/std/charconv |   8 +-
 libstdc++-v3/include/std/version  |   3 +-
 libstdc++-v3/src/c++17/floating_from_chars.cc | 120 ++
 3 files changed, 45 insertions(+), 86 deletions(-)

diff --git a/libstdc++-v3/include/std/charconv 
b/libstdc++-v3/include/std/charconv
index a3f8c7718b2..2ce9c7d4cb9 100644
--- a/libstdc++-v3/include/std/charconv
+++ b/libstdc++-v3/include/std/charconv
@@ -43,7 +43,8 @@
 #include  // for std::errc
 #include 
 
-#if _GLIBCXX_HAVE_USELOCALE
+#if _GLIBCXX_FLOAT_IS_IEEE_BINARY32 && _GLIBCXX_DOUBLE_IS_IEEE_BINARY64 \
+&& __SIZE_WIDTH__ >= 32
 # define __cpp_lib_to_chars 201611L
 #endif
 
@@ -686,7 +687,7 @@ namespace __detail
   operator^=(chars_format& __lhs, chars_format __rhs) noexcept
   { return __lhs = __lhs ^ __rhs; }
 
-#if _GLIBCXX_HAVE_USELOCALE
+#if defined __cpp_lib_to_chars || _GLIBCXX_HAVE_USELOCALE
   from_chars_result
   from_chars(const char* __first, const char* __last, float& __value,
 chars_format __fmt = chars_format::general) noexcept;
@@ -700,8 +701,7 @@ namespace __detail
 chars_format __fmt = chars_format::general) noexcept;
 #endif
 
-#if _GLIBCXX_FLOAT_IS_IEEE_BINARY32 && _GLIBCXX_DOUBLE_IS_IEEE_BINARY64 \
-&& __SIZE_WIDTH__ >= 32
+#if defined __cpp_lib_to_chars
   // Floating-point std::to_chars
 
   // Overloads for float.
diff --git a/libstdc++-v3/include/std/version b/libstdc++-v3/include/std/version
index 461e65b5fab..d730a7ea3c7 100644
--- a/libstdc++-v3/include/std/version
+++ b/libstdc++-v3/include/std/version
@@ -171,7 +171,8 @@
 #endif
 #define __cpp_lib_shared_ptr_weak_type 201606L
 #define __cpp_lib_string_view 201803L
-#if _GLIBCXX_HAVE_USELOCALE
+#if _GLIBCXX_FLOAT_IS_IEEE_BINARY32 && _GLIBCXX_DOUBLE_IS_IEEE_BINARY64 \
+&& __SIZE_WIDTH__ >= 32
 # define __cpp_lib_to_chars 201611L
 #endif
 #define __cpp_lib_unordered_map_try_emplace 201411L
diff --git a/libstdc++-v3/src/c++17/floating_from_chars.cc 
b/libstdc++-v3/src/c++17/floating_from_chars.cc
index ba0426b3344..4aa2483bc28 100644
--- a/libstdc++-v3/src/c++17/floating_from_chars.cc
+++ b/libstdc++-v3/src/c++17/floating_from_chars.cc
@@ -65,6 +65,7 @@ extern "C" __ieee128 __strtoieee128(const char*, char**);
 && __SIZE_WIDTH__ >= 32
 # define USE_LIB_FAST_FLOAT 1
 # if __LDBL_MANT_DIG__ == __DBL_MANT_DIG__
+// No need to use strtold.
 #  undef USE_STRTOD_FOR_FROM_CHARS
 # endif
 #endif
@@ -420,6 +421,33 @@ namespace
 return true;
   }
 #endif
+
+  template<typename T>
+  from_chars_result
+  from_chars_strtod(const char* first, const char* last, T& value,
+   chars_format fmt) noexcept
+  {
+errc ec = errc::invalid_argument;
+#if _GLIBCXX_USE_CXX11_ABI
+buffer_resource mr;
+pmr::string buf(&mr);
+#else
+string buf;
+if (!reserve_string(buf))
+  return make_result(first, 0, {}, ec);
+#endif
+size_t len = 0;
+__try
+  {
+   if (const char* pat = pattern(first, last, fmt, buf)) [[likely]]
+ len = from_chars_impl(pat, value, ec);
+  }
+__catch (const std::bad_alloc&)
+  {
+   fmt = chars_format{};
+  }
+return make_result(first, len, fmt, ec);
+  }
 #endif // USE_STRTOD_FOR_FROM_CHARS
 
 #if _GLIBCXX_FLOAT_IS_IEEE_BINARY32 && _GLIBCXX_DOUBLE_IS_IEEE_BINARY64
@@ -793,35 +821,15 @@ from_chars_result
 from_chars(const char* first

[PATCH] top-level: Fix comment about --enable-libstdcxx in configure

2022-03-11 Thread Jonathan Wakely via Gcc-patches
I'm going to push this as obvious, but do I need to do anything special
to sync it with binutils, or will that happen next time somebody needs a
proper fix?

-- >8 --

The custom option for enabling/disabling libstdc++ is not spelled the
same as the directory name:

AC_ARG_ENABLE(libstdcxx,
AS_HELP_STRING([--disable-libstdcxx],
  [do not build libstdc++-v3 directory])

The comment referring to it later uses the wrong name.

ChangeLog:

* configure.ac: Fix incorrect option in comment.
* configure: Regenerate.
---
 configure| 2 +-
 configure.ac | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/configure b/configure
index 9c2d7df1bb2..f7e0fa46c9c 100755
--- a/configure
+++ b/configure
@@ -3390,7 +3390,7 @@ case "${target}" in
 esac
 
 # Disable libstdc++-v3 for some systems.
-# Allow user to override this if they pass --enable-libstdc++-v3
+# Allow user to override this if they pass --enable-libstdcxx
 if test "${ENABLE_LIBSTDCXX}" = "default" ; then
   case "${target}" in
 *-*-vxworks*)
diff --git a/configure.ac b/configure.ac
index 68cc5cc31fe..434b1a267a4 100644
--- a/configure.ac
+++ b/configure.ac
@@ -649,7 +649,7 @@ case "${target}" in
 esac
 
 # Disable libstdc++-v3 for some systems.
-# Allow user to override this if they pass --enable-libstdc++-v3
+# Allow user to override this if they pass --enable-libstdcxx
 if test "${ENABLE_LIBSTDCXX}" = "default" ; then
   case "${target}" in
 *-*-vxworks*)
-- 
2.34.1



Re: [committed] libstdc++: Move closing brace outside #endif [PR104866]

2022-03-11 Thread Detlef Vollmann

On 3/11/22 18:59, Jonathan Wakely wrote:


Thanks. Now I'm getting a build failure because libtool wasn't created
in the avr/libstdc++-v3 directory of the build tree, but I'll have to
look into that next week.

/bin/sh: ../libtool: No such file or directory


Here's my configure call:

$REPO_DIR/configure \
--prefix=$PREFIX \
--target=avr \
--enable-languages=c,c++ \
--with-dwarf2 \
--enable-multilib \
--enable-libstdcxx \
--disable-decimal-float \
--disable-libffi \
--disable-libgomp \
--disable-libmudflap \
--disable-libquadmath \
--disable-libssp \
--disable-libstdcxx-pch \
--disable-nls \
--without-included-gettext \
--disable-libstdcxx-verbose \
--disable-shared \
--disable-threads \
--disable-tls \
--disable-plugin \
--with-system-zlib \
--with-native-system-header-dir=$PREFIX/port/include \
--with-headers=yes \
--with-gnu-as \
--with-gnu-ld \
--with-avrlibc \
--with-build-time-tools=$PREFIX/lib/avr/bin

--disable-threads is probably wrong, as I definitely have threads
(any ISR counts as thread).  I added it on the (wrong) assumption
that it would build support for std::thread...

  Detlef


Re: [wwwdocs PATCH v2] gcc-12: Mention -mno-direct-extern-access

2022-03-11 Thread H.J. Lu via Gcc-patches
On Wed, Feb 16, 2022 at 5:28 AM H.J. Lu  wrote:
>
> ---
>  htdocs/gcc-12/changes.html | 4 
>  1 file changed, 4 insertions(+)
>
> diff --git a/htdocs/gcc-12/changes.html b/htdocs/gcc-12/changes.html
> index b6341fda..7d253f29 100644
> --- a/htdocs/gcc-12/changes.html
> +++ b/htdocs/gcc-12/changes.html
> @@ -399,6 +399,10 @@ a work-in-progress.
>Add CS prefix to call and jmp to indirect thunk with branch target
>in r8-r15 registers via -mindirect-branch-cs-prefix.
>
> +  Always use global offset table (GOT) to access external data and
> +  function symbols when the new -mno-direct-extern-access
> +  command-line option is specified.
> +  
>  
>
>  
> --
> 2.35.1
>

I am checking it in.

-- 
H.J.


Re: [PATCH] Fix DImode to TImode sign extend issue, PR target/104868

2022-03-11 Thread Segher Boessenkool
On Fri, Mar 11, 2022 at 01:07:29AM -0500, Michael Meissner wrote:
> Fix DImode to TImode sign extend issue, PR target/104898

> When I wrote the extendditi2 pattern, I forgot that mtvsrdd had that
> behavior so I used a 'r' constraint instead of 'b'.  In the rare case
> where the value is in GPR register 0, this split will fail.

Note that the machine instructions it would generate would work fine:
mtvsrdd X,0,Y can be used as a "mtvsrld" always.  In fact, generating
such code would be better than mtvsrdd always here.

Do you want to try that?  If not, this is okay for trunk.  Thanks!


Segher


Re: [PATCH, V3] PR target/99708- Define __SIZEOF_FLOAT128__ and __SIZEOF_IBM128__

2022-03-11 Thread Joseph Myers
The version of this patch applied to GCC 10 branch (commit 
641b407763ecfee5d4ac86d8ffe9eb1eeea5fd10) has broken the glibc build for 
powerpc64le-linux-gnu (it's fine with GCC 11 branch and master, just GCC 
10 branch is broken).
Specifically, the test program links-dso-program.cc built during the glibc 
build no longer builds, with a series of errors in libstdc++ headers such 
as:

/scratch/jmyers/glibc/many10/install/compilers/powerpc64le-linux-gnu/powerpc64le-glibc-linux-gnu/include/c++/10.3.1/type_traits:387:39:
 error: '__float128' was not declared in this scope
  387 | struct __is_floating_point_helper<__float128>
  |   ^~

So it appears that with the GCC 10 version, there is some inconsistency 
between what the compiler defines and what the headers expect.
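
To make the kind of mismatch concrete, here is a small sketch of how a header might key off the macro. It assumes that __SIZEOF_FLOAT128__ being defined implies that __float128 is usable, which is exactly the assumption that appears not to hold on the GCC 10 branch with the options below. The names widest_float and widest_float_size are made up for the example.

```cpp
#include <cstddef>

// Sketch only: select a 128-bit type when the compiler says one exists.
// If the macro is defined while the keyword is disabled (for instance by
// -mno-float128), this is the line that fails to compile.
#if defined(__SIZEOF_FLOAT128__)
using widest_float = __float128;
#else
using widest_float = long double;
#endif

constexpr std::size_t widest_float_size = sizeof (widest_float);
```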

This file in glibc is built with the following command:

powerpc64le-glibc-linux-gnu-g++ links-dso-program.cc -c 
-I/scratch/jmyers/glibc/many10/build/glibcs/powerpc64le-linux-gnu/glibc/  
-g -O2 -Wall -Wwrite-strings -Wundef -Werror -fmerge-all-constants 
-frounding-math -fno-stack-protector -fno-common -mabi=ieeelongdouble 
-Wno-psabi -mno-gnu-attribute -mlong-double-128  -fpie -mno-float128  
-I../include 
-I/scratch/jmyers/glibc/many10/build/glibcs/powerpc64le-linux-gnu/glibc/support 
 
-I/scratch/jmyers/glibc/many10/build/glibcs/powerpc64le-linux-gnu/glibc  
-I../sysdeps/unix/sysv/linux/powerpc/powerpc64/le/fpu  
-I../sysdeps/unix/sysv/linux/powerpc/powerpc64/fpu  
-I../sysdeps/unix/sysv/linux/powerpc/powerpc64/le  
-I../sysdeps/unix/sysv/linux/powerpc/powerpc64  
-I../sysdeps/unix/sysv/linux/wordsize-64  
-I../sysdeps/unix/sysv/linux/powerpc  -I../sysdeps/powerpc/nptl  
-I../sysdeps/unix/sysv/linux/include -I../sysdeps/unix/sysv/linux  
-I../sysdeps/nptl  -I../sysdeps/pthread  -I../sysdeps/gnu  
-I../sysdeps/unix/inet  -I../sysdeps/unix/sysv  -I../sysdeps/unix/powerpc  
-I../sysdeps/unix  -I../sysdeps/posix  
-I../sysdeps/powerpc/powerpc64/le/power8/fpu/multiarch  
-I../sysdeps/powerpc/powerpc64/le/power7/fpu/multiarch  
-I../sysdeps/powerpc/powerpc64/le/fpu/multiarch  
-I../sysdeps/powerpc/powerpc64/le/power8/fpu  
-I../sysdeps/powerpc/powerpc64/le/power7/fpu  
-I../sysdeps/powerpc/powerpc64/le/fpu  -I../sysdeps/powerpc/powerpc64/fpu  
-I../sysdeps/powerpc/powerpc64/le/power8/multiarch  
-I../sysdeps/powerpc/powerpc64/le/power7/multiarch  
-I../sysdeps/powerpc/powerpc64/le/multiarch  
-I../sysdeps/powerpc/powerpc64/multiarch  
-I../sysdeps/powerpc/powerpc64/le/power8  
-I../sysdeps/powerpc/powerpc64/power8  
-I../sysdeps/powerpc/powerpc64/le/power7  
-I../sysdeps/powerpc/powerpc64/power7  
-I../sysdeps/powerpc/powerpc64/power6  
-I../sysdeps/powerpc/powerpc64/power4  -I../sysdeps/powerpc/power4  
-I../sysdeps/powerpc/powerpc64/le  -I../sysdeps/powerpc/powerpc64  
-I../sysdeps/wordsize-64  -I../sysdeps/powerpc/fpu  -I../sysdeps/powerpc  
-I../sysdeps/ieee754/ldbl-128ibm-compat  
-I../sysdeps/ieee754/ldbl-128ibm/include -I../sysdeps/ieee754/ldbl-128ibm  
-I../sysdeps/ieee754/ldbl-opt  -I../sysdeps/ieee754/dbl-64  
-I../sysdeps/ieee754/flt-32  -I../sysdeps/ieee754/float128  
-I../sysdeps/ieee754  -I../sysdeps/generic  -I.. -I../libio -I.   
-D_LIBC_REENTRANT -include 
/scratch/jmyers/glibc/many10/build/glibcs/powerpc64le-linux-gnu/glibc/libc-modules.h
 
-DMODULE_NAME=nonlib -include ../include/libc-symbols.h  -DPIC 
-DTOP_NAMESPACE=glibc -o 
/scratch/jmyers/glibc/many10/build/glibcs/powerpc64le-linux-gnu/glibc/support/links-dso-program.o
 
-MD -MP -MF 
/scratch/jmyers/glibc/many10/build/glibcs/powerpc64le-linux-gnu/glibc/support/links-dso-program.o.dt
 
-MT 
/scratch/jmyers/glibc/many10/build/glibcs/powerpc64le-linux-gnu/glibc/support/links-dso-program.o

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: [PATCH, V3] PR target/99708- Define __SIZEOF_FLOAT128__ and __SIZEOF_IBM128__

2022-03-11 Thread Segher Boessenkool
On Fri, Mar 11, 2022 at 08:42:27PM +, Joseph Myers wrote:
> The version of this patch applied to GCC 10 branch (commit 
> 641b407763ecfee5d4ac86d8ffe9eb1eeea5fd10) has broken the glibc build for 
> powerpc64le-linux-gnu (it's fine with GCC 11 branch and master, just GCC 
> 10 branch is broken) 

Mike, please revert it then?


Segher


Re: [PATCH] c : Changed warning message for -Wstrict-prototypes [PR92209]

2022-03-11 Thread Joseph Myers
On Fri, 11 Mar 2022, Krishna Narayanan via Gcc-patches wrote:

> Hello,
> The following is a patch for PR92209, which gives a warning when
> the function prototype does not specify its argument type. In this
> patch there has been a change in the warning message displayed for
> -Wstrict-prototypes to specify its argument types. I have also added
> a testcase for it.
> Regtested on x86_64, OK for commit? Please do review it.

Why do you think your proposed wording is better than the existing 
wording?  I think the existing wording is accurate and the proposed 
wording is inaccurate - "must specify the argument types" is not an 
accurate description of any requirement in the C language, using "must" at 
all generally seems questionable in the wording of a warning message.

Also, I don't think this change is anything to do with the PR you mention 
("Imprecise column number for -Wstrict-prototypes"), so it's wrong to 
mention that PR number in the proposed commit message.

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: [PATCH, V3] PR target/99708- Define __SIZEOF_FLOAT128__ and __SIZEOF_IBM128__

2022-03-11 Thread Jakub Jelinek via Gcc-patches
On Fri, Mar 11, 2022 at 02:51:23PM -0600, Segher Boessenkool wrote:
> On Fri, Mar 11, 2022 at 08:42:27PM +, Joseph Myers wrote:
> > The version of this patch applied to GCC 10 branch (commit 
> > 641b407763ecfee5d4ac86d8ffe9eb1eeea5fd10) has broken the glibc build for 
> > powerpc64le-linux-gnu (it's fine with GCC 11 branch and master, just GCC 
> > 10 branch is broken) 
> 
> Mike, please revert it then?

Preferably also the GCC 11 commit, because otherwise it needs backports
of the follow-up changes too.

Jakub



Re: [PATCH, V3] PR target/99708- Define __SIZEOF_FLOAT128__ and __SIZEOF_IBM128__

2022-03-11 Thread Segher Boessenkool
On Fri, Mar 11, 2022 at 09:57:50PM +0100, Jakub Jelinek wrote:
> On Fri, Mar 11, 2022 at 02:51:23PM -0600, Segher Boessenkool wrote:
> > On Fri, Mar 11, 2022 at 08:42:27PM +, Joseph Myers wrote:
> > > The version of this patch applied to GCC 10 branch (commit 
> > > 641b407763ecfee5d4ac86d8ffe9eb1eeea5fd10) has broken the glibc build for 
> > > powerpc64le-linux-gnu (it's fine with GCC 11 branch and master, just GCC 
> > > 10 branch is broken) 
> > 
> > Mike, please revert it then?
> 
> Preferably also the GCC 11 commit, because otherwise it needs backports
> of the follow-up changes too.

Good point.  Yes please.


Segher


Re: [PATCH RFC] mips: add TARGET_ZERO_CALL_USED_REGS hook [PR104817, PR104820]

2022-03-11 Thread Qing Zhao via Gcc-patches
Hi, Ruoyao,

(I might not be able to reply to this thread till next Wed due to a short 
vacation).

First, some comments on opening bugs against GCC:

I took a look at the bug reports PR104817 and PR104820:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104820
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104817

I didn’t see a test case or a script to reproduce the error, so I cannot 
reproduce it on my side.
So, in order for other people to help study the bug, you need to provide a 
test case and
a script or a detailed description of how to reproduce the bug. 

In addition to that, it will be helpful if you can add more details on 
the root cause of the
issue, based on your study, to the comment section of the bug report. 


> On Mar 11, 2022, at 11:29 AM, Xi Ruoyao  wrote:
> 
> On Fri, 2022-03-11 at 16:08 +, Qing Zhao wrote:
> 
>> Why there is “mthi $0” instruction, but there is NO emit_move_insn(HI, 
>> CONST_0)?
>> Is such mismatch a bug? If not, why? 
>> 
>>>  In theory it's possible
>>> to emit the mthi instruction explicitly here though, but we'll need to
>>> clear something NOT in need_zeroed_hardregs for MIPS anyway (see below).
>> 
>> One question here,  is there situation when only HI is cleared but LO is not 
>> cleared?
> 
> No, if I interpret the documentation of -fzero-call-used-regs and
> __attribute__((zero_call_used_regs(...))) correctly.  A 2-reg multiplication
> (or division) always sets the value of both HI and LO.  Richard has added
> a comment for this in mips.cc:
> 
>>   /* After a multiplication or division, clobbering HI makes
>>      the value of LO unpredictable, and vice versa.  This means
>>      that, for all interesting cases, HI and LO are effectively
>>      a single register.
>>
>>      We model this by requiring that any value that uses HI
>>      also uses LO.  */
> 
> This is also why the handling of emit_move_insn(HI, CONST_0) was
> removed, I guess (the removal happened in the same commit adding this
> comment).

Okay. I see. 
Then the current handling for HI_REGNUM is reasonable. I suggest adding an 
assertion inside the handling of HI_REGNUM, with a proper comment:

gcc_assert (TEST_HARD_REG_BIT (need_zeroed_hardregs, LO_REGNUM));

to catch any unexpected bug. 

Richard, what’s your opinion on this?
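
As a standalone illustration of the invariant the assertion would check (not GCC code): std::bitset stands in for HARD_REG_SET below, and the register numbers are placeholders rather than the real MIPS values.

```cpp
#include <bitset>

// Placeholder register numbers, for illustration only.
constexpr int HI_REGNUM = 64;
constexpr int LO_REGNUM = 65;

// Stand-in for GCC's HARD_REG_SET.
using reg_set = std::bitset<128>;

// The invariant: whenever HI is in the set to be zeroed, LO is too.
inline bool hi_lo_consistent (const reg_set& need_zeroed)
{
  return !need_zeroed.test (HI_REGNUM) || need_zeroed.test (LO_REGNUM);
}
```

A set containing only LO, or both HI and LO, satisfies the check; a set containing HI alone does not, which is the case the assertion is meant to catch.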

> 
> 
 
>> Okay, I see.  So, each ST_REGi register is a 1-bit pseudo register?
>> But physically each of them is 1-bit in a physical register?
> 
> Yes.
> 
>>> 
>>> Because there is no elegant way to clear one specific FCC bit in MIPS. 
>>> A "ctc1 $0, $25" instruction will zero them altogether.  If we really
>>> need to clear only one of them (let's say ST_REG3), we'll have to emit
>>> something like
>>> 
>>> mtc1  $0, $0   # zero FPR0 to ensure it won't contain sNaN
>>> c.f.s $3, $0, $0
>>> 
>>> Then we'll still need to clobber FPR0 with zero.  So anyway we'll have
>>> to clear some registers not specified in need_zeroed_hardregs.
>> 
>> So, “c.f.s” instruction can be used to clear ONLY one specific FCC bit? 
>> But you have to clear one FPR (floating pointer register?) first to avoid 
>> raising exception? 
>> My question here is:  is there a case when only FCC need to be cleared but 
>> no FPR need to be cleared? 
> 
> Yes, for example:
> 
> double a, b;
> 
> struct x
> {
>  double a, b;
> };
> 
> struct x
> f(void)
> {
>  struct x x =
>{
>  .a = a,
>  .b = b
>};
>  if (a < b)
>x.a = x.b;
>  return x;
> }
> 
> It does not need to zero the two FPRs, as they contain the return value.
> But a FCC bit needs to be cleared.
Okay.

> 
>> If NOT, then we can always pick one FPRi  before c.f.s to avoid the
>> issue you mentioned (We’ll have to clear some registers not specified
>> in need_zeroed_hardregs).
> 
> I'm now thinking: is there always at least one *GPR* which need to be
> cleared?  If it's true, let's say GPR $12, and fcc0 & fcc2 needs to be
> cleared, we can use something like:

So, you mean that in order to set one FCC bit to zero, we have to set another GPR 
or FPR to zero first? Otherwise an error might occur? 
Why? (This seems unreasonable to me.) Am I missing anything here?

Qing
> cfc1 $12, $25
> andi $25, 5
> ctc1 $12, $25
> move $12, $0
> 
>>> And the question is: is it really allowed to return something other than
>>> a subset of need_zeroed_hardregs for a TARGET_ZERO_CALL_USED_REGS hook?
>> 
>> Although currently there is no assertion added to force this
>> requirement, I still think that we should keep it.
>> 
>> The “need_zeroed_hardregs” is computed based on 
>> 
>> 1. User’s request from command line option;
>> 2. Data flow info of the routine;
>> 3. Abi info of the target;
>> 
>> If the zero_call_used_regs target hook returns registers outside the
>> “need_zeroed_hardregs” set, then it might be outside the user’s expectation;
>> it should be considered a bug, I think.
> 
> I have the same concern.  But now I'm too sleepy... Will try to improve
> this tomorrow.
> -- 
> Xi Ruoyao 
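
For reference, the read-modify-write idea in the quoted cfc1/ctc1 sequence can be modelled in plain C++ as below. This is only a sketch: the control register is treated as an ordinary integer, the real placement of the FCC bits in the MIPS control register is not modelled, and the name clear_cc_bits is made up. A mask of 5 selects bits 0 and 2, matching the fcc0 and fcc2 case above.

```cpp
#include <cstdint>

// Clear the condition-code bits selected by MASK and leave every other
// bit unchanged.  Note the complement: the bits to clear are the ones
// set in MASK.
constexpr std::uint32_t
clear_cc_bits (std::uint32_t fccr, std::uint32_t mask) noexcept
{
  return fccr & ~mask;
}
```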

[PATCH] rs6000: Do not use rs6000_cpu for .machine ppc and ppc64 (PR104829)

2022-03-11 Thread Segher Boessenkool
Fixes: 77eccbf39ed5

rs6000.h has
  #define PROCESSOR_POWERPC   PROCESSOR_PPC604
  #define PROCESSOR_POWERPC64 PROCESSOR_RS64A
which means that if you use things like  -mcpu=powerpc -mvsx  it will no
longer work after my latest .machine patch.  This causes GCC build errors
in some cases, not a good idea (even if the errors are actually
pre-existing: using -mvsx with a machine that does not have VSX cannot
work properly).
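
A reduced model of the aliasing problem, in case it helps. The enum is a simplified stand-in for the real processor_type and the function name is made up; only the two #defines mirror rs6000.h as quoted above.

```cpp
// Simplified stand-in for the rs6000 processor enumeration.
enum processor_type { PROCESSOR_PPC604, PROCESSOR_RS64A, PROCESSOR_POWER9 };

// As in rs6000.h: the "generic" names are just aliases.
#define PROCESSOR_POWERPC   PROCESSOR_PPC604
#define PROCESSOR_POWERPC64 PROCESSOR_RS64A

// A check written against the generic name cannot tell "generic PowerPC"
// apart from "really a 604": both compare equal.
constexpr bool looks_generic_32bit (processor_type cpu) noexcept
{
  return cpu == PROCESSOR_POWERPC;
}
```

So any decision keyed on rs6000_cpu alone treats -mcpu=powerpc exactly like -mcpu=604, whatever other flags were given.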

Will commit later today (if it regstraps fine :-) )


Segher


2022-03-11  Segher Boessenkool  

PR target/104829
* config/rs6000/rs6000.cc (rs6000_machine_from_flags): Don't output
"ppc" and "ppc64" based on rs6000_cpu.

---
 gcc/config/rs6000/rs6000.cc | 12 ++--
 1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc
index 3afe78f5d049..5ebe19022473 100644
--- a/gcc/config/rs6000/rs6000.cc
+++ b/gcc/config/rs6000/rs6000.cc
@@ -5804,20 +5804,28 @@ rs6000_machine_from_flags (void)
   if (rs6000_cpu == PROCESSOR_MPCCORE)
 return "\"821\"";
 
+#if 0
+  /* This (and ppc64 below) are disabled here (for now at least) because
+ TARGET_POWERPC, TARGET_POWERPC64, and TARGET_COMMON are #define'd as
+ some of these.  Untangling that is a job for later.  */
+
   /* 600 series and 700 series, "classic" */
   if (rs6000_cpu == PROCESSOR_PPC601 || rs6000_cpu == PROCESSOR_PPC603
   || rs6000_cpu == PROCESSOR_PPC604 || rs6000_cpu == PROCESSOR_PPC604e
-  || rs6000_cpu == PROCESSOR_PPC750 || rs6000_cpu == PROCESSOR_POWERPC)
+  || rs6000_cpu == PROCESSOR_PPC750)
 return "ppc";
+#endif
 
   /* Classic with AltiVec, "G4" */
   if (rs6000_cpu == PROCESSOR_PPC7400 || rs6000_cpu == PROCESSOR_PPC7450)
 return "\"7450\"";
 
+#if 0
   /* The older 64-bit CPUs */
   if (rs6000_cpu == PROCESSOR_PPC620 || rs6000_cpu == PROCESSOR_PPC630
-  || rs6000_cpu == PROCESSOR_RS64A || rs6000_cpu == PROCESSOR_POWERPC64)
+  || rs6000_cpu == PROCESSOR_RS64A)
 return "ppc64";
+#endif
 
   HOST_WIDE_INT flags = rs6000_isa_flags;
 
-- 
1.8.3.1



Re: [PATCH v7] c++: Add diagnostic when operator= is used as truth cond [PR25689]

2022-03-11 Thread Jason Merrill via Gcc-patches

On 2/17/22 23:30, Zhao Wei Liew wrote:

On Fri, 18 Feb 2022 at 08:32, Zhao Wei Liew  wrote:



+/* Test non-empty class */
+void f2(B b1, B b2)
+{
+ if (b1 = 0); /* { dg-warning "suggest parentheses" } */
+ if (b1 = 0.); /* { dg-warning "suggest parentheses" } */
+ if (b1 = b2); /* { dg-warning "suggest parentheses" } */
+ if (b1.operator= (0));
+
+ /* Ideally, we wouldn't warn for non-empty classes using trivial
+  operator= (below), but we currently do as it is a MODIFY_EXPR. */
+ // if (b1.operator= (b2));


You can avoid it by calling suppress_warning on that MODIFY_EXPR in
build_over_call.


Unfortunately, that also affects the warning for if (b1 = b2) just 5
lines above. Both expressions seem to generate the same tree structure.


True, you would need to put the call to suppress_warning in build_new_op
around where CALL_EXPR_OPERATOR_SYNTAX is set.


It seems like that would suppress the warning for the case of if (b1 = b2) 
instead of
if (b1.operator= (b2)). Do you mean to add the call to suppress_warning
in build_method_call instead?

This is what I've tried so far:

1. Call suppress_warning (result, ...) in the trivial_fn_p block in 
build_new_op,
right above the comment "There won't be a CALL_EXPR" (line 6699).
This suppresses the warning for if (b1 = b2) but not for if (b1.operator= 
(b2)).

2. Call suppress_warning (result, ...) in build_method_call, right after the 
call to
 build_over_call (line 11141). This suppresses the warning for if 
(b1.operator= (b2))
 and not if (b1 = b2).

Based on this, I think the 2nd option might be what we want here? Please 
correct me if I'm
wrong. I'm also unsure if there are issues that might arise with this change.


To better illustrate the 2nd option, I've attached it as a patch v8.
How does it look?


It looks good, but unfortunately regresses some other warning tests, 
such as Wnonnull5.C.  Please remember to run the regression tests before 
sending a patch (https://gcc.gnu.org/contribute.html#testing).


This seems to be a complicated problem with suppress_warning, which 
means your call to suppress_warning effectively silences all later 
warnings, not just -Wparentheses.


You should be able to work around this issue by only calling 
suppress_warning in the specific case we're interested in, i.e. when 
warn_parentheses is enabled and "call" is a MODIFY_EXPR.
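
For anyone following along without the testcase, here is a reduced illustration of the two spellings being discussed. The struct B below is made up for this sketch and is not the class from Wparentheses-31.C; whether a particular GCC warns on each form depends on the patch version.

```cpp
// Hypothetical class with a user-provided operator= and a bool conversion.
struct B
{
  int v = 0;
  B& operator= (int x) { v = x; return *this; }
  explicit operator bool () const { return v != 0; }
};

// Operator syntax: the extra parentheses mark the assignment as
// intentional, the usual way to silence -Wparentheses.
inline bool assign_then_test (B& b, int x)
{
  if ((b = x))
    return true;
  return false;
}

// Explicit call syntax: the form the patch tries not to warn about.
inline bool explicit_call (B& b, int x)
{
  if (b.operator= (x))
    return true;
  return false;
}
```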



v7: https://gcc.gnu.org/pipermail/gcc-patches/2022-February/590464.html
Changes since v7:
1. Suppress -Wparentheses warnings in build_new_method_call.
2. Uncomment the test case for if (b1.operator= (b2)).

v6: https://gcc.gnu.org/pipermail/gcc-patches/2022-February/590419.html
Changes since v6:
1. Check for error_mark_node in is_assignment_op_expr_p.
2. Change "c:" to "c++:".

v5: https://gcc.gnu.org/pipermail/gcc-patches/2022-February/590393.html
Changes since v5:
1. Revert changes in v4.
2. Replace gcc_assert with a return NULL_TREE in extract_call_expr.

v4: https://gcc.gnu.org/pipermail/gcc-patches/2022-February/590379.html
Changes since v4:
1. Refactor the non-assert-related code out of extract_call_expr and
call that function instead to check for call expressions.

v3: https://gcc.gnu.org/pipermail/gcc-patches/2022-February/590310.html
Changes since v3:
1. Also handle COMPOUND_EXPRs and TARGET_EXPRs.

v2: https://gcc.gnu.org/pipermail/gcc-patches/2022-February/590236.html
Changes since v2:
1. Add more test cases in Wparentheses-31.C.
2. Refactor added logic to a function (is_assignment_overload_ref_p).
3. Use REFERENCE_REF_P instead of INDIRECT_REF_P.

v1: https://gcc.gnu.org/pipermail/gcc-patches/2022-February/590158.html
Changes since v1:
1. Use CALL_EXPR_OPERATOR_SYNTAX to avoid warnings for explicit
operator=() calls.
2. Use INDIRECT_REF_P to filter implicit operator=() calls.
3. Use cp_get_callee_fndecl_nofold.
4. Add spaces before (.




Re: [PATCH] c++: Fix ICE with non-constant satisfaction [PR98644]

2022-03-11 Thread Jason Merrill via Gcc-patches

On 3/11/22 11:46, Patrick Palka wrote:

On Thu, 10 Mar 2022, Jason Merrill wrote:


On 3/1/22 00:10, Patrick Palka wrote:

On Tue, 19 Jan 2021, Jason Merrill wrote:


On 1/13/21 12:05 PM, Patrick Palka wrote:

In the below testcase, the expression of the atomic constraint after
substitution is (int *) NON_LVALUE_EXPR <1> != 0B which is not a C++
constant expression, but its TREE_CONSTANT flag is set (from build2),
so satisfy_atom fails to notice that it's non-constant (and we end
up tripping over the assert in satisfaction_value).

Since TREE_CONSTANT doesn't necessarily correspond to C++ constantness,
this patch makes satisfy_atom instead check
is_rvalue_constant_expression.

Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK for
trunk/10?

gcc/cp/ChangeLog:

PR c++/98644
* constraint.cc (satisfy_atom): Check is_rvalue_constant_expression
instead of TREE_CONSTANT.

gcc/testsuite/ChangeLog:

PR c++/98644
* g++.dg/cpp2a/concepts-pr98644.C: New test.
---
gcc/cp/constraint.cc  | 2 +-
gcc/testsuite/g++.dg/cpp2a/concepts-pr98644.C | 7 +++
2 files changed, 8 insertions(+), 1 deletion(-)
create mode 100644 gcc/testsuite/g++.dg/cpp2a/concepts-pr98644.C

diff --git a/gcc/cp/constraint.cc b/gcc/cp/constraint.cc
index 9049d087859..f99a25dc8a4 100644
--- a/gcc/cp/constraint.cc
+++ b/gcc/cp/constraint.cc
@@ -2969,7 +2969,7 @@ satisfy_atom (tree t, tree args, sat_info info)
{
  result = maybe_constant_value (result, NULL_TREE,
 /*manifestly_const_eval=*/true);
-  if (!TREE_CONSTANT (result))


This should be sufficient.  If the result isn't constant,
maybe_constant_value
shouldn't return it with TREE_CONSTANT set.  See


/* This isn't actually constant, so unset TREE_CONSTANT.


in cxx_eval_outermost_constant_expr.


I see, so the problem seems to be that the fail-fast path of
maybe_constant_value isn't clearing TREE_CONSTANT sufficiently.  Would
it make sense to fix this like so?

-- >8 --

Subject: [PATCH] c++: ICE with non-constant satisfaction value [PR98644]

Here during satisfaction the expression of the atomic constraint after
substitution is (int *) NON_LVALUE_EXPR <1> != 0B, which is not a C++
constant expression due to the reinterpret_cast, but TREE_CONSTANT is
set since its value is otherwise effectively constant.  We then call
maybe_constant_value on it, which proceeds via its fail-fast path to
exit early without clearing TREE_CONSTANT.  But satisfy_atom relies
on checking TREE_CONSTANT of the result of maybe_constant_value in order
to detect non-constant satisfaction.

This patch fixes this by making the fail-fast path of maybe_constant_value
clear TREE_CONSTANT in this case, like cxx_eval_outermost_constant_expr
in the normal path would have done.

Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK for
trunk?

PR c++/98644

gcc/cp/ChangeLog:

* constexpr.cc (maybe_constant_value): In the fail-fast path,
clear TREE_CONSTANT on the result if it's set on the input.

gcc/testsuite/ChangeLog:

* g++.dg/cpp2a/concepts-pr98644.C: New test.
* g++.dg/parse/array-size2.C: Remove expected diagnostic about a
narrowing conversion.
---
   gcc/cp/constexpr.cc   | 4 +++-
   gcc/testsuite/g++.dg/cpp2a/concepts-pr98644.C | 7 +++
   gcc/testsuite/g++.dg/parse/array-size2.C  | 2 --
   3 files changed, 10 insertions(+), 3 deletions(-)
   create mode 100644 gcc/testsuite/g++.dg/cpp2a/concepts-pr98644.C

diff --git a/gcc/cp/constexpr.cc b/gcc/cp/constexpr.cc
index 4716694cb71..234cf0acc26 100644
--- a/gcc/cp/constexpr.cc
+++ b/gcc/cp/constexpr.cc
@@ -7965,8 +7965,10 @@ maybe_constant_value (tree t, tree decl, bool
manifestly_const_eval)
   if (!is_nondependent_constant_expression (t))
   {
-  if (TREE_OVERFLOW_P (t))
+  if (TREE_OVERFLOW_P (t)
+ || (!processing_template_decl && TREE_CONSTANT (t)))
{
+ /* This isn't actually constant, so unset TREE_CONSTANT.  */
  t = build_nop (TREE_TYPE (t), t);


build_nop isn't appropriate for arbitrary expressions (classes, in
particular).  We probably want to factor out the code in
cxx_eval_outermost_constant_expr under the "this isn't actually constant"
comment.


Gotcha, makes sense.  Like the following?  Bootstrapped and regtested on
x86_64-pc-linux-gnu.


OK.


-- >8 --

Subject: [PATCH] c++: ICE with non-constant satisfaction value [PR98644]

Here during satisfaction, the expression of the atomic constraint after
substitution is (int *) NON_LVALUE_EXPR <1> != 0B, which is not a C++
constant expression due to the reinterpret_cast, but TREE_CONSTANT is
set since its value is otherwise effectively constant.  We then call
maybe_constant_value on it, which proceeds via its fail-fast path to
exit early without clearing TREE_CONSTANT.  But satisfy_atom relies
on checking TREE_CONSTANT 

Re: [PATCH] c++: return-type-req in constraint using only outer tparms [PR104527]

2022-03-11 Thread Jason Merrill via Gcc-patches

On 3/10/22 16:57, Patrick Palka wrote:


On Thu, 10 Mar 2022, Jason Merrill wrote:


On 2/16/22 15:56, Patrick Palka wrote:

On Tue, 15 Feb 2022, Jason Merrill wrote:


On 2/14/22 11:32, Patrick Palka wrote:

Here the template context for the atomic constraint has two levels of
template arguments, but since it depends only on the innermost argument
T we use a single-level argument vector during substitution into the
constraint (built by get_mapped_args).  We eventually pass this vector
to do_auto_deduction as part of checking the return-type-requirement
inside the atom, but do_auto_deduction expects outer_targs to be a full
set of arguments for sake of satisfaction.


Could we note the current number of levels in the map and use that in
get_mapped_args instead of the highest level parameter we happened to use?


Ah yeah, that seems to work nicely.  IIUC it should suffice to remember
whether the atomic constraint expression came from a concept definition.
If it did, then the depth of the argument vector returned by
get_mapped_args must be one, otherwise (as in the testcase) it must be
the same as the template depth of the constrained entity, which is the
depth of ARGS.

How does the following look?  Bootstrapped and regtested on
x86_64-pc-linux-gnu and also on cmcstl2 and range-v3.

-- >8 --

Subject: [PATCH] c++: return-type-req in constraint using only outer tparms
   [PR104527]

Here the template context for the atomic constraint has two levels of
template parameters, but since it depends only on the innermost parameter
T we use a single-level argument vector (built by get_mapped_args) during
substitution into the atom.  We eventually pass this vector to
do_auto_deduction as part of checking the return-type-requirement within
the atom, but do_auto_deduction expects outer_targs to be a full set of
arguments for sake of satisfaction.

This patch fixes this by making get_mapped_args always return an
argument vector whose depth corresponds to the template depth of the
context in which the atomic constraint expression was written, instead
of the highest parameter level that the expression happens to use.

PR c++/104527

gcc/cp/ChangeLog:

* constraint.cc (normalize_atom): Set
ATOMIC_CONSTR_EXPR_FROM_CONCEPT_P appropriately.
(get_mapped_args):  Make static, adjust parameters.  Always
return a vector whose depth corresponds to the template depth of
the context of the atomic constraint expression.  Micro-optimize
by passing false as exact to safe_grow_cleared and by collapsing
a multi-level depth-one argument vector.
(satisfy_atom): Adjust call to get_mapped_args and
diagnose_atomic_constraint.
(diagnose_atomic_constraint): Replace map parameter with an args
parameter.
* cp-tree.h (ATOMIC_CONSTR_EXPR_FROM_CONCEPT_P): Define.
(get_mapped_args): Remove declaration.

gcc/testsuite/ChangeLog:

* g++.dg/cpp2a/concepts-return-req4.C: New test.
---
   gcc/cp/constraint.cc  | 64 +++
   gcc/cp/cp-tree.h  |  7 +-
   .../g++.dg/cpp2a/concepts-return-req4.C   | 24 +++
   3 files changed, 69 insertions(+), 26 deletions(-)
   create mode 100644 gcc/testsuite/g++.dg/cpp2a/concepts-return-req4.C

diff --git a/gcc/cp/constraint.cc b/gcc/cp/constraint.cc
index 12db7e5cf14..306e28955c6 100644
--- a/gcc/cp/constraint.cc
+++ b/gcc/cp/constraint.cc
@@ -764,6 +764,8 @@ normalize_atom (tree t, tree args, norm_info info)
 tree ci = build_tree_list (t, info.context);
   tree atom = build1 (ATOMIC_CONSTR, ci, map);
+  if (info.in_decl && concept_definition_p (info.in_decl))
+ATOMIC_CONSTR_EXPR_FROM_CONCEPT_P (atom) = true;


I'm a bit nervous about relying on in_decl, given that we support normalizing
when it isn't set; I don't remember the circumstances for that.  Maybe make
the flag indicate that ctx_parms had depth 1?


in_decl gets reliably updated by norm_info::update_context whenever we
recurse inside a concept-id during normalization.  And I think the only
other situation we have to worry about is when starting out with a
concept-id, which is handled by normalize_concept_definition where we
also set in_decl appropriately.

AFAICT, in_decl is not set (at the start) only when normalizing a
placeholder type constraint or nested-requirement, and from some
subsumption entrypoints.  And we shouldn't see an atom that belongs to a
concept in these cases unless we recurse into a concept-id, in which
case norm_info::update_context will update in_decl appropriately.

So IMHO it should be safe to rely on in_decl here to detect if the atom
belongs to a concept, at least given the current entrypoints to
subsumption/satisfaction.


Sounds good; please put a bit of that explanation in a comment where you 
set the flag.  OK with that change.





 if (!info.generate_diagnostics ())
   {
 /* Cache the ATOMIC_CONSTRs that we ret

[r12-7616 Regression] FAIL: libitm.c/memset-1.c (test for excess errors) on Linux/x86_64

2022-03-11 Thread sunil.k.pandey via Gcc-patches
On Linux/x86_64,

251ea6dfbdb4448875e41081682bb3aa451b5729 is the first bad commit
commit 251ea6dfbdb4448875e41081682bb3aa451b5729
Author: Roger Sayle 
Date:   Fri Mar 11 17:57:12 2022 +

PR tree-optimization/98335: New peephole2 xorl;movb -> movzbl

caused

FAIL: libitm.c/memcpy-1.c (internal compiler error: in extract_insn, at 
recog.cc:2769)
FAIL: libitm.c/memcpy-1.c (test for excess errors)
FAIL: libitm.c/memset-1.c (internal compiler error: in extract_insn, at 
recog.cc:2769)
FAIL: libitm.c/memset-1.c (test for excess errors)

with GCC configured with

../../gcc/configure 
--prefix=/local/skpandey/gccwork/toolwork/gcc-bisect-master/master/r12-7616/usr 
--enable-clocale=gnu --with-system-zlib --with-demangler-in-ld 
--with-fpmath=sse --enable-languages=c,c++,fortran --enable-cet --without-isl 
--enable-libmpx x86_64-linux --disable-bootstrap

To reproduce:

$ cd {build_dir}/x86_64-linux/libitm/testsuite && make check 
RUNTESTFLAGS="c.exp=libitm.c/memcpy-1.c --target_board='unix{-m32}'"
$ cd {build_dir}/x86_64-linux/libitm/testsuite && make check 
RUNTESTFLAGS="c.exp=libitm.c/memcpy-1.c --target_board='unix{-m32\ 
-march=cascadelake}'"
$ cd {build_dir}/x86_64-linux/libitm/testsuite && make check 
RUNTESTFLAGS="c.exp=libitm.c/memset-1.c --target_board='unix{-m32}'"
$ cd {build_dir}/x86_64-linux/libitm/testsuite && make check 
RUNTESTFLAGS="c.exp=libitm.c/memset-1.c --target_board='unix{-m32\ 
-march=cascadelake}'"

(Please do not reply to this email.  For questions about this report, contact
me at skpgkp2 at gmail dot com)


[committed] d: Cache generated import declarations in a hash_map

2022-03-11 Thread Iain Buclaw via Gcc-patches
Hi,

This patch refactors the ImportVisitor to cache the generated result
decl in a hash_map.  Originally, these were cached in the front-end AST
node field `isym'.  However, this field is soon to be removed.
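
The caching scheme can be illustrated outside of GCC.  The following is
only a minimal sketch, assuming std::unordered_map as a stand-in for
GCC's hash_map; Node, Decl, build_decl_for and cached_decl are
hypothetical names that do not appear in the patch.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical stand-ins for a front-end AST node and a back-end decl.
struct Node { std::string name; };
struct Decl { std::string name; };

// Side table keyed by node address; it replaces a cache field on the node.
static std::unordered_map<const Node *, Decl *> cache;
static std::vector<std::unique_ptr<Decl>> storage;  // owns the decls
static int builds = 0;                              // counts real builds

// The expensive construction step (illustrative only).
static Decl *build_decl_for (const Node &n)
{
  ++builds;
  storage.push_back (std::unique_ptr<Decl> (new Decl{n.name}));
  return storage.back ().get ();
}

// Consult the cache first; build and remember the result on a miss.
Decl *cached_decl (const Node &n)
{
  auto it = cache.find (&n);
  if (it != cache.end ())
    return it->second;
  Decl *d = build_decl_for (n);
  cache.emplace (&n, d);
  return d;
}
```

Calling cached_decl twice on the same node returns the same pointer and
builds only once, and the node type no longer needs a field reserved for
the back end.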

Bootstrapped and regression tested on x86_64-linux-gnu/m32/mx32, and
committed to mainline.

Regards,
Iain.

---
gcc/d/ChangeLog:

* imports.cc (imported_decls): Define.
(class ImportVisitor): Add result_ field.
(ImportVisitor::result): New method.
(ImportVisitor::visit (Module *)): Store decl to result_.
(ImportVisitor::visit (Import *)): Likewise.
(ImportVisitor::visit (AliasDeclaration *)): Don't cache decl in
front-end AST node.
(ImportVisitor::visit (OverDeclaration *)): Likewise.
(ImportVisitor::visit (FuncDeclaration *)): Likewise.
(ImportVisitor::visit (Declaration *)): Likewise.
(build_import_decl): Use imported_decls to cache and lookup built
declarations.
---
 gcc/d/imports.cc | 77 ++--
 1 file changed, 41 insertions(+), 36 deletions(-)

diff --git a/gcc/d/imports.cc b/gcc/d/imports.cc
index d3a3099ce76..29c0fbfe6d2 100644
--- a/gcc/d/imports.cc
+++ b/gcc/d/imports.cc
@@ -31,14 +31,17 @@ along with GCC; see the file COPYING3.  If not see
 
 #include "d-tree.h"
 
+static hash_map *imported_decls;
 
 /* Implements the visitor interface to build debug trees for all
-   module and import declarations, where ISYM holds the cached
-   back-end representation to be returned.  */
+   module and import declarations, where RESULT_ holds the back-end
+   representation to be cached and returned from the caller.  */
 class ImportVisitor : public Visitor
 {
   using Visitor::visit;
 
+  tree result_;
+
   /* Build the declaration DECL as an imported symbol.  */
   tree make_import (tree decl)
   {
@@ -55,6 +58,12 @@ class ImportVisitor : public Visitor
 public:
   ImportVisitor (void)
   {
+this->result_ = NULL_TREE;
+  }
+
+  tree result (void)
+  {
+return this->result_;
   }
 
   /* This should be overridden by each symbol class.  */
@@ -70,16 +79,16 @@ public:
 Loc loc = (m->md != NULL) ? m->md->loc
   : Loc (m->srcfile.toChars (), 1, 0);
 
-m->isym = build_decl (make_location_t (loc), NAMESPACE_DECL,
- get_identifier (m->toPrettyChars ()),
- void_type_node);
-d_keep (m->isym);
+this->result_ = build_decl (make_location_t (loc), NAMESPACE_DECL,
+   get_identifier (m->toPrettyChars ()),
+   void_type_node);
+d_keep (this->result_);
 
 if (!m->isRoot ())
-  DECL_EXTERNAL (m->isym) = 1;
+  DECL_EXTERNAL (this->result_) = 1;
 
-TREE_PUBLIC (m->isym) = 1;
-DECL_CONTEXT (m->isym) = NULL_TREE;
+TREE_PUBLIC (this->result_) = 1;
+DECL_CONTEXT (this->result_) = NULL_TREE;
   }
 
   /* Build an import of another module symbol.  */
@@ -87,7 +96,7 @@ public:
   void visit (Import *m)
   {
 tree module = build_import_decl (m->mod);
-m->isym = this->make_import (module);
+this->result_ = this->make_import (module);
   }
 
   /* Build an import for any kind of user defined type.
@@ -141,20 +150,14 @@ public:
 
 /* This symbol is really an alias for another, visit the other.  */
 if (dsym != d)
-  {
-   dsym->accept (this);
-   d->isym = dsym->isym;
-  }
+  dsym->accept (this);
   }
 
   /* Visit the underlying alias symbol of overloadable aliases.  */
   void visit (OverDeclaration *d)
   {
 if (d->aliassym != NULL)
-  {
-   d->aliassym->accept (this);
-   d->isym = d->aliassym->isym;
-  }
+  d->aliassym->accept (this);
   }
 
   /* Function aliases are the same as alias symbols.  */
@@ -163,10 +166,7 @@ public:
 FuncDeclaration *fd = d->toAliasFunc ();
 
 if (fd != NULL)
-  {
-   fd->accept (this);
-   d->isym = fd->isym;
-  }
+  fd->accept (this);
   }
 
   /* Skip over importing templates and tuples.  */
@@ -182,7 +182,7 @@ public:
  symbol generation routines, the compiler will throw an error.  */
   void visit (Declaration *d)
   {
-d->isym = this->make_import (get_symbol_decl (d));
+this->result_ = this->make_import (get_symbol_decl (d));
   }
 };
 
@@ -192,17 +192,22 @@ public:
 tree
 build_import_decl (Dsymbol *d)
 {
-  if (!d->isym)
-{
-  location_t saved_location = input_location;
-  ImportVisitor v;
-
-  input_location = make_location_t (d->loc);
-  d->accept (&v);
-  input_location = saved_location;
-}
-
-  /* Not all visitors set `isym'.  */
-  return d->isym ? d->isym : NULL_TREE;
-}
+  hash_map_maybe_create (imported_decls);
+
+  if (tree *decl = imported_decls->get (d))
+return *decl;
 
+  location_t saved_location = input_location;
+  ImportVisitor v = ImportVisitor ();
+
+  input_location = make_location_t (d->loc);
+  d->accept (&v);
+  input_location = saved_loca

[committed] d: Fix mistakes in strings to be translated [PR104552]

2022-03-11 Thread Iain Buclaw via Gcc-patches
Hi,

This patch addresses comments made in PR104552 about documented D
language options.

Bootstrapped and committed to mainline.

Regards,
Iain.

---
gcc/d/ChangeLog:

PR translation/104552
* lang.opt (fdump-cxx-spec=): Fix typo in argument handle.
(fpreview=fixaliasthis): Quote `alias this' as code.
---
 gcc/d/lang.opt | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/gcc/d/lang.opt b/gcc/d/lang.opt
index 491797a1b6b..7859e1583c8 100644
--- a/gcc/d/lang.opt
+++ b/gcc/d/lang.opt
@@ -277,7 +277,7 @@ Add comments for ignored declarations in the generated C++ 
header.
 
 fdump-c++-spec=
 D RejectNegative Joined
--fdump-cxx-spec= Write all declarations as C++ code to .
+-fdump-cxx-spec= Write all declarations as C++ code to 
.
 
 fdump-d-original
 D
@@ -370,7 +370,7 @@ Use field-wise comparisons for struct equality.
 
 fpreview=fixaliasthis
 D RejectNegative
-When a symbol is resolved, check alias this scope before going to upper scopes.
+When a symbol is resolved, check `alias this' scope before going to upper 
scopes.
 
 fpreview=in
 D RejectNegative
-- 
2.32.0



Re: [PATCH] Fix DImode to TImode sign extend issue, PR target/104868

2022-03-11 Thread Michael Meissner via Gcc-patches
On Fri, Mar 11, 2022 at 02:41:05PM -0600, Segher Boessenkool wrote:
> On Fri, Mar 11, 2022 at 01:07:29AM -0500, Michael Meissner wrote:
> > Fix DImode to TImode sign extend issue, PR target/104898
> 
> > When I wrote the extendditi2 pattern, I forgot that mtvsrdd had that
> > behavior so I used a 'r' constraint instead of 'b'.  In the rare case
> > where the value is in GPR register 0, this split will fail.
> 
> Note that the machine instructions it would generate would work fine:
> mtvsrdd X,0,Y can be used as a "mtvsrld" always.  In fact, generating
> such code would be better than mtvsrdd always here.
> 
> Do you want to try that?  If not, this is okay for trunk.  Thanks!

That will need additional support, since we don't currently have a
zero_extendditi2 pattern.  I am working on optimizations to do this, but for
now it is simplest just to use "b".

-- 
Michael Meissner, IBM
PO Box 98, Ayer, Massachusetts, USA, 01432
email: meiss...@linux.ibm.com


Re: [PATCH, V3] PR target/99708- Define __SIZEOF_FLOAT128__ and __SIZEOF_IBM128__

2022-03-11 Thread Michael Meissner via Gcc-patches
On Fri, Mar 11, 2022 at 02:51:23PM -0600, Segher Boessenkool wrote:
> On Fri, Mar 11, 2022 at 08:42:27PM +, Joseph Myers wrote:
> > The version of this patch applied to GCC 10 branch (commit 
> > 641b407763ecfee5d4ac86d8ffe9eb1eeea5fd10) has broken the glibc build for 
> > powerpc64le-linux-gnu (it's fine with GCC 11 branch and master, just GCC 
> > 10 branch is broken) 
> 
> Mike, please revert it then?

Ok, I will revert both the GCC 11 and GCC 10 backport once I make sure the fix
builds.  Sorry about that.  Obviously, we will want to backport whatever we do
shortly to the older branches.

-- 
Michael Meissner, IBM
PO Box 98, Ayer, Massachusetts, USA, 01432
email: meiss...@linux.ibm.com


[committed] wwwdocs: gcc5: Remove broken link to Intel ISA extensions

2022-03-11 Thread Gerald Pfeifer
I doubt anyone is using the GCC 5 release notes to get to that page,
and the link broke without a proper redirect, so make it a textual
reference (only).

Pushed.

Gerald

---
 htdocs/gcc-5/changes.html | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/htdocs/gcc-5/changes.html b/htdocs/gcc-5/changes.html
index 2e2e20e6..6f5b9f64 100644
--- a/htdocs/gcc-5/changes.html
+++ b/htdocs/gcc-5/changes.html
@@ -779,8 +779,7 @@ here.
 
 IA-32/x86-64
   
-New https://software.intel.com/sites/default/files/managed/39/c5/325462-sdm-vol-1-2abcd-3abcd.pdf";>ISA
 extensions
+New ISA extensions
support AVX-512{BW,DQ,VL,IFMA,VBMI} of Intel's CPU
codenamed Skylake Server was added to GCC.  That includes inline
assembly support, new intrinsics, and basic autovectorization.  These
-- 
2.35.1


[committed] wwwdocs: sched-treegion.html: Move prod.tinker.cc.gatech.edu to https

2022-03-11 Thread Gerald Pfeifer
Just following server redirects - http to https.

Pushed.

Gerald
---
 htdocs/projects/sched-treegion.html | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/htdocs/projects/sched-treegion.html 
b/htdocs/projects/sched-treegion.html
index d421d87b..d5cefa03 100644
--- a/htdocs/projects/sched-treegion.html
+++ b/htdocs/projects/sched-treegion.html
@@ -162,7 +162,7 @@ rules apply.  This branch is maintained by
 Readings
 
 Lots of useful information is present at the http://prod.tinker.cc.gatech.edu";>TINKER Microarchitecture and
+href="https://prod.tinker.cc.gatech.edu";>TINKER Microarchitecture and
 Compiler Research homepage. More relevant papers:
 
 
@@ -170,7 +170,7 @@ Compiler Research homepage. More relevant papers:
 
 
 H. Zhou, and T.M. Conte, 
-http://prod.tinker.cc.gatech.edu/symposia/interact02.pdf";>
+https://prod.tinker.cc.gatech.edu/symposia/interact02.pdf";>
 Code Size Efficiency in Global Scheduling for ILP Processors,
 Proceedings of the 6th Annual Workshop on the Interaction between Compilers 
 and Computer Architectures (INTERACT-6), Cambridge, MA, February 2002.
@@ -180,7 +180,7 @@ and Computer Architectures (INTERACT-6), Cambridge, MA, 
February 2002.
 
 
 H. Zhou, M. D. Jennings, and T. M. Conte,
-http://prod.tinker.cc.gatech.edu/symposia/lcpc01.pdf";>
+https://prod.tinker.cc.gatech.edu/symposia/lcpc01.pdf";>
 Tree Traversal Scheduling: A Global Scheduling Technique for VLIW/EPIC 
 Processors, Proceedings of the 14th Annual Workshop on Languages and 
 Compilers for Parallel Computing (LCPC'01), Cumberland Falls, KY, August 2001.
@@ -190,7 +190,7 @@ Compilers for Parallel Computing (LCPC'01), Cumberland 
Falls, KY, August 2001.
 
 
 W. A. Havanki, S. Banerjia, and T. M. Conte,
-http://prod.tinker.cc.gatech.edu/symposia/hpca4_treegions.pdf";>
+https://prod.tinker.cc.gatech.edu/symposia/hpca4_treegions.pdf";>
 Treegion scheduling for wide-issue processors, Proceedings of the 
 4th International Symposium on High-Performance Computer Architecture 
 (HPCA-4), Las Vegas, Feb. 1998.
@@ -200,7 +200,7 @@ Treegion scheduling for wide-issue processors, 
Proceedings of the
 
 
 S. Banerjia, W.A. Havanki, and T.M. Conte,
-http://prod.tinker.cc.gatech.edu/symposia/europar97.pdf";>
+https://prod.tinker.cc.gatech.edu/symposia/europar97.pdf";>
 Treegion scheduling for highly parallel processors, 
 Proceedings of the 3rd International Euro-Par Conference (Euro-Par'97), 
 Passau, Germany, pp.1074-1078, Aug. 1997.
-- 
2.35.1


Re: [PATCH] c++: Fix ICE with bad conversion shortcutting [PR104622]

2022-03-11 Thread Jason Merrill via Gcc-patches

On 3/10/22 15:30, Patrick Palka wrote:

When shortcutting bad conversions during overload resolution, we assume
argument conversions get computed in sequential order and that therefore
we just need to inspect the last conversion in order to determine if _any_
conversion is missing.  But this assumption turns out to be false for
templates, because during deduction check_non_deducible_conversion can
compute an argument conversion out of order.

So in the testcase below, at the end of add_template_candidate the convs
array looks like {bad, missing, good} where the last conversion was
computed during deduction and the first was computed later from
add_function_candidate.  We need to add this candidate to BAD_FNS since
not all of its argument conversions were computed, but we don't do so
because we only checked if the last argument conversion was missing.

This patch fixes this by checking for a missing conversion exhaustively.
In passing, this cleans up check_non_deducible_conversion a bit since
AFAICT the only values of strict we expect to see here are the three
enumerators of unification_kind_t.
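
The difference between the two checks can be sketched in isolation.
This is an illustrative model only: Conversion and both function names
are hypothetical, and it ignores reversed candidates, which the real
code also has to consider.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a computed argument conversion.
struct Conversion { bool bad; };

// Old heuristic: assume conversions are filled in sequentially, so only
// the last slot needs to be inspected.
bool last_conversion_missing (const std::vector<const Conversion *> &convs)
{
  return !convs.empty () && convs.back () == nullptr;
}

// Exhaustive check: true iff any slot is still null.
bool any_conversion_missing (const std::vector<const Conversion *> &convs)
{
  for (std::size_t i = 0; i < convs.size (); ++i)
    if (!convs[i])
      return true;
  return false;
}
```

For an array shaped like {bad, missing, good}, the first function
reports nothing missing while the second one finds the hole in the
middle.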

Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK for
trunk?


OK.


PR c++/104622

gcc/cp/ChangeLog:

* call.cc (missing_conversion_p): Define.
(add_candidates): Use it.
* pt.cc (check_non_deducible_conversion): Change type of strict
parameter to unification_kind_t and directly test for DEDUCE_CALL.

gcc/testsuite/ChangeLog:

* g++.dg/template/conv18.C: New test.
---
  gcc/cp/call.cc | 13 -
  gcc/cp/pt.cc   |  6 +++---
  gcc/testsuite/g++.dg/template/conv18.C | 14 ++
  3 files changed, 29 insertions(+), 4 deletions(-)
  create mode 100644 gcc/testsuite/g++.dg/template/conv18.C

diff --git a/gcc/cp/call.cc b/gcc/cp/call.cc
index d6eed5ed835..8fe8ef306ea 100644
--- a/gcc/cp/call.cc
+++ b/gcc/cp/call.cc
@@ -6023,6 +6023,17 @@ perfect_candidate_p (z_candidate *cand)
return true;
  }
  
+/* True iff one of CAND's argument conversions is NULL.  */

+
+static bool
+missing_conversion_p (const z_candidate *cand)
+{
+  for (unsigned i = 0; i < cand->num_convs; ++i)
+if (!cand->convs[i])
+  return true;
+  return false;
+}
+
  /* Add each of the viable functions in FNS (a FUNCTION_DECL or
 OVERLOAD) to the CANDIDATES, returning an updated list of
 CANDIDATES.  The ARGS are the arguments provided to the call;
@@ -6200,7 +6211,7 @@ add_candidates (tree fns, tree first_arg, const vec *args,
  
if (cand->viable == -1

  && shortcut_bad_convs
- && !cand->convs[cand->reversed () ? 0 : cand->num_convs - 1])
+ && missing_conversion_p (cand))
{
  /* This candidate has been tentatively marked non-strictly viable,
 and we didn't compute all argument conversions for it (having
diff --git a/gcc/cp/pt.cc b/gcc/cp/pt.cc
index f890d92d715..715eea27577 100644
--- a/gcc/cp/pt.cc
+++ b/gcc/cp/pt.cc
@@ -152,7 +152,7 @@ static tree coerce_innermost_template_parms (tree, tree, 
tree, tsubst_flags_t,
  bool, bool);
  static void tsubst_enum   (tree, tree, tree);
  static bool check_instantiated_args (tree, tree, tsubst_flags_t);
-static int check_non_deducible_conversion (tree, tree, int, int,
+static int check_non_deducible_conversion (tree, tree, unification_kind_t, int,
   struct conversion **, bool);
  static int maybe_adjust_types_for_deduction (tree, unification_kind_t,
 tree*, tree*, tree);
@@ -22304,7 +22304,7 @@ maybe_adjust_types_for_deduction (tree tparms,
 unify_one_argument.  */
  
  static int

-check_non_deducible_conversion (tree parm, tree arg, int strict,
+check_non_deducible_conversion (tree parm, tree arg, unification_kind_t strict,
int flags, struct conversion **conv_p,
bool explain_p)
  {
@@ -22324,7 +22324,7 @@ check_non_deducible_conversion (tree parm, tree arg, 
int strict,
if (can_convert_arg (type, parm, NULL_TREE, flags, complain))
return unify_success (explain_p);
  }
-  else if (strict != DEDUCE_EXACT)
+  else if (strict == DEDUCE_CALL)
  {
bool ok = false;
tree conv_arg = TYPE_P (arg) ? NULL_TREE : arg;
diff --git a/gcc/testsuite/g++.dg/template/conv18.C 
b/gcc/testsuite/g++.dg/template/conv18.C
new file mode 100644
index 000..f59f6fda77c
--- /dev/null
+++ b/gcc/testsuite/g++.dg/template/conv18.C
@@ -0,0 +1,14 @@
+// PR c++/104622
+// { dg-additional-options "-fpermissive" }
+
+template<class T>
+struct type_identity {
+  typedef T type;
+};
+
+template<class T> void f(typename type_identity<T>::type*, T, int*);
+
+int main() {
+  const int p = 0;
+  f(&p, 0, 0); // { dg-warning "invalid conversion" }
+}




Re: [PATCH] c++: ICE with template code in constexpr [PR104284]

2022-03-11 Thread Jason Merrill via Gcc-patches

On 3/10/22 18:04, Marek Polacek wrote:

Since r9-6073 cxx_eval_store_expression preevaluates the value to
be stored, and that revealed a crash where a template code (here,
code=IMPLICIT_CONV_EXPR) leaks into cxx_eval*.

It happens because we're performing build_vec_init while processing
a template


Hmm, that seems like the bug.  Where's that call coming from?


which calls get_temp_regvar which creates an INIT_EXPR.
This INIT_EXPR's RHS contains an rvalue conversion so we create an
IMPLICIT_CONV_EXPR.  Its operand is not type-dependent and the whole
INIT_EXPR is not type-dependent.  So we call build_non_dependent_expr
which, with -fchecking=2, calls fold_non_dependent_expr.  At this
point the expression still has an IMPLICIT_CONV_EXPR, which ought to
be handled in instantiate_non_dependent_expr_internal.  However,
tsubst_copy_and_build doesn't handle INIT_EXPR; it will just call
tsubst_copy which does nothing when args is null.  So we fail to
replace the IMPLICIT_CONV_EXPR and ICE.

Eliding the IMPLICIT_CONV_EXPR in this particular case would be too
risky, so we could do

   if (TREE_CODE (t) == INIT_EXPR)
 t = TREE_OPERAND (t, 1);

in fold_non_dependent_expr, but that feels too ad hoc.  So it might
make sense to actually take care of INIT_EXPR in tsubst_c_and_b.
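
The failure mode described above (a walker that returns an unhandled
node unchanged, so a template-only code nested inside it survives) can
be modelled with a toy tree.  Everything here is hypothetical: Code,
Expr and substitute are not GCC names, and Placeholder merely plays the
role of IMPLICIT_CONV_EXPR.

```cpp
#include <cassert>
#include <memory>

enum class Code { Constant, Placeholder, Negate, Init };

struct Expr {
  Code code;
  int value;                       // used by Constant
  std::shared_ptr<Expr> op0, op1;  // operands, may be null
};
using ExprP = std::shared_ptr<Expr>;

ExprP make (Code c, int v = 0, ExprP a = nullptr, ExprP b = nullptr)
{
  return std::make_shared<Expr> (Expr{c, v, a, b});
}

// Replace every Placeholder with a Constant.  When HANDLE_INIT is false,
// Init nodes are returned as-is, so placeholders nested inside them
// are never visited.
ExprP substitute (const ExprP &t, bool handle_init)
{
  if (!t)
    return t;
  switch (t->code)
    {
    case Code::Placeholder:
      return make (Code::Constant, 42);
    case Code::Negate:
      return make (Code::Negate, 0, substitute (t->op0, handle_init));
    case Code::Init:
      if (handle_init)
        return make (Code::Init, 0,
                     substitute (t->op0, handle_init),
                     substitute (t->op1, handle_init));
      return t;
    default:
      return t;
    }
}

// True iff a Placeholder is still reachable from T.
bool contains_placeholder (const ExprP &t)
{
  if (!t)
    return false;
  return t->code == Code::Placeholder
         || contains_placeholder (t->op0)
         || contains_placeholder (t->op1);
}
```

With handle_init false the placeholder under an Init node leaks through;
recursing into both operands, as the proposed INIT_EXPR case does,
removes it.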

Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk/11?

PR c++/104284

gcc/cp/ChangeLog:

* pt.cc (tsubst_copy_and_build): Handle INIT_EXPR.

gcc/testsuite/ChangeLog:

* g++.dg/cpp1y/constexpr-104284.C: New test.
---
  gcc/cp/pt.cc  |  8 
  gcc/testsuite/g++.dg/cpp1y/constexpr-104284.C | 17 +
  2 files changed, 25 insertions(+)
  create mode 100644 gcc/testsuite/g++.dg/cpp1y/constexpr-104284.C

diff --git a/gcc/cp/pt.cc b/gcc/cp/pt.cc
index f7ee33a6dfd..e8920f98e4d 100644
--- a/gcc/cp/pt.cc
+++ b/gcc/cp/pt.cc
@@ -21289,6 +21289,14 @@ tsubst_copy_and_build (tree t,
 with constant operands.  */
RETURN (t);
  
+case INIT_EXPR:

+  {
+   tree op0 = RECUR (TREE_OPERAND (t, 0));
+   tree op1 = RECUR (TREE_OPERAND (t, 1));
+   RETURN (build2_loc (input_location, INIT_EXPR, TREE_TYPE (op0),
+   op0, op1));
+  }
+
  case NON_LVALUE_EXPR:
  case VIEW_CONVERT_EXPR:
if (location_wrapper_p (t))
diff --git a/gcc/testsuite/g++.dg/cpp1y/constexpr-104284.C 
b/gcc/testsuite/g++.dg/cpp1y/constexpr-104284.C
new file mode 100644
index 000..f60033069e4
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp1y/constexpr-104284.C
@@ -0,0 +1,17 @@
+// PR c++/104284
+// { dg-do compile { target c++14 } }
+// { dg-additional-options "-fchecking=2" }
+
+struct S {
+  char c{};
+};
+
+auto x = [](auto) { constexpr S s[]{{}}; };
+
+template
+constexpr void gn ()
+{
+  constexpr S s[]{{}};
+}
+
+static_assert ((gn(), true), "");

base-commit: b5417a0ba7e26bec2abf05cad6c6ef840a9be41c




[PATCH] PR tree-optimization/101895: Fold VEC_PERM to help recognize FMA.

2022-03-11 Thread Roger Sayle

This patch resolves PR tree-optimization/101895, a missed optimization
regression, by adding a constant folding simplification to match.pd to
simplify the transform "mult; vec_perm; plus" into "vec_perm; mult; plus"
with the aim that keeping the multiplication and addition next to each
other allows them to be recognized as fused-multiply-add on suitable
targets.  This transformation requires a tweak to match.pd's
vec_same_elem_p predicate to handle CONSTRUCTOR_EXPRs using the same
SSA_NAME_DEF_STMT idiom used for constructors elsewhere in match.pd.

The net effect is that the following code example:

void foo(float * __restrict__ a, float b, float *c) {
  a[0] = c[0]*b + a[0];
  a[1] = c[2]*b + a[1];
  a[2] = c[1]*b + a[2];
  a[3] = c[3]*b + a[3];
}

when compiled on x86_64-pc-linux-gnu with -O2 -march=cascadelake
currently generates:

vbroadcastss%xmm0, %xmm0
vmulps  (%rsi), %xmm0, %xmm0
vpermilps   $216, %xmm0, %xmm0
vaddps  (%rdi), %xmm0, %xmm0
vmovups %xmm0, (%rdi)
ret

but with this patch now generates the improved:

vpermilps   $216, (%rsi), %xmm1
vbroadcastss%xmm0, %xmm0
vfmadd213ps (%rdi), %xmm0, %xmm1
vmovups %xmm1, (%rdi)
ret
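
The identity behind the reordering is that multiplying by a uniform
vector commutes with a permutation.  The scalar model below is only a
sketch of that identity, not of the match.pd pattern itself; Vec4,
permute, mult_perm_plus and perm_mult_plus are hypothetical names, and
the selector {0, 2, 1, 3} corresponds to the c[0], c[2], c[1], c[3]
access order in foo.

```cpp
#include <array>
#include <cassert>

using Vec4 = std::array<float, 4>;
using Sel4 = std::array<int, 4>;

// Select elements of V according to the index vector SEL.
Vec4 permute (const Vec4 &v, const Sel4 &sel)
{
  Vec4 r{};
  for (int i = 0; i < 4; ++i)
    r[i] = v[sel[i]];
  return r;
}

// Original order: multiply by the uniform scalar, permute, then add.
Vec4 mult_perm_plus (const Vec4 &c, float b, const Vec4 &a, const Sel4 &sel)
{
  Vec4 m{};
  for (int i = 0; i < 4; ++i)
    m[i] = c[i] * b;
  Vec4 p = permute (m, sel);
  Vec4 r{};
  for (int i = 0; i < 4; ++i)
    r[i] = p[i] + a[i];
  return r;
}

// Reordered: permute first, so the multiply and the add are adjacent
// and can be recognized as a fused multiply-add.
Vec4 perm_mult_plus (const Vec4 &c, float b, const Vec4 &a, const Sel4 &sel)
{
  Vec4 p = permute (c, sel);
  Vec4 r{};
  for (int i = 0; i < 4; ++i)
    r[i] = p[i] * b + a[i];
  return r;
}
```

Both functions compute the same elements; the second simply moves the
shuffle onto the loaded operand, which is what the improved assembly
above does with vpermilps.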

This patch has been tested on x86_64-pc-linux-gnu with make bootstrap
and make -k check with no new failures.  Ok for mainline?


2022-03-11  Roger Sayle  

gcc/ChangeLog
PR tree-optimization/101895
* match.pd (vec_same_elem_p): Handle CONSTRUCTOR_EXPR def.
(plus (vec_perm (mult ...) ...) ...): New reordering simplification.

gcc/testsuite/ChangeLog
* gcc.target/i386/pr101895.c: New test case.


Thanks in advance,
Roger
--

diff --git a/gcc/match.pd b/gcc/match.pd
index 97399e5..9184276 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -7695,10 +7695,22 @@ and,
 (match vec_same_elem_p
  (vec_duplicate @0))
 
+(match vec_same_elem_p
+  CONSTRUCTOR@0
+  (if (uniform_vector_p (TREE_CODE (@0) == SSA_NAME
+? gimple_assign_rhs1 (SSA_NAME_DEF_STMT (@0)) : @0
+
 (simplify
  (vec_perm vec_same_elem_p@0 @0 @1)
  @0)
 
+/* Push VEC_PERM earlier if that may help FMA perception (PR101895).  */
+(for plusminus (plus minus)
+  (simplify
+(plusminus (vec_perm (mult@0 @1 vec_same_elem_p@2) @0 @3) @4)
+(plusminus (mult (vec_perm @1 @1 @3) @2) @4)))
+  
+
 /* Match count trailing zeroes for simplify_count_trailing_zeroes in fwprop.
The canonical form is array[((x & -x) * C) >> SHIFT] where C is a magic
constant which when multiplied by a power of 2 contains a unique value
diff --git a/gcc/testsuite/gcc.target/i386/pr101895.c 
b/gcc/testsuite/gcc.target/i386/pr101895.c
new file mode 100644
index 000..4d0f1cb
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr101895.c
@@ -0,0 +1,11 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -march=cascadelake" } */
+
+void foo(float * __restrict__ a, float b, float *c) {
+  a[0] = c[0]*b + a[0];
+  a[1] = c[2]*b + a[1];
+  a[2] = c[1]*b + a[2];
+  a[3] = c[3]*b + a[3];
+}
+
+/* { dg-final { scan-assembler "vfmadd" } } */


Re: [PATCH] c++: naming a dependently-scoped template for CTAD [PR104641]

2022-03-11 Thread Jason Merrill via Gcc-patches

On 3/10/22 12:41, Patrick Palka wrote:

On Wed, 9 Mar 2022, Jason Merrill wrote:


On 3/9/22 10:39, Patrick Palka wrote:

On Tue, 8 Mar 2022, Jason Merrill wrote:


On 3/2/22 14:32, Patrick Palka wrote:

In order to be able to perform CTAD for a dependently-scoped template
such as A::B in the testcase below, we need to permit a
typename-specifier to resolve to a template as per [dcl.type.simple]/2,
at least when it appears in a CTAD-enabled context.

This patch implements this using a new tsubst flag tf_tst_ok to control
when a TYPENAME_TYPE is allowed to name a template, and sets this flag
when substituting into the type of a CAST_EXPR, CONSTRUCTOR or VAR_DECL
(each of which is a CTAD-enabled context).


What breaks if we always allow that, or at least in -std that support
CTAD?


AFAICT no significant breakage, but some accepts-invalid and diagnostic
regressions crop up, e.g. accepts-invalid for

using type = typename A::B; // no more diagnostic if typename resolves to a
                            // template at instantiation time

and diagnostic regression for

template::B> void f();
// no more elaboration why deduction failed if typename resolves
// to a template


Ah, sure, the cost is that we would need to check for this case in various
callers, rather than setting a flag in a different set of callers.  Fair
enough.


Yes exactly, and presumably the set of callers for which typename is
permitted to resolve to a template is much smaller, so making the
behavior opt-in instead of opt-out seems more desirable.  Alternatively
we could add a new flag to TYPENAME_TYPE set carefully at parse time
that controls this behavior, but seems overall simpler to not use a
new tree flag if we can get away with it.




@@ -16229,6 +16237,12 @@ tsubst (tree t, tree args, tsubst_flags_t complain,
tree in_decl)
  }
  }
  + if (TREE_CODE (f) == TEMPLATE_DECL)
+ {
+   gcc_checking_assert (tst_ok);
+   f = make_template_placeholder (f);
+ }


How about calling make_template_placeholder in make_typename_type?


That works nicely too, like so?

-- >8 --

Subject: [PATCH] c++: naming a dependently-scoped template for CTAD [PR104641]

In order to be able to perform CTAD for a dependently-scoped template
(such as A::B in the testcase below), we need to permit a
typename-specifier to resolve to a template as per [dcl.type.simple]/3,
at least when it appears in a CTAD-enabled context.

This patch implements this using a new tsubst flag tf_tst_ok to control
when a TYPENAME_TYPE is allowed to name a template, and sets this flag
when substituting into the type of a CAST_EXPR, CONSTRUCTOR or VAR_DECL
(each of which is a CTAD-enabled context).

PR c++/104641

gcc/cp/ChangeLog:

* cp-tree.h (tsubst_flags::tf_tst_ok): New flag.
* decl.cc (make_typename_type): Allow a typename-specifier to
resolve to a template when tf_tst_ok, in which case return
a CTAD placeholder for the template.
* pt.cc (tsubst_decl) : Set tf_tst_ok when
substituting the type.
(tsubst): Clear tf_tst_ok and remember if it was set.
: Pass tf_tst_ok to make_typename_type
appropriately.
(tsubst_copy) : Set tf_tst_ok when substituting
the type.
(tsubst_copy_and_build) : Likewise.
: Likewise.

gcc/testsuite/ChangeLog:

* g++.dg/cpp1z/class-deduction107.C: New test.
---
  gcc/cp/cp-tree.h  |  2 ++
  gcc/cp/decl.cc| 20 ++---
  gcc/cp/pt.cc  | 29 +++
  .../g++.dg/cpp1z/class-deduction107.C | 20 +
  4 files changed, 61 insertions(+), 10 deletions(-)
  create mode 100644 gcc/testsuite/g++.dg/cpp1z/class-deduction107.C

diff --git a/gcc/cp/cp-tree.h b/gcc/cp/cp-tree.h
index b71bce1ab97..b7606f22287 100644
--- a/gcc/cp/cp-tree.h
+++ b/gcc/cp/cp-tree.h
@@ -5557,6 +5557,8 @@ enum tsubst_flags {
(build_target_expr and friends) */
tf_norm = 1 << 11, /* Build diagnostic information during
constraint normalization.  */
+  tf_tst_ok = 1 << 12,/* Allow a typename-specifier to name
+   a template.  */
/* Convenient substitution flags combinations.  */
tf_warning_or_error = tf_warning | tf_error
  };
diff --git a/gcc/cp/decl.cc b/gcc/cp/decl.cc
index 992e38385c2..d2d46915068 100644
--- a/gcc/cp/decl.cc
+++ b/gcc/cp/decl.cc
@@ -4204,16 +4204,28 @@ make_typename_type (tree context, tree name, enum tag_types tag_type,
  }
if (!want_template && TREE_CODE (t) != TYPE_DECL)
  {
-  if (complain & tf_error)
-   error ("% names %q#T, which is not a type",
-  context, name, t);
-  return error_mark_node;
+  if ((complain & tf_tst_ok) && DECL_TYPE_TEMPLATE_P (t))
+   /* The caller permits this

Re: [PATCH, V3] PR target/99708- Define __SIZEOF_FLOAT128__ and __SIZEOF_IBM128__

2022-03-11 Thread Michael Meissner via Gcc-patches
On Fri, Mar 11, 2022 at 03:07:39PM -0600, Segher Boessenkool wrote:
> On Fri, Mar 11, 2022 at 09:57:50PM +0100, Jakub Jelinek wrote:
> > On Fri, Mar 11, 2022 at 02:51:23PM -0600, Segher Boessenkool wrote:
> > > On Fri, Mar 11, 2022 at 08:42:27PM +, Joseph Myers wrote:
> > > > The version of this patch applied to GCC 10 branch (commit
> > > > 641b407763ecfee5d4ac86d8ffe9eb1eeea5fd10) has broken the glibc build
> > > > for powerpc64le-linux-gnu (it's fine with GCC 11 branch and master,
> > > > just GCC 10 branch is broken)
> > > 
> > > Mike, please revert it then?
> > 
> > Preferably also the GCC 11 commit, because otherwise it needs backports
> > of the follow-up changes too.
> 
> Good point.  Yes please.

Both GCC 10 and GCC 11 reverted.

-- 
Michael Meissner, IBM
PO Box 98, Ayer, Massachusetts, USA, 01432
email: meiss...@linux.ibm.com


Re: [PATCH] c++: fold calls to std::move/forward [PR96780]

2022-03-11 Thread Jason Merrill via Gcc-patches

On 3/10/22 11:27, Patrick Palka wrote:

On Wed, 9 Mar 2022, Jason Merrill wrote:


On 3/1/22 18:08, Patrick Palka wrote:

A well-formed call to std::move/forward is equivalent to a cast, but the
former being a function call means it comes with bloated debug info, which
persists even after the call has been inlined away, for an operation that
is never interesting to debug.

This patch addresses this problem in a relatively ad-hoc way by folding
calls to std::move/forward into casts as part of the frontend's general
expression folding routine.  After this patch with -O2 and a non-checking
compiler, debug info size for some testcases decreases by about 10% and
overall compile time and memory usage decrease by about 2%.


Impressive.  Which testcases?


I saw the largest percentage reductions in debug object file size in
various tests from cmcstl2 and range-v3, e.g.
test/algorithm/set_symmetric_difference4.cpp and .../rotate_copy.cpp
(which are among their biggest tests).

Significant reductions in debug object file size can be observed in
some libstdc++ testcases too, such as a 5.5% reduction in
std/ranges/adaptor/join.cc.



Do you also want to handle addressof and as_const in this patch, as Jonathan
suggested?


Yes, good idea.  Since each of their argument and return types are
indirect types, I think we can use the same NOP_EXPR-based folding for
them.



I think we can do this now, and think about generalizing more in stage 1.


Bootstrapped and regtested on x86_64-pc-linux-gnu, is this something we
want to consider for GCC 12?

PR c++/96780

gcc/cp/ChangeLog:

* cp-gimplify.cc (cp_fold) : When optimizing,
fold calls to std::move/forward into simple casts.
* cp-tree.h (is_std_move_p, is_std_forward_p): Declare.
* typeck.cc (is_std_move_p, is_std_forward_p): Export.

gcc/testsuite/ChangeLog:

* g++.dg/opt/pr96780.C: New test.
---
   gcc/cp/cp-gimplify.cc  | 18 ++
   gcc/cp/cp-tree.h   |  2 ++
   gcc/cp/typeck.cc   |  6 ++
   gcc/testsuite/g++.dg/opt/pr96780.C | 24 
   4 files changed, 46 insertions(+), 4 deletions(-)
   create mode 100644 gcc/testsuite/g++.dg/opt/pr96780.C

diff --git a/gcc/cp/cp-gimplify.cc b/gcc/cp/cp-gimplify.cc
index d7323fb5c09..0b009b631c7 100644
--- a/gcc/cp/cp-gimplify.cc
+++ b/gcc/cp/cp-gimplify.cc
@@ -2756,6 +2756,24 @@ cp_fold (tree x)
 case CALL_EXPR:
 {
+   if (optimize


I think this should check flag_no_inline rather than optimize.


Sounds good.

Here's a patch that extends the folding to as_const and addressof (as
well as __addressof, which I'm kind of unsure about since it's
non-standard).  I suppose it also doesn't hurt to verify that the return
and argument type of the function are sane before we commit to folding.

-- >8 --

Subject: [PATCH] c++: fold calls to std::move/forward [PR96780]

A well-formed call to std::move/forward is equivalent to a cast, but the
former being a function call means the compiler generates debug info for
it, which persists even after the call has been inlined away, for an
operation that's never interesting to debug.

This patch addresses this problem in a relatively ad-hoc way by folding
calls to std::move/forward and other cast-like functions into simple
casts as part of the frontend's general expression folding routine.
After this patch with -O2 and a non-checking compiler, debug info size
for some testcases decreases by about 10% and overall compile time and
memory usage decrease by about 2%.

PR c++/96780

gcc/cp/ChangeLog:

* cp-gimplify.cc (cp_fold) : When optimizing,
fold calls to std::move/forward and other cast-like functions
into simple casts.

gcc/testsuite/ChangeLog:

* g++.dg/opt/pr96780.C: New test.
---
  gcc/cp/cp-gimplify.cc  | 36 +++-
  gcc/testsuite/g++.dg/opt/pr96780.C | 38 ++
  2 files changed, 73 insertions(+), 1 deletion(-)
  create mode 100644 gcc/testsuite/g++.dg/opt/pr96780.C

diff --git a/gcc/cp/cp-gimplify.cc b/gcc/cp/cp-gimplify.cc
index d7323fb5c09..efc4c8f0eb9 100644
--- a/gcc/cp/cp-gimplify.cc
+++ b/gcc/cp/cp-gimplify.cc
@@ -2756,9 +2756,43 @@ cp_fold (tree x)
  
  case CALL_EXPR:

{
-   int sv = optimize, nw = sv;
tree callee = get_callee_fndecl (x);
  
+	/* "Inline" calls to std::move/forward and other cast-like functions

+  by simply folding them into the corresponding cast determined by
+  their return type.  This is cheaper than relying on the middle-end
+  to do so, and also means we avoid generating useless debug info for
+  them at all.
+
+  At this point the argument has already been converted into a
+  reference, so it suffices to use a NOP_EXPR to express the
+  cast.  */
+   if (!flag_no_inline


In our conversation yesterday it occurred to me that we might make this 
a

Re: [PATCH] c : Changed warning message for -Wstrict-prototypes [PR92209]

2022-03-11 Thread Eric Gallager via Gcc-patches
On Fri, Mar 11, 2022 at 3:55 PM Joseph Myers  wrote:
>
> On Fri, 11 Mar 2022, Krishna Narayanan via Gcc-patches wrote:
>
> > Hello,
> > > The following is a patch for PR92209, which gives a warning when
> > > the function prototype does not specify its argument type. In this
> > > patch there has been a change in the warning message displayed for
> > > -Wstrict-prototypes to specify its argument types. I have also added
> > > the testcase for it.
> > > Regtested on x86_64, OK for commit? Please do review it.
>
> Why do you think your proposed wording is better than the existing
> wording?  I think the existing wording is accurate and the proposed
> wording is inaccurate - "must specify the argument types" is not an
> accurate description of any requirement in the C language, using "must" at
> all generally seems questionable in the wording of a warning message.
>
> Also, I don't think this change is anything to do with the PR you mention
> ("Imprecise column number for -Wstrict-prototypes"), so it's wrong to
> mention that PR number in the proposed commit message.
>

The proposed wording comes from one of the comments in the mentioned
PR; see Manu's reply in comment #1:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92209#c1

> --
> Joseph S. Myers
> jos...@codesourcery.com


Re: [PATCH] c++: Fix up constexpr evaluation of new with zero sized types [PR104568]

2022-03-11 Thread Jason Merrill via Gcc-patches

On 2/21/22 04:25, Jakub Jelinek wrote:

Hi!

The new expression constant expression evaluation right now tries to
deduce how many elts the array it uses for the heap or heap [] vars
should have (or how many elts should its trailing array have if it has
cookie at the start).  As new is lowered at that point to
(some_type *) ::operator new (size)
or so, it computes it by subtracting cookie size if any from size, then
divides the result by sizeof (some_type).
This works fine for most types, except when sizeof (some_type) is 0,
then we divide by zero; size is then equal to cookie_size (or if there
is no cookie, to 0).
The following patch special cases those cases so that we don't divide
by zero and also recover the original outer_nelts from the expression
by forcing the size not to be folded in that case but be explicit
0 * outer_nelts or cookie_size + 0 * outer_nelts.
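The arithmetic described above can be sketched outside the compiler as follows. The function element_count is a hypothetical helper, not build_new_constexpr_heap_type; it only shows the subtraction, the division, and the zero-size case where the count cannot be recovered from the size alone and has to come from elsewhere (in the patch, from the unfolded 0 * outer_nelts term).

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Hypothetical helper: recover the number of elements from the size passed
// to operator new.  Returns std::nullopt when the size does not determine
// the count, which is the case for zero-sized element types.
inline std::optional<std::uint64_t> element_count(std::uint64_t full_size,
                                                  std::uint64_t cookie_size,
                                                  std::uint64_t elt_size) {
  if (full_size < cookie_size)
    return std::nullopt;  // malformed input: the cookie does not fit
  const std::uint64_t payload = full_size - cookie_size;
  if (elt_size == 0)
    return std::nullopt;  // dividing here would be a division by zero
  return payload / elt_size;
}
```

For a zero-sized element type the payload is always 0 whatever the requested count was, which is why the patch keeps the multiplication explicit instead of folding it.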

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

Note, we have further issues, we accept-invalid various cases, for both
zero sized elt_type and even non-zero sized elts, we aren't able to
diagnose out of bounds POINTER_PLUS_EXPR like:
constexpr bool
foo ()
{
   auto p = new int[2];
   auto q1 = &p[0];
   auto q2 = &p[1];
   auto q3 = &p[2];
   auto q4 = &p[3];
   delete[] p;
   return true;
}
constexpr bool a = foo ();
That doesn't look like a regression so I think we should resolve that for
GCC 13, but there are 2 problems.  Figure out why
cxx_fold_pointer_plus_expression doesn't deal with the &heap []
etc. cases, and for the zero sized arrays, I think we really need to preserve
whether user wrote an array ref or pointer addition, because in the
&p[3] case if sizeof(p[0]) == 0 we know that if it has 2 elements it is
out of bounds, while if we see p p+ 0 the information if it was
p + 2 or p + 3 in the source is lost.


But array ref is defined to be equivalent to pointer addition, and we 
also want to handle p+2 properly.  It seems to me that the problem is 
lowering to POINTER_PLUS_EXPR too soon, but that's definitely a stage 1 
project.



clang++ seems to handle it fine even in the zero sized cases or with
new expressions.

2022-02-21  Jakub Jelinek  

PR c++/104568
* cp-tree.h (build_new_constexpr_heap_type): Add FULL_SIZE_ADJUSTED
argument.
* init.cc (build_new_constexpr_heap_type): Add FULL_SIZE_ADJUSTED
argument.  If true, don't subtract csz from it nor divide by
int_size_in_bytes (elt_type).  Don't do that division if
int_size_in_bytes is zero either.
(maybe_wrap_new_for_constexpr): Pass false to
build_new_constexpr_heap_type.
(build_new_1): If size is 0, change it to 0 * outer_nelts if
outer_nelts is non-NULL.  Pass type rather than elt_type to
maybe_wrap_new_for_constexpr.
* constexpr.cc (cxx_eval_constant_expression) :
If elt_size is zero sized type, try to recover outer_nelts from
the size argument to operator new/new[] and pass that as
var_size to build_new_constexpr_heap_type together with true
for the last argument.

* g++.dg/cpp2a/constexpr-new22.C: New test.

--- gcc/cp/cp-tree.h.jj 2022-02-09 20:13:51.541304861 +0100
+++ gcc/cp/cp-tree.h2022-02-17 15:34:30.804453673 +0100
@@ -7038,7 +7038,7 @@ extern tree build_offset_ref  (tree, 
tr
  extern tree throw_bad_array_new_length(void);
  extern bool type_has_new_extended_alignment   (tree);
  extern unsigned malloc_alignment  (void);
-extern tree build_new_constexpr_heap_type  (tree, tree, tree);
+extern tree build_new_constexpr_heap_type  (tree, tree, tree, bool);
  extern tree build_new (location_t,
 vec **, tree,
 tree, vec **,
--- gcc/cp/init.cc.jj   2022-02-05 10:50:05.0 +0100
+++ gcc/cp/init.cc  2022-02-17 15:56:30.010056499 +0100
@@ -2930,7 +2930,8 @@ std_placement_new_fn_p (tree alloc_fn)
 it is computed such that the size of the struct fits into FULL_SIZE.  */
  
  tree

-build_new_constexpr_heap_type (tree elt_type, tree cookie_size, tree full_size)
+build_new_constexpr_heap_type (tree elt_type, tree cookie_size, tree full_size,
+  bool full_size_adjusted)
  {
gcc_assert (cookie_size == NULL_TREE || tree_fits_uhwi_p (cookie_size));
gcc_assert (full_size == NULL_TREE || tree_fits_uhwi_p (full_size));
@@ -2939,9 +2940,14 @@ build_new_constexpr_heap_type (tree elt_
if (full_size)
  {
unsigned HOST_WIDE_INT fsz = tree_to_uhwi (full_size);
-  gcc_assert (fsz >= csz);
-  fsz -= csz;
-  fsz /= int_size_in_bytes (elt_type);
+  unsigned HOST_WIDE_INT esz = int_size_in_bytes (elt_type);
+  if (!full_size_adjusted)
+   {
+ gcc_assert (fsz >= csz);
+ fsz -= csz;
+ if (esz)
+   fsz /= esz;
+   }
itype2 = build_

Re: [PATCH] c++: Reject __builtin_clear_padding on non-trivially-copyable types with one exception [PR102586]

2022-03-11 Thread Jason Merrill via Gcc-patches

On 2/11/22 14:55, Jakub Jelinek wrote:

Hi!

As mentioned by Jason in the PR, non-trivially-copyable types (or non-POD
for purposes of layout?) can be base classes of derived classes in
which the padding in those non-trivially-copyable types can be reused for
some real data members or even the layout can change and data members can
be moved to other positions.
__builtin_clear_padding is right now used for multiple purposes,
in  where it isn't used yet but was planned as the main spot
it can be used for trivially copyable types only, ditto for std::bit_cast
where we also use it.  It is used for OpenMP long double atomics too but
long double is trivially copyable, and lastly for -ftrivial-auto-var-init=.

The following patch restricts the builtin to pointers to trivially-copyable
types, with the exception when it is called directly on an address of a
variable, in that case already the FE can verify it is the complete object
type and so it is safe to clear all the paddings in it.
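A rough sketch of the distinction the restriction relies on, using only standard type traits. The types Plain, Base and Derived and the variable template pointer_ok_for_clear_padding are made up for this example; the real check lives in build_cxx_call and also accepts the address of a variable regardless of the trait.

```cpp
#include <cassert>
#include <type_traits>

// Trivially copyable: clearing padding through any pointer is fine.
struct Plain {
  int i;
  char c;
};

// A user-provided copy constructor makes this type non-trivially-copyable.
struct Base {
  int i;
  char c;
  Base() = default;
  Base(const Base& o) : i(o.i), c(o.c) {}
};

// On some ABIs (e.g. the Itanium C++ ABI) the member d may be placed in the
// tail padding of Base, so clearing the padding of a Base subobject through
// a Base* could overwrite d.  The layout is ABI-specific, so nothing about
// sizeof is asserted here.
struct Derived : Base {
  char d;
};

// Hypothetical predicate mirroring the pointer case of the rule above.
template <class T>
constexpr bool pointer_ok_for_clear_padding = std::is_trivially_copyable_v<T>;

static_assert(pointer_ok_for_clear_padding<Plain>);
static_assert(!pointer_ok_for_clear_padding<Base>);
static_assert(!pointer_ok_for_clear_padding<Derived>);
```

When the argument is &var, the front end knows var is a complete object of its declared type, so no derived class can be reusing its padding.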

Bootstrapped/regtested on powerpc64le-linux, ok for trunk?


OK.


Something like the https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102586#c16
will still be needed with adjusted testcase from
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102586#c15 such that
__builtin_clear_padding is called directly on var addresses rather than
in separate functions.

2022-02-11  Jakub Jelinek  

PR tree-optimization/102586
gcc/
* doc/extend.texi (__builtin_clear_padding): Clarify that for C++
argument type should be pointer to trivially-copyable type unless it
is address of a variable or parameter.
gcc/cp/
* call.cc (build_cxx_call): Diagnose __builtin_clear_padding where
first argument's type is pointer to non-trivially-copyable type unless
it is address of a variable or parameter.
gcc/testsuite/
* g++.dg/cpp2a/builtin-clear-padding1.C: New test.

--- gcc/doc/extend.texi.jj  2022-02-09 15:16:03.336783697 +0100
+++ gcc/doc/extend.texi 2022-02-11 13:22:39.846157538 +0100
@@ -13993,6 +13993,11 @@ bits that are padding bits for all the u
  This built-in-function is useful if the padding bits of an object might
  have intederminate values and the object representation needs to be
  bitwise compared to some other object, for example for atomic operations.
+
+For C++, @var{ptr} argument type should be pointer to trivially-copyable
+type, unless the argument is address of a variable or parameter, because
+otherwise it isn't known if the type isn't just a base class whose padding
+bits are reused or laid out differently in a derived class.
  @end deftypefn
  
  @deftypefn {Built-in Function} @var{type} __builtin_bit_cast (@var{type}, @var{arg})

--- gcc/cp/call.cc.jj   2022-02-09 20:13:51.523305107 +0100
+++ gcc/cp/call.cc  2022-02-11 12:58:19.168301395 +0100
@@ -10398,6 +10398,27 @@ build_cxx_call (tree fn, int nargs, tree
if (!check_builtin_function_arguments (EXPR_LOCATION (fn), vNULL, fndecl,
 orig_fndecl, nargs, argarray))
return error_mark_node;
+  else if (fndecl_built_in_p (fndecl, BUILT_IN_CLEAR_PADDING))
+   {
+ tree arg0 = argarray[0];
+ STRIP_NOPS (arg0);
+ if (TREE_CODE (arg0) == ADDR_EXPR
+ && DECL_P (TREE_OPERAND (arg0, 0))
+ && same_type_ignoring_top_level_qualifiers_p
+   (TREE_TYPE (TREE_TYPE (argarray[0])),
+TREE_TYPE (TREE_TYPE (arg0))))
+   /* For __builtin_clear_padding (&var) we know the type
+  is for a complete object, so there is no risk in clearing
+  padding that is reused in some derived class member.  */;
+ else if (!trivially_copyable_p (TREE_TYPE (TREE_TYPE (argarray[0]))))
+   {
+ error_at (EXPR_LOC_OR_LOC (argarray[0], input_location),
+   "argument %u in call to function %qE "
+   "has pointer to a non-trivially-copyable type (%qT)",
+   1, fndecl, TREE_TYPE (argarray[0]));
+ return error_mark_node;
+   }
+   }
  }
  
if (VOID_TYPE_P (TREE_TYPE (fn)))

--- gcc/testsuite/g++.dg/cpp2a/builtin-clear-padding1.C.jj  2022-02-11 13:13:49.125471991 +0100
+++ gcc/testsuite/g++.dg/cpp2a/builtin-clear-padding1.C 2022-02-11 13:13:43.403550851 +0100
@@ -0,0 +1,50 @@
+// PR tree-optimization/102586
+// { dg-do compile }
+// { dg-options "-Wno-inaccessible-base" }
+
+struct C0 {};
+struct C1 {};
+struct C2 : C1, virtual C0 {};
+struct C3 : virtual C2, C1 {};
+struct C4 : virtual C3, C1 {};
+struct C5 : C4 {};
+struct C6 { char c; };
+struct C7 : virtual C6, virtual C3, C1 {};
+struct C8 : C7 {};
+
+void
+foo (C0 *c0, C1 *c1, C2 *c2, C3 *c3, C4 *c4, C5 *c5, C6 *c6, C7 *c7, C8 *c8)
+{
+  __builtin_clear_padding (c0);
+  __builtin_clear_padding (c1);
+  __builtin_clear_padding (c2);// { dg-error "argument 1 in call to 
function '__builtin_c

[PATCH] libgomp(OMPD PROJECT): add ICVs debugging information.

2022-03-11 Thread Mohamed Atef via Gcc-patches
Hi,
   This patch contains the function that gets all global ICV information,
and prototypes for the local ICVs.

Notes:
1) gomp_affinity_format_len doesn't have a value so I assumed that
gomp_affinity_format has length 100 for now.

2) I didn't have any knowledge of OpenMP before this project,
so if any of the ICV scopes are wrong please let me know.

I hope to hear from you soon, as we are running out of time.
Thanks



2022-03-12  Mohamed Atef  

*Makefile.am: add ompd-icv to libgompd_la_SOURCES.
*Makefile.in: Regenerate.
*parallel.c: fixed the call of ompd_bp_parallel_begin, and
ompd_bp_parallel_begin.
*ompd-icv.c: New file.
*omp-tools.h.in: fix some writing formats.
*ompd-helper.h: (struct ompd_thread_handle_t, struct
ompd_parallel_handle_t,
struct ompd_task_handle_t, GET_VALUE macro, CHECK macro,
FOREACH_OMPD_ICV macro,
enum ompd_icv,
): Defined
(prototypes of ompd_get_nthread, ompd_get_thread_limit,
ompd_get_run_sched, ompd_get_run_sched_chunk_size,
ompd_get_default_device, ompd_get_dynamic,
ompd_get_max_active_levels, ompd_get_proc_bind,
ompd_is_final,ompd_is_implicit,
ompd_get_team_size, ompd_get_cancellation,
ompd_get_max_task_priority, ompd_get_stacksize, ompd_get_debug,
ompd_get_display_affinity,
ompd_get_affinity_format, ompd_get_affinity_format_len,
ompd_get_wait_policy, ompd_get_num_teams, ompd_get_teams_thread_limit,
ompd_get_spin_count,
ompd_get_available_cpus, ompd_get_throttled_spin_count, and
ompd_get_managed_threads): Added.
*ompd-init.c: GET_VALUE is used instead of a ton of lines.
*libgompd.map: (ompd_enumerate_icvs, ompd_get_icv_from_scope, and
ompd_get_icv_string_from_scope): exported.

 
diff --git a/libgomp/Makefile.am b/libgomp/Makefile.am
index 22a27df105e..20d0d62f473 100644
--- a/libgomp/Makefile.am
+++ b/libgomp/Makefile.am
@@ -93,7 +93,7 @@ libgomp_la_SOURCES = alloc.c atomic.c barrier.c critical.c env.c error.c \
priority_queue.c affinity-fmt.c teams.c allocator.c oacc-profiling.c \
oacc-target.c ompd-support.c
 
-libgompd_la_SOURCES = ompd-init.c ompd-helper.c
+libgompd_la_SOURCES = ompd-init.c ompd-helper.c ompd-icv.c
 include $(top_srcdir)/plugin/Makefrag.am
 
 if USE_FORTRAN
diff --git a/libgomp/Makefile.in b/libgomp/Makefile.in
index 8ecf6dcf192..df0986ee5c2 100644
--- a/libgomp/Makefile.in
+++ b/libgomp/Makefile.in
@@ -223,7 +223,7 @@ am_libgomp_la_OBJECTS = alloc.lo atomic.lo barrier.lo critical.lo \
oacc-target.lo ompd-support.lo $(am__objects_1)
 libgomp_la_OBJECTS = $(am_libgomp_la_OBJECTS)
 libgompd_la_LIBADD =
-am_libgompd_la_OBJECTS = ompd-init.lo ompd-helper.lo
+am_libgompd_la_OBJECTS = ompd-init.lo ompd-helper.lo ompd-icv.lo
 libgompd_la_OBJECTS = $(am_libgompd_la_OBJECTS)
 AM_V_P = $(am__v_P_@AM_V@)
 am__v_P_ = $(am__v_P_@AM_DEFAULT_V@)
@@ -258,15 +258,16 @@ am__depfiles_remade = ./$(DEPDIR)/affinity-fmt.Plo \
./$(DEPDIR)/oacc-mem.Plo ./$(DEPDIR)/oacc-parallel.Plo \
./$(DEPDIR)/oacc-plugin.Plo ./$(DEPDIR)/oacc-profiling.Plo \
./$(DEPDIR)/oacc-target.Plo ./$(DEPDIR)/ompd-helper.Plo \
-   ./$(DEPDIR)/ompd-init.Plo ./$(DEPDIR)/ompd-support.Plo \
-   ./$(DEPDIR)/ordered.Plo ./$(DEPDIR)/parallel.Plo \
-   ./$(DEPDIR)/priority_queue.Plo ./$(DEPDIR)/proc.Plo \
-   ./$(DEPDIR)/ptrlock.Plo ./$(DEPDIR)/scope.Plo \
-   ./$(DEPDIR)/sections.Plo ./$(DEPDIR)/sem.Plo \
-   ./$(DEPDIR)/single.Plo ./$(DEPDIR)/splay-tree.Plo \
-   ./$(DEPDIR)/target.Plo ./$(DEPDIR)/task.Plo \
-   ./$(DEPDIR)/team.Plo ./$(DEPDIR)/teams.Plo \
-   ./$(DEPDIR)/time.Plo ./$(DEPDIR)/work.Plo
+   ./$(DEPDIR)/ompd-icv.Plo ./$(DEPDIR)/ompd-init.Plo \
+   ./$(DEPDIR)/ompd-support.Plo ./$(DEPDIR)/ordered.Plo \
+   ./$(DEPDIR)/parallel.Plo ./$(DEPDIR)/priority_queue.Plo \
+   ./$(DEPDIR)/proc.Plo ./$(DEPDIR)/ptrlock.Plo \
+   ./$(DEPDIR)/scope.Plo ./$(DEPDIR)/sections.Plo \
+   ./$(DEPDIR)/sem.Plo ./$(DEPDIR)/single.Plo \
+   ./$(DEPDIR)/splay-tree.Plo ./$(DEPDIR)/target.Plo \
+   ./$(DEPDIR)/task.Plo ./$(DEPDIR)/team.Plo \
+   ./$(DEPDIR)/teams.Plo ./$(DEPDIR)/time.Plo \
+   ./$(DEPDIR)/work.Plo
 am__mv = mv -f
 COMPILE = $(CC) $(DEFS) $(DEFAULT_INCLUDES) $(INCLUDES) $(AM_CPPFLAGS) \
$(CPPFLAGS) $(AM_CFLAGS) $(CFLAGS)
@@ -605,7 +606,7 @@ libgomp_la_SOURCES = alloc.c atomic.c barrier.c critical.c env.c \
oacc-async.c oacc-plugin.c oacc-cuda.c priority_queue.c \
affinity-fmt.c teams.c allocator.c oacc-profiling.c \
oacc-target.c ompd-support.c $(am__append_3)
-libgompd_la_SOURCES = ompd-init.c ompd-helper.c
+libgompd_la_SOURCES = ompd-init.c ompd-helper.c ompd-icv.c
 
 # Nvidia PTX OpenACC plugin.
 @PLUGIN_NVPTX_TRUE@libgomp_plugin_nvptx_version_info = -version-info $(libtool_VERSION)
@@ -818,6 +819,7 @@ distclean-compile:
 @AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/oacc-profiling.Plo@am__quote@ 
# am--include-ma

New German PO file for 'gcc' (version 12.1-b20220213)

2022-03-11 Thread Translation Project Robot
Hello, gentle maintainer.

This is a message from the Translation Project robot.

A revised PO file for textual domain 'gcc' has been submitted
by the German team of translators.  The file is available at:

https://translationproject.org/latest/gcc/de.po

(This file, 'gcc-12.1-b20220213.de.po', has just now been sent to you in
a separate email.)

All other PO files for your package are available in:

https://translationproject.org/latest/gcc/

Please consider including all of these in your next release, whether
official or a pretest.

Whenever you have a new distribution with a new version number ready,
containing a newer POT file, please send the URL of that distribution
tarball to the address below.  The tarball may be just a pretest or a
snapshot, it does not even have to compile.  It is just used by the
translators when they need some extra translation context.

The following HTML page has been updated:

https://translationproject.org/domain/gcc.html

If any question arises, please contact the translation coordinator.

Thank you for all your work,

The Translation Project robot, in the
name of your translation coordinator.