Re: [PATCH] Split vector load from parm_decl to elemental loads to avoid STLF stalls.

2022-04-01 Thread Hongtao Liu via Gcc-patches
On Fri, Apr 1, 2022 at 2:54 PM Richard Biener via Gcc-patches
 wrote:
>
> On Fri, Apr 1, 2022 at 8:47 AM liuhongt via Gcc-patches
>  wrote:
> >
> > Update in V2:
> > 1. Use get_insns instead of FOR_EACH_BB_CFUN and FOR_BB_INSNS.
> > 2. Return for any_uncondjump_p and ANY_RETURN_P.
> > 3. Add dump info for splitting instruction.
> > 4. Restrict ix86_split_stlf_stall_load under TARGET_SSE2.
> >
> > Since cfg is freed before machine_reorg, just do a rough calculation
> > of the window according to the layout.
> > Also according to an experiment on CLX, set window size to 64.
> >
> > Currently only handle V2DFmode load since it doesn't need any scratch
> > registers, and it's sufficient to recover cray performance for -O2
> > compared to GCC11.
> >
> > gcc/ChangeLog:
> >
> > PR target/101908
> > * config/i386/i386.cc (ix86_split_stlf_stall_load): New
> > function
> > (ix86_reorg): Call ix86_split_stlf_stall_load.
> >
> > gcc/testsuite/ChangeLog:
> >
> > * gcc.target/i386/pr101908-1.c: New test.
> > * gcc.target/i386/pr101908-2.c: New test.
> > ---
> >  gcc/config/i386/i386.cc                    | 60 ++
> >  gcc/testsuite/gcc.target/i386/pr101908-1.c | 12 +
> >  gcc/testsuite/gcc.target/i386/pr101908-2.c | 12 +
> >  3 files changed, 84 insertions(+)
> >  create mode 100644 gcc/testsuite/gcc.target/i386/pr101908-1.c
> >  create mode 100644 gcc/testsuite/gcc.target/i386/pr101908-2.c
> >
> > diff --git a/gcc/config/i386/i386.cc b/gcc/config/i386/i386.cc
> > index 5a561966eb4..c88a689f32b 100644
> > --- a/gcc/config/i386/i386.cc
> > +++ b/gcc/config/i386/i386.cc
> > @@ -21933,6 +21933,64 @@ ix86_seh_fixup_eh_fallthru (void)
> >emit_insn_after (gen_nops (const1_rtx), insn);
> >  }
> >  }
> > +/* Split vector load from parm_decl to elemental loads to avoid STLF
> > +   stalls.  */
> > +static void
> > +ix86_split_stlf_stall_load ()
> > +{
> > +  rtx_insn* insn, *start = get_insns ();
> > +  unsigned window = 0;
> > +
> > +  for (insn = start; insn; insn = NEXT_INSN (insn))
> > +{
> > +  if (!NONDEBUG_INSN_P (insn))
> > +   continue;
> > +  window++;
> > +  /* Insert 64 vaddps %xmm18, %xmm19, %xmm20(no dependence between each
> > +other, just emulate for pipeline) before stalled load, stlf stall
> > +case is as fast as no stall cases on CLX.
> > +Since CFG is freed before machine_reorg, just do a rough
> > +calculation of the window according to the layout.  */
> > +  if (window > 64)
>
> I think we want to turn the '64' into a --param at least.  You can add
>
> -param=x86-stlf-window-ninsns=
>
> into i386.opt (see -param= examples in aarch64/ for example).
Sure.
>
> > +   return;
> > +
> > +  if (any_uncondjump_p (insn)
> > + || ANY_RETURN_P (PATTERN (insn)))
>
> You made a point about calls - does any_uncondjump_p cover them?
>
No, I prefer excluding calls which could take sufficient time to
compensate for the STLF stall.
> otherwise I think this is fine, Honza, do you agree?
>
> Thanks,
> Richard.
>
> > +   return;
> > +
> > +  rtx set = single_set (insn);
> > +  if (!set)
> > +   continue;
> > +  rtx src = SET_SRC (set);
> > +  if (!MEM_P (src)
> > + /* Only handle V2DFmode load since it doesn't need any scratch
> > +register.  */
> > + || GET_MODE (src) != E_V2DFmode
> > + || !MEM_EXPR (src)
> > + || TREE_CODE (get_base_address (MEM_EXPR (src))) != PARM_DECL)
> > +   continue;
> > +
> > +  rtx zero = CONST0_RTX (V2DFmode);
> > +  rtx dest = SET_DEST (set);
> > +  rtx m = adjust_address (src, DFmode, 0);
> > +  rtx loadlpd = gen_sse2_loadlpd (dest, zero, m);
> > +  emit_insn_before (loadlpd, insn);
> > +  m = adjust_address (src, DFmode, 8);
> > +  rtx loadhpd = gen_sse2_loadhpd (dest, dest, m);
> > +  if (dump_file && (dump_flags & TDF_DETAILS))
> > +   {
> > + fputs ("Due to potential STLF stall, split instruction:\n",
> > +dump_file);
> > + print_rtl_single (dump_file, insn);
> > + fputs ("To:\n", dump_file);
> > + print_rtl_single (dump_file, loadlpd);
> > + print_rtl_single (dump_file, loadhpd);
> > +   }
> > +  PATTERN (insn) = loadhpd;
> > +  INSN_CODE (insn) = -1;
> > +  gcc_assert (recog_memoized (insn) != -1);
> > +}
> > +}
> >
> >  /* Implement machine specific optimizations.  We implement padding of 
> > returns
> > for K8 CPUs and pass to avoid 4 jumps in the single 16 byte window.  */
> > @@ -21948,6 +22006,8 @@ ix86_reorg (void)
> >
> >if (optimize && optimize_function_for_speed_p (cfun))
> >  {
> > +  if (TARGET_SSE2)
> > +   ix86_split_stlf_stall_load ();
> >if (TARGET_PAD_SHORT_FUNCTION)
> > ix86_pad_short_function ();
> >else if (TARGET_PAD_RETURNS)
> > diff --git a/gcc/testsuite/gcc.target/i386/pr101908-1.c 
> > b/g

Re: [PATCH] Split vector load from parm_decl to elemental loads to avoid STLF stalls.

2022-04-01 Thread Richard Biener via Gcc-patches
On Fri, Apr 1, 2022 at 9:14 AM Hongtao Liu  wrote:
>
> On Fri, Apr 1, 2022 at 2:54 PM Richard Biener via Gcc-patches
>  wrote:
> >
> > On Fri, Apr 1, 2022 at 8:47 AM liuhongt via Gcc-patches
> >  wrote:
> > >
> > > Update in V2:
> > > 1. Use get_insns instead of FOR_EACH_BB_CFUN and FOR_BB_INSNS.
> > > 2. Return for any_uncondjump_p and ANY_RETURN_P.
> > > 3. Add dump info for splitting instruction.
> > > 4. Restrict ix86_split_stlf_stall_load under TARGET_SSE2.
> > >
> > > Since cfg is freed before machine_reorg, just do a rough calculation
> > > of the window according to the layout.
> > > Also according to an experiment on CLX, set window size to 64.
> > >
> > > Currently only handle V2DFmode load since it doesn't need any scratch
> > > registers, and it's sufficient to recover cray performance for -O2
> > > compared to GCC11.
> > >
> > > gcc/ChangeLog:
> > >
> > > PR target/101908
> > > * config/i386/i386.cc (ix86_split_stlf_stall_load): New
> > > function
> > > (ix86_reorg): Call ix86_split_stlf_stall_load.
> > >
> > > gcc/testsuite/ChangeLog:
> > >
> > > * gcc.target/i386/pr101908-1.c: New test.
> > > * gcc.target/i386/pr101908-2.c: New test.
> > > ---
> > >  gcc/config/i386/i386.cc                    | 60 ++
> > >  gcc/testsuite/gcc.target/i386/pr101908-1.c | 12 +
> > >  gcc/testsuite/gcc.target/i386/pr101908-2.c | 12 +
> > >  3 files changed, 84 insertions(+)
> > >  create mode 100644 gcc/testsuite/gcc.target/i386/pr101908-1.c
> > >  create mode 100644 gcc/testsuite/gcc.target/i386/pr101908-2.c
> > >
> > > diff --git a/gcc/config/i386/i386.cc b/gcc/config/i386/i386.cc
> > > index 5a561966eb4..c88a689f32b 100644
> > > --- a/gcc/config/i386/i386.cc
> > > +++ b/gcc/config/i386/i386.cc
> > > @@ -21933,6 +21933,64 @@ ix86_seh_fixup_eh_fallthru (void)
> > >emit_insn_after (gen_nops (const1_rtx), insn);
> > >  }
> > >  }
> > > +/* Split vector load from parm_decl to elemental loads to avoid STLF
> > > +   stalls.  */
> > > +static void
> > > +ix86_split_stlf_stall_load ()
> > > +{
> > > +  rtx_insn* insn, *start = get_insns ();
> > > +  unsigned window = 0;
> > > +
> > > +  for (insn = start; insn; insn = NEXT_INSN (insn))
> > > +{
> > > +  if (!NONDEBUG_INSN_P (insn))
> > > +   continue;
> > > +  window++;
> > > +  /* Insert 64 vaddps %xmm18, %xmm19, %xmm20(no dependence between 
> > > each
> > > +other, just emulate for pipeline) before stalled load, stlf stall
> > > +case is as fast as no stall cases on CLX.
> > > +Since CFG is freed before machine_reorg, just do a rough
> > > +calculation of the window according to the layout.  */
> > > +  if (window > 64)
> >
> > I think we want to turn the '64' into a --param at least.  You can add
> >
> > -param=x86-stlf-window-ninsns=
> >
> > into i386.opt (see -param= examples in aarch64/ for example).
> Sure.
> >
> > > +   return;
> > > +
> > > +  if (any_uncondjump_p (insn)
> > > + || ANY_RETURN_P (PATTERN (insn)))
> >
> > You made a point about calls - does any_uncondjump_p cover them?
> >
> No, I prefer excluding calls which could take sufficient time to
> compensate for the STLF stall.

So I guess CALL_P (insn) could check for them, I agree we can stop looking
at calls.

> > otherwise I think this is fine, Honza, do you agree?
> >
> > Thanks,
> > Richard.
> >
> > > +   return;
> > > +
> > > +  rtx set = single_set (insn);
> > > +  if (!set)
> > > +   continue;
> > > +  rtx src = SET_SRC (set);
> > > +  if (!MEM_P (src)
> > > + /* Only handle V2DFmode load since it doesn't need any scratch
> > > +register.  */
> > > + || GET_MODE (src) != E_V2DFmode
> > > + || !MEM_EXPR (src)
> > > + || TREE_CODE (get_base_address (MEM_EXPR (src))) != PARM_DECL)
> > > +   continue;
> > > +
> > > +  rtx zero = CONST0_RTX (V2DFmode);
> > > +  rtx dest = SET_DEST (set);
> > > +  rtx m = adjust_address (src, DFmode, 0);
> > > +  rtx loadlpd = gen_sse2_loadlpd (dest, zero, m);
> > > +  emit_insn_before (loadlpd, insn);
> > > +  m = adjust_address (src, DFmode, 8);
> > > +  rtx loadhpd = gen_sse2_loadhpd (dest, dest, m);
> > > +  if (dump_file && (dump_flags & TDF_DETAILS))
> > > +   {
> > > + fputs ("Due to potential STLF stall, split instruction:\n",
> > > +dump_file);
> > > + print_rtl_single (dump_file, insn);
> > > + fputs ("To:\n", dump_file);
> > > + print_rtl_single (dump_file, loadlpd);
> > > + print_rtl_single (dump_file, loadhpd);
> > > +   }
> > > +  PATTERN (insn) = loadhpd;
> > > +  INSN_CODE (insn) = -1;
> > > +  gcc_assert (recog_memoized (insn) != -1);
> > > +}
> > > +}
> > >
> > >  /* Implement machine specific optimizations.  We implement padding of 
> > > returns
> > > for K8 CPUs and pass to avoid 4 jumps in the single 16

[PATCH V3] Split vector load from parm_decl to elemental loads to avoid STLF stalls.

2022-04-01 Thread liuhongt via Gcc-patches
Update in V3:
1. Add -param=x86-stlf-window-ninsns= (default 64).
2. Exclude call in the window.

Since the CFG is freed before machine_reorg, the window is only estimated
roughly from the insn layout.
Based on an experiment on CLX, the window size is set to 64.
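
To make the window logic easier to follow, here is a minimal standalone
sketch of the scan on a toy instruction model.  The Kind enum and
find_split_candidates are hypothetical names invented for illustration;
they are not GCC APIs and only mirror the shape of the loop in the patch
(count non-debug insns, stop once the window is exceeded or at a
jump/return/call, collect qualifying loads).

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy stand-in for an RTL insn; these kinds are illustrative only.
enum class Kind { Debug, Other, Jump, Return, Call, V2DFParmLoad };

// Return the indices of the loads that would be split.  Debug insns are
// skipped and do not count towards the window.
std::vector<std::size_t>
find_split_candidates (const std::vector<Kind> &insns, unsigned window_limit)
{
  std::vector<std::size_t> result;
  unsigned window = 0;
  for (std::size_t i = 0; i < insns.size (); ++i)
    {
      Kind k = insns[i];
      if (k == Kind::Debug)
        continue;
      if (++window > window_limit)
        break;   // enough work has passed to hide the stall
      if (k == Kind::Jump || k == Kind::Return || k == Kind::Call)
        break;   // layout no longer approximates execution order
      if (k == Kind::V2DFParmLoad)
        result.push_back (i);
    }
  return result;
}
```

With a limit of 64, a load that follows a call is not reported, because the
scan stops at the call.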

Currently only V2DFmode loads are handled, since they don't need any scratch
registers, and that is sufficient to recover cray performance for -O2
compared to GCC 11.
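
For readers less familiar with the two insns, the split can be sketched at
the source level with SSE2 intrinsics.  This is only an illustration under
the assumption that _mm_load_sd and _mm_loadh_pd correspond to the low and
high element loads generated by the patch; V2, load_whole and load_split are
made-up names, and a scalar fallback is used when SSE2 is unavailable.

```cpp
#include <cassert>
#if defined(__SSE2__)
#include <emmintrin.h>
#endif

struct V2 { double lo, hi; };

// One 16-byte load.  If p[0] and p[1] were just written by two separate
// 8-byte stores, store forwarding may fail for this load.
V2 load_whole (const double *p)
{
#if defined(__SSE2__)
  double out[2];
  _mm_storeu_pd (out, _mm_loadu_pd (p));
  return V2{out[0], out[1]};
#else
  return V2{p[0], p[1]};
#endif
}

// Two 8-byte loads, each of which can be forwarded from a matching store.
V2 load_split (const double *p)
{
#if defined(__SSE2__)
  __m128d v = _mm_load_sd (p);      // low element, upper element zeroed
  v = _mm_loadh_pd (v, p + 1);      // fill in the high element
  double out[2];
  _mm_storeu_pd (out, v);
  return V2{out[0], out[1]};
#else
  return V2{p[0], p[1]};
#endif
}
```

Both functions produce the same value; only the width of the memory
accesses differs.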

gcc/ChangeLog:

PR target/101908
* config/i386/i386.cc (ix86_split_stlf_stall_load): New
function
(ix86_reorg): Call ix86_split_stlf_stall_load.
* config/i386/i386.opt (-param=x86-stlf-window-ninsns=): New
param.

gcc/testsuite/ChangeLog:

* gcc.target/i386/pr101908-1.c: New test.
* gcc.target/i386/pr101908-2.c: New test.
* gcc.target/i386/pr101908-3.c: New test.
---
 gcc/config/i386/i386.cc                    | 61 ++
 gcc/config/i386/i386.opt                   |  4 ++
 gcc/testsuite/gcc.target/i386/pr101908-1.c | 12 +
 gcc/testsuite/gcc.target/i386/pr101908-2.c | 12 +
 gcc/testsuite/gcc.target/i386/pr101908-3.c | 14 +
 5 files changed, 103 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/i386/pr101908-1.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr101908-2.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr101908-3.c

diff --git a/gcc/config/i386/i386.cc b/gcc/config/i386/i386.cc
index 5a561966eb4..3f8a2c7932d 100644
--- a/gcc/config/i386/i386.cc
+++ b/gcc/config/i386/i386.cc
@@ -21933,6 +21933,65 @@ ix86_seh_fixup_eh_fallthru (void)
   emit_insn_after (gen_nops (const1_rtx), insn);
 }
 }
+/* Split vector load from parm_decl to elemental loads to avoid STLF
+   stalls.  */
+static void
+ix86_split_stlf_stall_load ()
+{
+  rtx_insn* insn, *start = get_insns ();
+  unsigned window = 0;
+
+  for (insn = start; insn; insn = NEXT_INSN (insn))
+{
+  if (!NONDEBUG_INSN_P (insn))
+   continue;
+  window++;
+  /* With 64 independent vaddps %xmm18, %xmm19, %xmm20 instructions
+(no dependence between each other, just emulating pipeline work)
+inserted before the stalled load, the STLF stall case is as fast as
+the no-stall case on CLX.
+Since the CFG is freed before machine_reorg, just do a rough
+calculation of the window according to the layout.  */
+  if (window > (unsigned) x86_stlf_window_ninsns)
+   return;
+
+  if (any_uncondjump_p (insn)
+ || ANY_RETURN_P (PATTERN (insn))
+ || CALL_P (insn))
+   return;
+
+  rtx set = single_set (insn);
+  if (!set)
+   continue;
+  rtx src = SET_SRC (set);
+  if (!MEM_P (src)
+ /* Only handle V2DFmode load since it doesn't need any scratch
+register.  */
+ || GET_MODE (src) != E_V2DFmode
+ || !MEM_EXPR (src)
+ || TREE_CODE (get_base_address (MEM_EXPR (src))) != PARM_DECL)
+   continue;
+
+  rtx zero = CONST0_RTX (V2DFmode);
+  rtx dest = SET_DEST (set);
+  rtx m = adjust_address (src, DFmode, 0);
+  rtx loadlpd = gen_sse2_loadlpd (dest, zero, m);
+  emit_insn_before (loadlpd, insn);
+  m = adjust_address (src, DFmode, 8);
+  rtx loadhpd = gen_sse2_loadhpd (dest, dest, m);
+  if (dump_file && (dump_flags & TDF_DETAILS))
+   {
+ fputs ("Due to potential STLF stall, split instruction:\n",
+dump_file);
+ print_rtl_single (dump_file, insn);
+ fputs ("To:\n", dump_file);
+ print_rtl_single (dump_file, loadlpd);
+ print_rtl_single (dump_file, loadhpd);
+   }
+  PATTERN (insn) = loadhpd;
+  INSN_CODE (insn) = -1;
+  gcc_assert (recog_memoized (insn) != -1);
+}
+}
 
 /* Implement machine specific optimizations.  We implement padding of returns
for K8 CPUs and pass to avoid 4 jumps in the single 16 byte window.  */
@@ -21948,6 +22007,8 @@ ix86_reorg (void)
 
   if (optimize && optimize_function_for_speed_p (cfun))
 {
+  if (TARGET_SSE2)
+   ix86_split_stlf_stall_load ();
   if (TARGET_PAD_SHORT_FUNCTION)
ix86_pad_short_function ();
   else if (TARGET_PAD_RETURNS)
diff --git a/gcc/config/i386/i386.opt b/gcc/config/i386/i386.opt
index d8e8656a8ab..a6b0e28f238 100644
--- a/gcc/config/i386/i386.opt
+++ b/gcc/config/i386/i386.opt
@@ -1210,3 +1210,7 @@ Support MMX, SSE, SSE2, SSE3, SSSE3, SSE4.1, SSE4.2, AVX, 
AVX2, AVX512F and AVX5
 mdirect-extern-access
 Target Var(ix86_direct_extern_access) Init(1)
 Do not use GOT to access external symbols.
+
+-param=x86-stlf-window-ninsns=
+Target Joined UInteger Var(x86_stlf_window_ninsns) Init(64) Param
+Number of instructions above which the STLF stall penalty can be compensated.
diff --git a/gcc/testsuite/gcc.target/i386/pr101908-1.c 
b/gcc/testsuite/gcc.target/i386/pr101908-1.c
new file mode 100644
index 000..33d9684f0ad
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr101908-1.c
@@ -0,0 +1,12 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -msse2 -mn

[PATCH] sh: Fix up __attribute__((optimize ("Os"))) handling on SH [PR105069]

2022-04-01 Thread Jakub Jelinek via Gcc-patches
Hi!

As mentioned in the PR, various tests on sh-elf ICE like:
make check-gcc RUNTESTFLAGS="compile.exp='pr104327.c pr58332.c pr81360.c 
pr84425.c'"
FAIL: gcc.c-torture/compile/pr104327.c   -O0  (internal compiler error: 
'global_options' are modified in local context)
FAIL: gcc.c-torture/compile/pr104327.c   -O0  (test for excess errors)
FAIL: gcc.c-torture/compile/pr104327.c   -O1  (internal compiler error: 
'global_options' are modified in local context)
FAIL: gcc.c-torture/compile/pr104327.c   -O1  (test for excess errors)
FAIL: gcc.c-torture/compile/pr104327.c   -O2  (internal compiler error: 
'global_options' are modified in local context)
FAIL: gcc.c-torture/compile/pr104327.c   -O2  (test for excess errors)
FAIL: gcc.c-torture/compile/pr104327.c   -O3 -g  (internal compiler error: 
'global_options' are modified in local context)
FAIL: gcc.c-torture/compile/pr104327.c   -O3 -g  (test for excess errors)
FAIL: gcc.c-torture/compile/pr104327.c   -Os  (test for excess errors)
FAIL: gcc.c-torture/compile/pr58332.c   -O0  (test for excess errors)
FAIL: gcc.c-torture/compile/pr58332.c   -O1  (internal compiler error: 
'global_options' are modified in local context)
FAIL: gcc.c-torture/compile/pr58332.c   -O1  (test for excess errors)
FAIL: gcc.c-torture/compile/pr58332.c   -O2  (internal compiler error: 
'global_options' are modified in local context)
FAIL: gcc.c-torture/compile/pr58332.c   -O2  (test for excess errors)
FAIL: gcc.c-torture/compile/pr58332.c   -O3 -g  (internal compiler error: 
'global_options' are modified in local context)
FAIL: gcc.c-torture/compile/pr58332.c   -O3 -g  (test for excess errors)
FAIL: gcc.c-torture/compile/pr58332.c   -Os  (internal compiler error: 
'global_options' are modified in local context)
FAIL: gcc.c-torture/compile/pr58332.c   -Os  (test for excess errors)
FAIL: gcc.c-torture/compile/pr81360.c   -O0  (test for excess errors)
FAIL: gcc.c-torture/compile/pr81360.c   -O1  (internal compiler error: 
'global_options' are modified in local context)
FAIL: gcc.c-torture/compile/pr81360.c   -O1  (test for excess errors)
FAIL: gcc.c-torture/compile/pr81360.c   -O2  (internal compiler error: 
'global_options' are modified in local context)
FAIL: gcc.c-torture/compile/pr81360.c   -O2  (test for excess errors)
FAIL: gcc.c-torture/compile/pr81360.c   -O3 -g  (internal compiler error: 
'global_options' are modified in local context)
FAIL: gcc.c-torture/compile/pr81360.c   -O3 -g  (test for excess errors)
FAIL: gcc.c-torture/compile/pr81360.c   -Os  (internal compiler error: 
'global_options' are modified in local context)
FAIL: gcc.c-torture/compile/pr81360.c   -Os  (test for excess errors)
FAIL: gcc.c-torture/compile/pr84425.c   -O0  (test for excess errors)
FAIL: gcc.c-torture/compile/pr84425.c   -O1  (internal compiler error: 
'global_options' are modified in local context)
FAIL: gcc.c-torture/compile/pr84425.c   -O1  (test for excess errors)
FAIL: gcc.c-torture/compile/pr84425.c   -O2  (internal compiler error: 
'global_options' are modified in local context)
FAIL: gcc.c-torture/compile/pr84425.c   -O2  (test for excess errors)
FAIL: gcc.c-torture/compile/pr84425.c   -O3 -g  (internal compiler error: 
'global_options' are modified in local context)
FAIL: gcc.c-torture/compile/pr84425.c   -O3 -g  (test for excess errors)
FAIL: gcc.c-torture/compile/pr84425.c   -Os  (internal compiler error: 
'global_options' are modified in local context)
FAIL: gcc.c-torture/compile/pr84425.c   -Os  (test for excess errors)
With the following patch, none of those tests ICE anymore, though
pr104327.c still FAILs with:
Excess errors:
/usr/src/gcc/gcc/testsuite/gcc.c-torture/compile/pr104327.c:6:1: error: 
inlining failed in call to 'always_inline' 'bar': target specific option 
mismatch
I think that would be fixable by overriding TARGET_CAN_INLINE_P
hook and allowing at least for always_inline changes in sh_div_str.

Is the following patch ok for trunk as at least a small step forward?

2022-03-31  Jakub Jelinek  

PR target/105069
* config/sh/sh.opt (mdiv=): Add Save.

--- gcc/config/sh/sh.opt.jj 2022-01-11 23:11:21.990295775 +0100
+++ gcc/config/sh/sh.opt 2022-03-31 09:43:45.916244944 +0200
@@ -207,7 +207,7 @@ Target RejectNegative Mask(ALIGN_DOUBLE)
 Align doubles at 64-bit boundaries.
 
 mdiv=
-Target RejectNegative Joined Var(sh_div_str) Init("")
+Target Save RejectNegative Joined Var(sh_div_str) Init("")
 Division strategy, one of: call-div1, call-fp, call-table.
 
 mdivsi3_libfunc=

Jakub



[PATCH] testsuite: Add further zero size elt passing tests [PR102024]

2022-04-01 Thread Jakub Jelinek via Gcc-patches
Hi!

As discussed in PR102024, zero width bitfields might not be the only ones
causing ABI issues at least on mips, zero size arrays or (in C only) zero
sized (empty) structures can be problematic too.
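
As a small illustration of why such members are interesting for argument
passing, the sketch below shows that zero-length arrays occupy no storage.
It assumes the GNU zero-length array extension as accepted by GCC and Clang;
the struct names are invented here and only resemble the T(8) and T(9)
layouts from the patch.  In C++ an empty struct has non-zero size, so only a
zero-length array of it is zero sized there.

```cpp
#include <cassert>
#include <cstddef>

// GNU extension: zero-length arrays take no space in the aggregate.
struct LeadingZeroArray { int pad[0]; float b; };
struct MiddleZeroArray { double a; long long pad[0]; double c; };

// Size of the aggregate with a leading zero-length array.
std::size_t leading_size () { return sizeof (LeadingZeroArray); }

// Offset of the field that follows the zero-length array.
std::size_t middle_offset_of_c () { return offsetof (MiddleZeroArray, c); }
```

The layout is the same as without the member, but the field is still present
in the type, so a backend that walks the fields can classify the aggregate
differently.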

The following patch adds some coverage for it too.

Tested on x86_64-linux with
make check-gcc check-g++ RUNTESTFLAGS='ALT_CC_UNDER_TEST=gcc 
ALT_CXX_UNDER_TEST=g++ --target_board=unix\{-m32,-m64\} compat.exp=pr102024*'
make check-gcc check-g++ RUNTESTFLAGS='ALT_CC_UNDER_TEST=clang 
ALT_CXX_UNDER_TEST=clang++ --target_board=unix\{-m32,-m64\} 
compat.exp=pr102024*'
with gcc/g++ 10.3 and clang 11.  Everything but (expectedly)
FAIL: gcc.dg/compat/pr102024 c_compat_x_tst.o-c_compat_y_alt.o execute 
FAIL: gcc.dg/compat/pr102024 c_compat_x_alt.o-c_compat_y_tst.o execute 
for -m64 ALT_CC_UNDER_TEST=gcc passes.

Ok for trunk?

2022-03-31  Jakub Jelinek  

PR target/102024
* gcc.dg/compat/pr102024_test.h: Add further tests with zero sized
structures and arrays.
* g++.dg/compat/pr102024_test.h: Add further tests with zero sized
arrays.

--- gcc/testsuite/gcc.dg/compat/pr102024_test.h.jj  2022-03-24 12:24:41.625100842 +0100
+++ gcc/testsuite/gcc.dg/compat/pr102024_test.h 2022-03-31 17:36:33.486710917 +0200
@@ -4,3 +4,9 @@ T(2,int:0;float a;,F(2,a,2.25f,16.5f))
 T(3,double a;long long:0;double b;,F(3,a,42.0,43.125)F(3,b,-17.5,35.75))
 T(4,double a;long long:0;,F(4,a,1.0,17.125))
 T(5,long long:0;double a;,F(5,a,2.25,16.5))
+T(6,float a;struct{}b;float c;,F(6,a,42.0f,43.125f)F(6,c,-17.5f,35.75f))
+T(7,float a;struct{}b[0];;,F(7,a,1.0f,17.125f))
+T(8,int a[0];float b;,F(8,b,2.25f,16.5f))
+T(9,double a;long long b[0];double c;,F(9,a,42.0,43.125)F(9,c,-17.5,35.75))
+T(10,double a;struct{}b;,F(10,a,1.0,17.125))
+T(11,struct{}a[0];double b;,F(11,b,2.25,16.5))
--- gcc/testsuite/g++.dg/compat/pr102024_test.h.jj  2022-03-24 12:24:41.625100842 +0100
+++ gcc/testsuite/g++.dg/compat/pr102024_test.h 2022-03-31 17:37:30.763877562 +0200
@@ -4,3 +4,9 @@ T(2,int:0;float a;,F(2,a,2.25f,16.5f))
 T(3,double a;long long:0;double b;,F(3,a,42.0,43.125)F(3,b,-17.5,35.75))
 T(4,double a;long long:0;,F(4,a,1.0,17.125))
 T(5,long long:0;double a;,F(5,a,2.25,16.5))
+T(6,float a;struct{}b[0];float c;,F(6,a,42.0f,43.125f)F(6,c,-17.5f,35.75f))
+T(7,float a;struct{}b[0];;,F(7,a,1.0f,17.125f))
+T(8,int a[0];float b;,F(8,b,2.25f,16.5f))
+T(9,double a;long long b[0];double c;,F(9,a,42.0,43.125)F(9,c,-17.5,35.75))
+T(10,double a;struct{}b[0];,F(10,a,1.0,17.125))
+T(11,struct{}a[0];double b;,F(11,b,2.25,16.5))

Jakub



Re: [PATCH] mips: Ignore zero width bitfields in arguments and issue -Wpsabi warning about C zero-width bit-field ABI changes [PR102024]

2022-04-01 Thread Jakub Jelinek via Gcc-patches
On Fri, Apr 01, 2022 at 12:27:45AM +0800, Xi Ruoyao wrote:
> --- a/gcc/config/mips/mips.cc
> +++ b/gcc/config/mips/mips.cc
> @@ -6042,11 +6042,26 @@ mips_function_arg (cumulative_args_t cum_v, const 
> function_arg_info &arg)
> for (i = 0; i < info.reg_words; i++)
>   {
> rtx reg;
> +   int has_zero_width_bf_abi_change = 0;

If something has just 0/1 value, perhaps better use bool and false/true
for the values?

> for (; field; field = DECL_CHAIN (field))
> - if (TREE_CODE (field) == FIELD_DECL
> - && int_bit_position (field) >= bitpos)
> -   break;
> + {
> +   if (TREE_CODE (field) != FIELD_DECL)
> + continue;
> +
> +   /* Ignore zero-width bit-fields.  And, if the ignored
> +  field is not from C++, it may be an ABI change.  */
> +   if (DECL_FIELD_CXX_ZERO_WIDTH_BIT_FIELD (field))
> + continue;

While the above is only about zero width bit-fields from the C++ FE,

> +   if (integer_zerop (DECL_SIZE (field)))

this matches both zero width bitfields and zero sized structures
and zero length arrays.  While I believe the same
arguments as for zero width bitfields apply for those and it is
interesting to see what, say, LLVM does with those (see the
compat testsuite change I've posted today), either the diagnostics
should be worded so that it covers both, or this could be
a 0/1/2 value flag or enum which then decides how to word the
diagnostics.
Including the _bf in the name is misleading.
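
A rough standalone sketch of the enum variant, just to show the shape.
FieldInfo, ZeroSizeKind, classify and abi_note are hypothetical names, and
the message strings are placeholders rather than proposed diagnostic
wording:

```cpp
#include <cassert>
#include <string>

// Toy description of a field; not a GCC tree.
struct FieldInfo { bool is_bit_field; unsigned size_bits; };

enum class ZeroSizeKind { None, ZeroWidthBitField, OtherZeroSized };

// Distinguish zero-width bit-fields from other zero-sized members
// (empty structures, zero-length arrays).
ZeroSizeKind classify (const FieldInfo &f)
{
  if (f.size_bits != 0)
    return ZeroSizeKind::None;
  return f.is_bit_field ? ZeroSizeKind::ZeroWidthBitField
                        : ZeroSizeKind::OtherZeroSized;
}

// Pick the wording from the kind; the text here is a placeholder.
std::string abi_note (ZeroSizeKind k)
{
  switch (k)
    {
    case ZeroSizeKind::ZeroWidthBitField:
      return "zero-width bit-field";
    case ZeroSizeKind::OtherZeroSized:
      return "zero-sized field";
    default:
      return "";
    }
}
```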
> + {
> +   has_zero_width_bf_abi_change = 1;
> +   continue;
> + }

> +   static const char *url =
> + CHANGES_ROOT_URL
> + "gcc-12/changes.html#zero_width_bitfields";

Formatting guidelines say that the = should go on the next line.

> +   inform (input_location,
> +   "the ABI for passing a value containing "
> +   "zero-width bit-fields before an adjacent "
> +   "64-bit floating-point field was retconned "
> +   "in GCC %{12.1%}", url);

Not a native speaker, but "retconned" seems weird; it would be better to
match the wording used in other targets.

> +   last_reported_type_uid = uid;
> + }
> + }
>  
> XVECEXP (ret, 0, i)
>   = gen_rtx_EXPR_LIST (VOIDmode, reg,

Jakub



Re: [PATCH] mips: Emit psabi diagnostic for return values affected by C++ zero-width bit-field ABI change [PR 102024]

2022-04-01 Thread Jakub Jelinek via Gcc-patches
On Fri, Apr 01, 2022 at 12:13:30AM +0800, Xi Ruoyao wrote:
> --- a/gcc/config/mips/mips.cc
> +++ b/gcc/config/mips/mips.cc
> @@ -6274,10 +6274,17 @@ mips_callee_copies (cumulative_args_t, const 
> function_arg_info &arg)
>  
> For n32 & n64, a structure with one or two fields is returned in
> floating-point registers as long as every field has a floating-point
> -   type.  */
> +   type.
> +
> +   The C++ FE used to remove zero-width bit-fields in GCC 11 and earlier.
> +   To make a proper diagnostic, this function will set HAS_ZERO_WIDTH_BF
> +   to 1 once a C++ zero-width bit-field shows up, and then ignore it.
> +   Then the caller can determine if this zero-width bit-field will make a
> +   difference and emit a -Wpsabi inform.  */
>  
>  static int
> -mips_fpr_return_fields (const_tree valtype, tree *fields)
> +mips_fpr_return_fields (const_tree valtype, tree *fields,
> + int *has_zero_width_bf)

Use bool * and = true ?

> @@ -6319,11 +6332,14 @@ static bool
>  mips_return_in_msb (const_tree valtype)
>  {
>tree fields[2];
> +  int has_zero_width_bf = 0;

I think it would be better to match more closely what the code did before.
  if (!TARGET_NEWABI || !TARGET_BIG_ENDIAN || !AGGREGATE_TYPE_P (valtype))
return false;

  tree fields[2];
  bool has_zero_width_bf = false;
  return (mips_fpr_return_fields (valtype, fields, &has_zero_width_bf) == 0
  || has_zero_width_bf);
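
To show the suggested control flow in isolation, here is a toy model that
can be compiled on its own.  FieldKind, fpr_return_fields and return_in_msb
are simplified stand-ins written for this sketch, not the real mips.cc
functions; the single bool parameter stands for the combined
TARGET_NEWABI / TARGET_BIG_ENDIAN / AGGREGATE_TYPE_P test.

```cpp
#include <cassert>
#include <vector>

enum class FieldKind { Float, ZeroWidthBitField, Other };

// Toy counterpart of mips_fpr_return_fields: return the number of
// floating-point fields (1 or 2) if the aggregate qualifies, else 0.
// Zero-width bit-fields are ignored but recorded through the flag.
int fpr_return_fields (const std::vector<FieldKind> &fields,
                       bool *has_zero_width_bf)
{
  int n = 0;
  for (FieldKind k : fields)
    {
      if (k == FieldKind::ZeroWidthBitField)
        {
          *has_zero_width_bf = true;
          continue;
        }
      if (k != FieldKind::Float || n == 2)
        return 0;
      ++n;
    }
  return n;
}

// Early return first, then the same expression as before plus the flag.
bool return_in_msb (bool newabi_big_endian_aggregate,
                    const std::vector<FieldKind> &fields)
{
  if (!newabi_big_endian_aggregate)
    return false;

  bool has_zero_width_bf = false;
  return (fpr_return_fields (fields, &has_zero_width_bf) == 0
          || has_zero_width_bf);
}
```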

> +  int use_fpr = mips_fpr_return_fields (valtype, fields,
> + &has_zero_width_bf);
>  
>return (TARGET_NEWABI
> && TARGET_BIG_ENDIAN
> && AGGREGATE_TYPE_P (valtype)
> -   && mips_fpr_return_fields (valtype, fields) == 0);
> +   && (use_fpr == 0 || has_zero_width_bf));
>  }
>  
>  /* Return true if the function return value MODE will get returned in a
> @@ -6418,8 +6434,35 @@ mips_function_value_1 (const_tree valtype, const_tree 
> fn_decl_or_type,
>return values, promote the mode here too.  */
>mode = promote_function_mode (valtype, mode, &unsigned_p, func, 1);
>  
> +  int has_zero_width_bf = 0;

See above for int -> bool and 0 -> false.

> +  int use_fpr = mips_fpr_return_fields (valtype, fields,
> + &has_zero_width_bf);
> +  if (TARGET_HARD_FLOAT &&
> +   warn_psabi &&
> +   has_zero_width_bf &&
> +   use_fpr != 0)

&&s need to go at the start of next line, not end of line.

> + {
> +   static unsigned last_reported_type_uid;
> +   unsigned uid = TYPE_UID (TYPE_MAIN_VARIANT (valtype));
> +   if (uid != last_reported_type_uid)
> + {
> +   static const char *url =

Similarly for =

> + CHANGES_ROOT_URL
> + "gcc-12/changes.html#zero_width_bitfields";
> +   inform (input_location,
> +   "the ABI for returning a value containing "
> +   "zero-width bit-fields but otherwise an aggregate "
> +   "with only one or two floating-point fields was "
> +   "retconned in GCC %{12.1%}", url);

In this diagnostics it is ok to talk about bit-fields as that is
the only thing that changes (compared to the other patch).
But again, was retconned -> changed ?

Jakub



[PATCH, OpenMP] Fix nested use_device_ptr

2022-04-01 Thread Chung-Lin Tang

Hi Jakub,
this patch fixes a bug in lower_omp_target, where for Fortran arrays,
the expanded sender assignment is wrongly using the variable in the
current ctx, instead of the one looked-up outside, which is causing
use_device_ptr/addr to fail to work when used inside an omp-parallel
(where the omp child_fn is split away from the original).
Just a one-character change to fix this.

The fix is inside omp-low.cc, though because the omp_array_data langhook
is used only by Fortran, this is essentially Fortran-specific.

Tested on x86_64-linux + nvptx offloading without regressions.
This is probably not a regression, so I am seeking approval to commit it when stage1 opens.

Thanks,
Chung-Lin

2022-04-01  Chung-Lin Tang  

gcc/ChangeLog:

* omp-low.cc (lower_omp_target): Use outer context looked-up 'var' as
argument to lang_hooks.decls.omp_array_data, instead of 'ovar' from
current clause.

libgomp/ChangeLog:

* testsuite/libgomp.fortran/use_device_ptr-4.f90: New testcase.

diff --git a/gcc/omp-low.cc b/gcc/omp-low.cc
index 392bb18..bf5779b 100644
--- a/gcc/omp-low.cc
+++ b/gcc/omp-low.cc
@@ -13405,7 +13405,7 @@ lower_omp_target (gimple_stmt_iterator *gsi_p, 
omp_context *ctx)
 
type = TREE_TYPE (ovar);
if (lang_hooks.decls.omp_array_data (ovar, true))
- var = lang_hooks.decls.omp_array_data (ovar, false);
+ var = lang_hooks.decls.omp_array_data (var, false);
else if (((OMP_CLAUSE_CODE (c) == OMP_CLAUSE_USE_DEVICE_ADDR
  || OMP_CLAUSE_CODE (c) == OMP_CLAUSE_HAS_DEVICE_ADDR)
  && !omp_privatize_by_reference (ovar)
diff --git a/libgomp/testsuite/libgomp.fortran/use_device_ptr-4.f90 
b/libgomp/testsuite/libgomp.fortran/use_device_ptr-4.f90
new file mode 100644
index 000..8c361d1
--- /dev/null
+++ b/libgomp/testsuite/libgomp.fortran/use_device_ptr-4.f90
@@ -0,0 +1,41 @@
+! { dg-do run }
+!
+! Test use_device_ptr nested within another parallel
+! construct
+!
+program test_nested_use_device_ptr
+  use iso_c_binding, only: c_loc, c_ptr
+  implicit none
+  real, allocatable, target :: arr(:,:)
+  integer :: width = 1024, height = 1024, i
+  type(c_ptr) :: devptr
+
+  allocate(arr(width,height))
+
+  !$omp target enter data map(alloc: arr)
+
+  !$omp target data use_device_ptr(arr)
+  devptr = c_loc(arr(1,1))
+  !$omp end target data
+
+  !$omp parallel default(none) shared(arr, devptr)
+  !$omp single
+
+  !$omp target data use_device_ptr(arr)
+  call thing(c_loc(arr), devptr)
+  !$omp end target data
+
+  !$omp end single
+  !$omp end parallel
+  !$omp target exit data map(delete: arr)
+
+contains
+
+  subroutine thing(myarr, devptr)
+use iso_c_binding, only: c_ptr, c_associated
+implicit none
+type(c_ptr) :: myarr, devptr
+if (.not.c_associated(myarr, devptr)) stop 1
+  end subroutine thing
+
+end program


[committed] contrib: Fix up spelling of loongarch-str.h dependency [PR105114]

2022-04-01 Thread Jakub Jelinek via Gcc-patches
Hi!

As found by Joseph, the dependency of
gcc/config/loongarch/loongarch-str.h is spelled incorrectly,
it should be
gcc/config/loongarch/genopts/loongarch-strings
but was using
gcc/config/loongarch/genopts/loongarch-string

Committed to trunk as obvious.

2022-03-31  Jakub Jelinek  
Joseph Myers  

PR other/105114
* gcc_update: Fix up spelling of
gcc/config/loongarch/genopts/loongarch-strings dependency.

--- contrib/gcc_update.jj
+++ contrib/gcc_update
@@ -86,7 +86,7 @@ gcc/config/arm/arm-tables.opt: gcc/config/arm/arm-cpus.in 
gcc/config/arm/parsecp
 gcc/config/c6x/c6x-tables.opt: gcc/config/c6x/c6x-isas.def 
gcc/config/c6x/genopt.sh
 gcc/config/c6x/c6x-sched.md: gcc/config/c6x/c6x-sched.md.in 
gcc/config/c6x/gensched.sh
 gcc/config/c6x/c6x-mult.md: gcc/config/c6x/c6x-mult.md.in 
gcc/config/c6x/genmult.sh
-gcc/config/loongarch/loongarch-str.h: gcc/config/loongarch/genopts/genstr.sh 
gcc/config/loongarch/genopts/loongarch-string
+gcc/config/loongarch/loongarch-str.h: gcc/config/loongarch/genopts/genstr.sh 
gcc/config/loongarch/genopts/loongarch-strings
 gcc/config/loongarch/loongarch.opt: gcc/config/loongarch/genopts/genstr.sh 
gcc/config/loongarch/genopts/loongarch.opt.in
 gcc/config/m68k/m68k-tables.opt: gcc/config/m68k/m68k-devices.def 
gcc/config/m68k/m68k-isas.def gcc/config/m68k/m68k-microarchs.def 
gcc/config/m68k/genopt.sh
 gcc/config/mips/mips-tables.opt: gcc/config/mips/mips-cpus.def 
gcc/config/mips/genopt.sh

Jakub



Re: [PATCH, OpenMP] Fix nested use_device_ptr

2022-04-01 Thread Jakub Jelinek via Gcc-patches
On Fri, Apr 01, 2022 at 05:02:36PM +0800, Chung-Lin Tang wrote:
> this patch fixes a bug in lower_omp_target, where for Fortran arrays,
> the expanded sender assignment is wrongly using the variable in the
> current ctx, instead of the one looked-up outside, which is causing
> use_device_ptr/addr to fail to work when used inside an omp-parallel
> (where the omp child_fn is split away from the original).
> Just a one-character change to fix this.
> 
> The fix is inside omp-low.cc, though because the omp_array_data langhook
> is used only by Fortran, this is essentially Fortran-specific.
> 
> Tested on x86_64-linux + nvptx offloading without regressions.
> This is probably not a regression, but seeking to commit when stage1 opens.
> 
> Thanks,
> Chung-Lin
> 
> 2022-04-01  Chung-Lin Tang  
> 
> gcc/ChangeLog:
> 
>   * omp-low.cc (lower_omp_target): Use outer context looked-up 'var' as
>   argument to lang_hooks.decls.omp_array_data, instead of 'ovar' from
>   current clause.
>   
> libgomp/ChangeLog:
> 
>   * testsuite/libgomp.fortran/use_device_ptr-4.f90: New testcase.

Ok, thanks.

Jakub



[PATCH] phiopt: Improve value_replacement [PR104645]

2022-04-01 Thread Jakub Jelinek via Gcc-patches
Hi!

The following patch fixes the P1 regression by reusing existing
value_replacement code.  That function already has code to
handle simple preparation statements (casts, and +,&,|,^ binary
assignments) before a final binary assignment (which can be a
much wider range of ops).  When we have e.g.
  if (y_3(D) == 0)
goto ;
  else
goto ;
 :
  y_4 = y_3(D) & 31;
  _1 = (int) y_4;
  _6 = x_5(D) r<< _1;
 :
  # _2 = PHI 
the preparation statements y_4 = y_3(D) & 31; and
_1 = (int) y_4; are handled by constant evaluation, passing through
y_3(D) = 0 initially and propagating that through the assignments
with checking that UB isn't invoked.  But the final
_6 = x_5(D) r<< _1; assign is handled differently, either through
neutral_element_p or absorbing_element_p.
In the first function below we now have:
   [local count: 1073741824]:
  if (i_2(D) != 0)
goto ; [50.00%]
  else
goto ; [50.00%]

   [local count: 536870913]:
  _3 = i_2(D) & 1;
  iftmp.0_4 = (int) _3;

   [local count: 1073741824]:
  # iftmp.0_1 = PHI 
where in GCC 11 we had:
   :
  if (i_3(D) != 0)
goto ; [INV]
  else
goto ; [INV]

   :
  i.1_1 = (int) i_3(D);
  iftmp.0_5 = i.1_1 & 1;

   :
  # iftmp.0_2 = PHI 
Current value_replacement can handle the latter as the last
stmt of middle_bb is a binary op that in this case satisfies
absorbing_element_p.
But the former we can't handle, as the last stmt in middle_bb
is a cast.

The patch makes it work in that case by pretending all of middle_bb
are the preparation statements and there is no binary assign at the
end, so everything is handled through the constant evaluation.
We simply set at the start of middle_bb the lhs of comparison
virtually to the rhs, propagate it through and at the end
see if virtually the arg0 of the PHI is equal to arg1 of it.
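
To make the two shapes concrete, here is a small C sketch.  The functions
are hypothetical illustrations, not the pr104645.c testcase added by the
patch, and the PHI arguments are assumed to be 0 and x respectively.
In each case the value computed in the middle block for the guarded
input already equals the value selected by the other branch, so the
guard is redundant.

```c
#include <stdint.h>

/* Hypothetical illustrations; not the pr104645.c testcase.  */

/* The conditional is redundant: for i == 0 the expression (int) (i & 1)
   already evaluates to 0, so both arms agree.  The last operation in
   the guarded arm is a cast, which is the shape the patch targets.  */
int
low_bit_guarded (unsigned long i)
{
  return i != 0 ? (int) (i & 1) : 0;
}

/* What the guarded form can be simplified to.  */
int
low_bit (unsigned long i)
{
  return (int) (i & 1);
}

/* Rotate-left with a guard on the count.  Rotating by 0 returns x,
   so 0 is a neutral element of the final operation.  */
uint32_t
rotl_guarded (uint32_t x, uint32_t y)
{
  if (y == 0)
    return x;
  y &= 31;
  return (x << y) | (x >> ((32 - y) & 31));
}
```

Evaluating low_bit_guarded with i set to 0 in the guarded arm gives 0,
the same as the unguarded arm, which is the constant evaluation the
patch extends to the trailing cast.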

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

For GCC 13, I think we should just throw away all the neutral/absorbing
element stuff, do the constant evaluation of the whole middle_bb,
and handle that way all the ops we currently handle through
neutral/absorbing elements.

2022-04-01  Jakub Jelinek  

PR tree-optimization/104645
* tree-ssa-phiopt.cc (value_replacement): If assign has
CONVERT_EXPR_CODE_P rhs_code, treat it like a preparation
statement with constant evaluation.

* gcc.dg/tree-ssa/pr104645.c: New test.

--- gcc/tree-ssa-phiopt.cc.jj   2022-01-18 11:59:00.089974814 +0100
+++ gcc/tree-ssa-phiopt.cc  2022-03-31 14:38:27.537149245 +0200
@@ -1395,11 +1395,22 @@ value_replacement (basic_block cond_bb,
 
   gimple *assign = gsi_stmt (gsi);
   if (!is_gimple_assign (assign)
-  || gimple_assign_rhs_class (assign) != GIMPLE_BINARY_RHS
   || (!INTEGRAL_TYPE_P (TREE_TYPE (arg0))
  && !POINTER_TYPE_P (TREE_TYPE (arg0
 return 0;
 
+  if (gimple_assign_rhs_class (assign) != GIMPLE_BINARY_RHS)
+{
+  /* If last stmt of the middle_bb is a conversion, handle it like
+a preparation statement through constant evaluation with
+checking for UB.  */
+  enum tree_code sc = gimple_assign_rhs_code (assign);
+  if (CONVERT_EXPR_CODE_P (sc))
+   assign = NULL;
+  else
+   return 0;
+}
+
   /* Punt if there are (degenerate) PHIs in middle_bb, there should not be.  */
   if (!gimple_seq_empty_p (phi_nodes (middle_bb)))
 return 0;
@@ -1430,7 +1441,8 @@ value_replacement (basic_block cond_bb,
   int prep_cnt;
   for (prep_cnt = 0; ; prep_cnt++)
 {
-  gsi_prev_nondebug (&gsi);
+  if (prep_cnt || assign)
+   gsi_prev_nondebug (&gsi);
   if (gsi_end_p (gsi))
break;
 
@@ -1450,7 +1462,8 @@ value_replacement (basic_block cond_bb,
  || !INTEGRAL_TYPE_P (TREE_TYPE (lhs))
  || !INTEGRAL_TYPE_P (TREE_TYPE (rhs1))
  || !single_imm_use (lhs, &use_p, &use_stmt)
- || use_stmt != (prep_cnt ? prep_stmt[prep_cnt - 1] : assign))
+ || ((prep_cnt || assign)
+ && use_stmt != (prep_cnt ? prep_stmt[prep_cnt - 1] : assign)))
return 0;
   switch (gimple_assign_rhs_code (g))
{
@@ -1483,10 +1496,6 @@ value_replacement (basic_block cond_bb,
 >= 3 * estimate_num_insns (cond, &eni_time_weights))
 return 0;
 
-  tree lhs = gimple_assign_lhs (assign);
-  tree rhs1 = gimple_assign_rhs1 (assign);
-  tree rhs2 = gimple_assign_rhs2 (assign);
-  enum tree_code code_def = gimple_assign_rhs_code (assign);
   tree cond_lhs = gimple_cond_lhs (cond);
   tree cond_rhs = gimple_cond_rhs (cond);
 
@@ -1516,16 +1525,39 @@ value_replacement (basic_block cond_bb,
return 0;
 }
 
+  tree lhs, rhs1, rhs2;
+  enum tree_code code_def;
+  if (assign)
+{
+  lhs = gimple_assign_lhs (assign);
+  rhs1 = gimple_assign_rhs1 (assign);
+  rhs2 = gimple_assign_rhs2 (assign);
+  code_def = gimple_assign_rhs_code (assign);
+}
+  else
+{
+  gcc_assert (prep_cnt > 0);
+  lhs = cond_lhs;
+

[committed][nvptx, testsuite] Fix gcc.target/nvptx/alias-*.c on sm_80

2022-04-01 Thread Tom de Vries via Gcc-patches
Hi,

When running test-cases gcc.target/nvptx/alias-*.c on target board
nvptx-none-run/-misa=sm_80 we run into failures because the test-cases add
-mptx=6.3, which doesn't support sm_80.

Fix this by only adding -mptx=6.3 if necessary, and simplify the test-cases by
using ptx_alias feature abstractions:
...
/* { dg-do run { target runtime_ptx_alias } } */
/* { dg-add-options ptx_alias } */
...
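
The generalized effective-target procs compare the PTX ISA version
against a requested major.minor pair.  As a rough sketch of that
comparison in plain C (ptx_isa_at_least and its parameters are
hypothetical names, not part of nvptx.exp or the compiler):

```c
#include <stdbool.h>

/* Hypothetical helper mirroring an "at least major.minor" version test:
   same major with an equal or higher minor, or any higher major.  */
bool
ptx_isa_at_least (int have_major, int have_minor, int major, int minor)
{
  return (have_major == major && have_minor >= minor)
         || have_major > major;
}
```

With this, ptx_isa_at_least (7, 0, 6, 3) holds while
ptx_isa_at_least (6, 2, 6, 3) does not, so a test only needs an explicit
-mptx= option when the default version is too low.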

Tested on nvptx.

Committed to trunk.

Thanks,
- Tom

[nvptx, testsuite] Fix gcc.target/nvptx/alias-*.c on sm_80

gcc/testsuite/ChangeLog:

2022-04-01  Tom de Vries  

* gcc.target/nvptx/nvptx.exp
(check_effective_target_runtime_ptx_isa_version_6_3): Rename and
generalize to ...
(check_effective_target_runtime_ptx_isa_version_at_least): .. this.
(check_effective_target_default_ptx_isa_version_at_least)
(check_effective_target_runtime_ptx_alias, add_options_for_ptx_alias):
New proc.
* gcc.target/nvptx/alias-1.c: Use "target runtime_ptx_alias" and
"dg-add-options ptx_alias".
* gcc.target/nvptx/alias-2.c: Same.
* gcc.target/nvptx/alias-3.c: Same.
* gcc.target/nvptx/alias-4.c: Same.

---
 gcc/testsuite/gcc.target/nvptx/alias-1.c |  5 +--
 gcc/testsuite/gcc.target/nvptx/alias-2.c |  5 +--
 gcc/testsuite/gcc.target/nvptx/alias-3.c |  5 +--
 gcc/testsuite/gcc.target/nvptx/alias-4.c |  5 +--
 gcc/testsuite/gcc.target/nvptx/nvptx.exp | 62 +---
 5 files changed, 70 insertions(+), 12 deletions(-)

diff --git a/gcc/testsuite/gcc.target/nvptx/alias-1.c 
b/gcc/testsuite/gcc.target/nvptx/alias-1.c
index f68716e77dd..d251eee6e42 100644
--- a/gcc/testsuite/gcc.target/nvptx/alias-1.c
+++ b/gcc/testsuite/gcc.target/nvptx/alias-1.c
@@ -1,6 +1,7 @@
 /* { dg-do link } */
-/* { dg-do run { target runtime_ptx_isa_version_6_3 } } */
-/* { dg-options "-save-temps -malias -mptx=6.3" } */
+/* { dg-do run { target runtime_ptx_alias } } */
+/* { dg-options "-save-temps" } */
+/* { dg-add-options ptx_alias } */
 
 int v;
 
diff --git a/gcc/testsuite/gcc.target/nvptx/alias-2.c 
b/gcc/testsuite/gcc.target/nvptx/alias-2.c
index e2dc9b1f5ac..96cb7e2c1ef 100644
--- a/gcc/testsuite/gcc.target/nvptx/alias-2.c
+++ b/gcc/testsuite/gcc.target/nvptx/alias-2.c
@@ -1,6 +1,7 @@
 /* { dg-do link } */
-/* { dg-do run { target runtime_ptx_isa_version_6_3 } } */
-/* { dg-options "-save-temps -malias -mptx=6.3 -O2" } */
+/* { dg-do run { target runtime_ptx_alias } } */
+/* { dg-options "-save-temps -O2" } */
+/* { dg-add-options ptx_alias } */
 
 #include "alias-1.c"
 
diff --git a/gcc/testsuite/gcc.target/nvptx/alias-3.c 
b/gcc/testsuite/gcc.target/nvptx/alias-3.c
index 60486e50826..39649e30b91 100644
--- a/gcc/testsuite/gcc.target/nvptx/alias-3.c
+++ b/gcc/testsuite/gcc.target/nvptx/alias-3.c
@@ -1,6 +1,7 @@
 /* { dg-do link } */
-/* { dg-do run { target runtime_ptx_isa_version_6_3 } } */
-/* { dg-options "-save-temps -malias -mptx=6.3" } */
+/* { dg-do run { target runtime_ptx_alias } } */
+/* { dg-options "-save-temps" } */
+/* { dg-add-options ptx_alias } */
 
 /* Copy of alias-1.c, with static __f and f.  */
 
diff --git a/gcc/testsuite/gcc.target/nvptx/alias-4.c 
b/gcc/testsuite/gcc.target/nvptx/alias-4.c
index 956150a6b3f..28163c0faa0 100644
--- a/gcc/testsuite/gcc.target/nvptx/alias-4.c
+++ b/gcc/testsuite/gcc.target/nvptx/alias-4.c
@@ -1,6 +1,7 @@
 /* { dg-do link } */
-/* { dg-do run { target runtime_ptx_isa_version_6_3 } } */
-/* { dg-options "-save-temps -malias -mptx=6.3 -O2" } */
+/* { dg-do run { target runtime_ptx_alias } } */
+/* { dg-options "-save-temps -O2" } */
+/* { dg-add-options ptx_alias } */
 
 #include "alias-3.c"
 
diff --git a/gcc/testsuite/gcc.target/nvptx/nvptx.exp 
b/gcc/testsuite/gcc.target/nvptx/nvptx.exp
index e69b6d35fed..e9622ae7aaa 100644
--- a/gcc/testsuite/gcc.target/nvptx/nvptx.exp
+++ b/gcc/testsuite/gcc.target/nvptx/nvptx.exp
@@ -25,11 +25,65 @@ if ![istarget nvptx*-*-*] then {
 # Load support procs.
 load_lib gcc-dg.exp
 
-# Return 1 if code with -mptx=6.3 can be run.
-proc check_effective_target_runtime_ptx_isa_version_6_3 { args } {
-return [check_runtime run_ptx_isa_6_3 {
+# Return 1 if code by default compiles for at least PTX ISA version
+# major.minor.
+proc check_effective_target_default_ptx_isa_version_at_least { major minor } {
+set name default_ptx_isa_version_at_least_${major}_${minor}
+
+set supported_p \
+   [concat \
+"((__PTX_ISA_VERSION_MAJOR__ == $major" \
+"  && __PTX_ISA_VERSION_MINOR__ >= $minor)" \
+" || (__PTX_ISA_VERSION_MAJOR__ > $major))"]
+
+set src \
+   [list \
+"#if $supported_p" \
+"#else" \
+"#error unsupported" \
+"#endif"]
+set src [join $src "\n"]
+
+set res [check_no_compiler_messages $name assembly $src ""]
+
+return $res
+}
+
+# Return 1 if code with PTX ISA version major.minor or higher can be run.
+proc check_effective_target_runtime_ptx_isa_

Re: [PATCH] testsuite: Add further zero size elt passing tests [PR102024]

2022-04-01 Thread Richard Biener via Gcc-patches
On Thu, 31 Mar 2022, Jakub Jelinek wrote:

> Hi!
> 
> As discussed in PR102024, zero width bitfields might not be the only ones
> causing ABI issues at least on mips, zero size arrays or (in C only) zero
> sized (empty) structures can be problematic too.
> 
> The following patch adds some coverage for it too.
> 
> Tested on x86_64-linux with
> make check-gcc check-g++ RUNTESTFLAGS='ALT_CC_UNDER_TEST=gcc 
> ALT_CXX_UNDER_TEST=g++ --target_board=unix\{-m32,-m64\} compat.exp=pr102024*'
> make check-gcc check-g++ RUNTESTFLAGS='ALT_CC_UNDER_TEST=clang 
> ALT_CXX_UNDER_TEST=clang++ --target_board=unix\{-m32,-m64\} 
> compat.exp=pr102024*'
> with gcc/g++ 10.3 and clang 11.  Everything but (expectedly)
> FAIL: gcc.dg/compat/pr102024 c_compat_x_tst.o-c_compat_y_alt.o execute 
> FAIL: gcc.dg/compat/pr102024 c_compat_x_alt.o-c_compat_y_tst.o execute 
> for -m64 ALT_CC_UNDER_TEST=gcc passes.
> 
> Ok for trunk?

OK.

> 2022-03-31  Jakub Jelinek  
> 
>   PR target/102024
>   * gcc.dg/compat/pr102024_test.h: Add further tests with zero sized
>   structures and arrays.
>   * g++.dg/compat/pr102024_test.h: Add further tests with zero sized
>   arrays.
> 
> --- gcc/testsuite/gcc.dg/compat/pr102024_test.h.jj2022-03-24 
> 12:24:41.625100842 +0100
> +++ gcc/testsuite/gcc.dg/compat/pr102024_test.h   2022-03-31 
> 17:36:33.486710917 +0200
> @@ -4,3 +4,9 @@ T(2,int:0;float a;,F(2,a,2.25f,16.5f))
>  T(3,double a;long long:0;double b;,F(3,a,42.0,43.125)F(3,b,-17.5,35.75))
>  T(4,double a;long long:0;,F(4,a,1.0,17.125))
>  T(5,long long:0;double a;,F(5,a,2.25,16.5))
> +T(6,float a;struct{}b;float c;,F(6,a,42.0f,43.125f)F(6,c,-17.5f,35.75f))
> +T(7,float a;struct{}b[0];;,F(7,a,1.0f,17.125f))
> +T(8,int a[0];float b;,F(8,b,2.25f,16.5f))
> +T(9,double a;long long b[0];double c;,F(9,a,42.0,43.125)F(9,c,-17.5,35.75))
> +T(10,double a;struct{}b;,F(10,a,1.0,17.125))
> +T(11,struct{}a[0];double b;,F(11,b,2.25,16.5))
> --- gcc/testsuite/g++.dg/compat/pr102024_test.h.jj2022-03-24 
> 12:24:41.625100842 +0100
> +++ gcc/testsuite/g++.dg/compat/pr102024_test.h   2022-03-31 
> 17:37:30.763877562 +0200
> @@ -4,3 +4,9 @@ T(2,int:0;float a;,F(2,a,2.25f,16.5f))
>  T(3,double a;long long:0;double b;,F(3,a,42.0,43.125)F(3,b,-17.5,35.75))
>  T(4,double a;long long:0;,F(4,a,1.0,17.125))
>  T(5,long long:0;double a;,F(5,a,2.25,16.5))
> +T(6,float a;struct{}b[0];float c;,F(6,a,42.0f,43.125f)F(6,c,-17.5f,35.75f))
> +T(7,float a;struct{}b[0];;,F(7,a,1.0f,17.125f))
> +T(8,int a[0];float b;,F(8,b,2.25f,16.5f))
> +T(9,double a;long long b[0];double c;,F(9,a,42.0,43.125)F(9,c,-17.5,35.75))
> +T(10,double a;struct{}b[0];,F(10,a,1.0,17.125))
> +T(11,struct{}a[0];double b;,F(11,b,2.25,16.5))
> 
>   Jakub
> 
> 

-- 
Richard Biener 
SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
Germany; GF: Ivo Totev; HRB 36809 (AG Nuernberg)


Re: [PATCH] phiopt: Improve value_replacement [PR104645]

2022-04-01 Thread Richard Biener via Gcc-patches
On Fri, 1 Apr 2022, Jakub Jelinek wrote:

> Hi!
> 
> The following patch fixes the P1 regression by reusing existing
> value_replacement code.  That function already has code to
> handle simple preparation statements (casts, and +,&,|,^ binary
> assignments) before a final binary assignment (which can be
> much wider range of ops).  When we have e.g.
>   if (y_3(D) == 0)
> goto ;
>   else
> goto ;
>  :
>   y_4 = y_3(D) & 31;
>   _1 = (int) y_4;
>   _6 = x_5(D) r<< _1;
>  :
>   # _2 = PHI 
> the preparation statements y_4 = y_3(D) & 31; and
> _1 = (int) y_4; are handled by constant evaluation, passing through
> y_3(D) = 0 initially and propagating that through the assignments
> with checking that UB isn't invoked.  But the final
> _6 = x_5(D) r<< _1; assign is handled differently, either through
> neutral_element_p or absorbing_element_p.
> In the first function below we now have:
>[local count: 1073741824]:
>   if (i_2(D) != 0)
> goto ; [50.00%]
>   else
> goto ; [50.00%]
> 
>[local count: 536870913]:
>   _3 = i_2(D) & 1;
>   iftmp.0_4 = (int) _3;
> 
>[local count: 1073741824]:
>   # iftmp.0_1 = PHI 
> where in GCC 11 we had:
>:
>   if (i_3(D) != 0)
> goto ; [INV]
>   else
> goto ; [INV]
> 
>:
>   i.1_1 = (int) i_3(D);
>   iftmp.0_5 = i.1_1 & 1;
> 
>:
>   # iftmp.0_2 = PHI 
> Current value_replacement can handle the latter as the last
> stmt of middle_bb is a binary op that in this case satisfies
> absorbing_element_p.
> But the former we can't handle, as the last stmt in middle_bb
> is a cast.
> 
> The patch makes it work in that case by pretending all of middle_bb
> are the preparation statements and there is no binary assign at the
> end, so everything is handled through the constant evaluation.
> We simply set at the start of middle_bb the lhs of comparison
> virtually to the rhs, propagate it through and at the end
> see if virtually the arg0 of the PHI is equal to arg1 of it.
> 
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

OK.

> For GCC 13, I think we just should throw away all the neutral/absorbing
> element stuff and do the constant evaluation of the whole middle_bb
> and handle that way all the ops we currently handle in neutral/absorbing
> element.

Agreed - that would be a nice cleanup.

Thanks,
Richard.

> 2022-04-01  Jakub Jelinek  
> 
>   PR tree-optimization/104645
>   * tree-ssa-phiopt.cc (value_replacement): If assign has
>   CONVERT_EXPR_CODE_P rhs_code, treat it like a preparation
>   statement with constant evaluation.
> 
>   * gcc.dg/tree-ssa/pr104645.c: New test.
> 
> --- gcc/tree-ssa-phiopt.cc.jj 2022-01-18 11:59:00.089974814 +0100
> +++ gcc/tree-ssa-phiopt.cc2022-03-31 14:38:27.537149245 +0200
> @@ -1395,11 +1395,22 @@ value_replacement (basic_block cond_bb,
>  
>gimple *assign = gsi_stmt (gsi);
>if (!is_gimple_assign (assign)
> -  || gimple_assign_rhs_class (assign) != GIMPLE_BINARY_RHS
>|| (!INTEGRAL_TYPE_P (TREE_TYPE (arg0))
> && !POINTER_TYPE_P (TREE_TYPE (arg0
>  return 0;
>  
> +  if (gimple_assign_rhs_class (assign) != GIMPLE_BINARY_RHS)
> +{
> +  /* If last stmt of the middle_bb is a conversion, handle it like
> +  a preparation statement through constant evaluation with
> +  checking for UB.  */
> +  enum tree_code sc = gimple_assign_rhs_code (assign);
> +  if (CONVERT_EXPR_CODE_P (sc))
> + assign = NULL;
> +  else
> + return 0;
> +}
> +
>/* Punt if there are (degenerate) PHIs in middle_bb, there should not be.  
> */
>if (!gimple_seq_empty_p (phi_nodes (middle_bb)))
>  return 0;
> @@ -1430,7 +1441,8 @@ value_replacement (basic_block cond_bb,
>int prep_cnt;
>for (prep_cnt = 0; ; prep_cnt++)
>  {
> -  gsi_prev_nondebug (&gsi);
> +  if (prep_cnt || assign)
> + gsi_prev_nondebug (&gsi);
>if (gsi_end_p (gsi))
>   break;
>  
> @@ -1450,7 +1462,8 @@ value_replacement (basic_block cond_bb,
> || !INTEGRAL_TYPE_P (TREE_TYPE (lhs))
> || !INTEGRAL_TYPE_P (TREE_TYPE (rhs1))
> || !single_imm_use (lhs, &use_p, &use_stmt)
> -   || use_stmt != (prep_cnt ? prep_stmt[prep_cnt - 1] : assign))
> +   || ((prep_cnt || assign)
> +   && use_stmt != (prep_cnt ? prep_stmt[prep_cnt - 1] : assign)))
>   return 0;
>switch (gimple_assign_rhs_code (g))
>   {
> @@ -1483,10 +1496,6 @@ value_replacement (basic_block cond_bb,
>>= 3 * estimate_num_insns (cond, &eni_time_weights))
>  return 0;
>  
> -  tree lhs = gimple_assign_lhs (assign);
> -  tree rhs1 = gimple_assign_rhs1 (assign);
> -  tree rhs2 = gimple_assign_rhs2 (assign);
> -  enum tree_code code_def = gimple_assign_rhs_code (assign);
>tree cond_lhs = gimple_cond_lhs (cond);
>tree cond_rhs = gimple_cond_rhs (cond);
>  
> @@ -1516,16 +1525,39 @@ value_replacement (basic_block cond_bb,
>   return 0;
>  }
>  
> +

[RFC] ipa-cp: Feed results of IPA-CP into SCCVN

2022-04-01 Thread Martin Jambor
Hi,

PRs 68930 and 92497 show that when IPA-CP figures out constants in
aggregate parameters, or in parameters passed by reference, but the
loads happen in an inlined function, the information is lost.  This
happens even when the inlined function itself was known to have - or
was even cloned to have - such constants in incoming parameters,
because the transform phase of IPA passes is not run on inlined
functions.  See discussion in the bugs for reasons why.

Honza suggested that we can plug the results of IPA-CP analysis into
value numbering, so that FRE can figure out that some loads fetch
known constants.  This is what this patch attempts to do.
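
As an illustration of the kind of code involved, consider the sketch
below.  It is a hypothetical example, not one of the pr92497-*.c
testcases, and whether GCC actually loses the constants for this exact
input is an assumption.  The aggregate is passed by reference to an
out-of-line function, and the loads from it sit in a helper that gets
inlined into that function.

```c
/* Hypothetical example; not one of the pr92497-*.c testcases.  */
struct params
{
  int scale;
  int bias;
};

/* Small helper that is likely to be inlined into worker; the loads
   from *p live here.  */
static inline int
apply (const struct params *p, int x)
{
  return x * p->scale + p->bias;
}

/* Kept out of line so that IPA-CP, not the inliner, has to carry the
   constants across the call.  */
static int __attribute__ ((noinline))
worker (const struct params *p, int x)
{
  return apply (p, x);
}

int
caller (int x)
{
  struct params p = { 2, 3 };
  return worker (&p, x);
}
```

If value numbering can ask IPA-CP what is known about p at the loads,
the body of worker can fold to x * 2 + 3.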

Although I spent quite some time reading tree-ssa-sccvn.cc, it is complex
enough that I am sure I have missed various caveats, so I would not be
terribly surprised if my approach had issues I am not aware of.
Nevertheless, it seems to work well for simple cases and even passes
bootstrap and testing (and LTO bootstrap) on x86_64-linux.

I have experimented a little with using this approach instead of the
function walking parts of the IPA-CP transformation phase.  This would
mean that the known-constants would not participate in the passes after
IPA but before FRE - which are not many but there is a ccp and fwprop
pass among others.  For simple testcases like
gcc/testsuite/gcc.dg/ipa/ipcp-agg-*.c, it makes no assembly difference
at all.

What do you think?

Martin


gcc/ChangeLog:

2022-03-30  Martin Jambor  

PR ipa/68930
PR ipa/92497
* ipa-prop.cc (ipcp_get_aggregate_const): New function.
(ipcp_transform_function): Do not deallocate transformation info.
* ipa-prop.h (ipcp_get_aggregate_const): Declare.
* tree-ssa-sccvn.cc: Include alloc-pool.h, symbol-summary.h and
ipa-prop.h.
(vn_reference_lookup_2): When hitting default-def vuse, query
IPA-CP transformation info for any known constants.

gcc/testsuite/ChangeLog:

2022-03-30  Martin Jambor  

PR ipa/68930
PR ipa/92497
* gcc.dg/ipa/pr92497-1.c: New test.
* gcc.dg/ipa/pr92497-2.c: Likewise.
---
 gcc/ipa-prop.cc  | 43 
 gcc/ipa-prop.h   |  2 ++
 gcc/testsuite/gcc.dg/ipa/pr92497-1.c | 26 +
 gcc/testsuite/gcc.dg/ipa/pr92497-2.c | 26 +
 gcc/tree-ssa-sccvn.cc| 35 +-
 5 files changed, 126 insertions(+), 6 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/ipa/pr92497-1.c
 create mode 100644 gcc/testsuite/gcc.dg/ipa/pr92497-2.c

diff --git a/gcc/ipa-prop.cc b/gcc/ipa-prop.cc
index e55fe2776f2..a73a5d9ec1d 100644
--- a/gcc/ipa-prop.cc
+++ b/gcc/ipa-prop.cc
@@ -5748,6 +5748,44 @@ ipcp_modif_dom_walker::before_dom_children (basic_block 
bb)
   return NULL;
 }
 
+/* If IPA-CP discovered a constant in parameter PARM at OFFSET of a given SIZE
+   - whether passed by reference or not is given by BY_REF - return that
+   constant.  Otherwise return NULL_TREE.  */
+
+tree
+ipcp_get_aggregate_const (tree parm, bool by_ref,
+ HOST_WIDE_INT offset, HOST_WIDE_INT size)
+{
+  cgraph_node *cnode = cgraph_node::get (current_function_decl);
+
+  ipa_agg_replacement_value *aggval = ipa_get_agg_replacements_for_node 
(cnode);
+  if (!aggval)
+return NULL_TREE;
+
+  int index = 0;
+  for (tree p = DECL_ARGUMENTS (current_function_decl);
+   p != parm; p = DECL_CHAIN (p))
+{
+  index++;
+  if (!p)
+   return NULL_TREE;
+}
+
+  ipa_agg_replacement_value *v;
+  for (v = aggval; v; v = v->next)
+if (v->index == index
+   && v->offset == offset)
+  break;
+  if (!v
+  || v->by_ref != by_ref
+  || maybe_ne (tree_to_poly_int64 (TYPE_SIZE (TREE_TYPE (v->value))),
+  size))
+return NULL_TREE;
+
+  return v->value;
+}
+
+
 /* Return true if we have recorded VALUE and MASK about PARM.
Set VALUE and MASk accordingly.  */
 
@@ -6055,11 +6093,6 @@ ipcp_transform_function (struct cgraph_node *node)
 free_ipa_bb_info (bi);
   fbi.bb_infos.release ();
 
-  ipcp_transformation *s = ipcp_transformation_sum->get (node);
-  s->agg_values = NULL;
-  s->bits = NULL;
-  s->m_vr = NULL;
-
   vec_free (descriptors);
   if (cfg_changed)
 delete_unreachable_blocks_update_callgraph (node, false);
diff --git a/gcc/ipa-prop.h b/gcc/ipa-prop.h
index 553adfc9f35..aa6fcb522ac 100644
--- a/gcc/ipa-prop.h
+++ b/gcc/ipa-prop.h
@@ -1181,6 +1181,8 @@ void ipa_dump_param (FILE *, class ipa_node_params *info, 
int i);
 void ipa_release_body_info (struct ipa_func_body_info *);
 tree ipa_get_callee_param_type (struct cgraph_edge *e, int i);
 bool ipcp_get_parm_bits (tree, tree *, widest_int *);
+tree ipcp_get_aggregate_const (tree parm, bool by_ref,
+  HOST_WIDE_INT offset, HOST_WIDE_INT size);
 bool unadjusted_ptr_and_unit_offset (tree op, tree *ret,
 poly_int64 *offset_ret);
 
dif

[committed] libstdc++: Fix filenames in Doxygen @file comments

2022-04-01 Thread Jonathan Wakely via Gcc-patches
From: Timm Bäder 

Pushed to trunk.

-- >8 --

Reviewed-by: Jonathan Wakely 

libstdc++-v3/ChangeLog:

* include/bits/fs_ops.h: Fix filename in Doxygen comment.
* include/experimental/bits/fs_ops.h: Likewise.
---
 libstdc++-v3/include/bits/fs_ops.h  | 2 +-
 libstdc++-v3/include/experimental/bits/fs_ops.h | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/libstdc++-v3/include/bits/fs_ops.h 
b/libstdc++-v3/include/bits/fs_ops.h
index c894cae8aa3..0281c6540d0 100644
--- a/libstdc++-v3/include/bits/fs_ops.h
+++ b/libstdc++-v3/include/bits/fs_ops.h
@@ -22,7 +22,7 @@
 // see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
 // .
 
-/** @file include/bits/fs_fwd.h
+/** @file include/bits/fs_ops.h
  *  This is an internal header file, included by other library headers.
  *  Do not attempt to use it directly. @headername{filesystem}
  */
diff --git a/libstdc++-v3/include/experimental/bits/fs_ops.h 
b/libstdc++-v3/include/experimental/bits/fs_ops.h
index dafd1ec79a0..773f27c6687 100644
--- a/libstdc++-v3/include/experimental/bits/fs_ops.h
+++ b/libstdc++-v3/include/experimental/bits/fs_ops.h
@@ -22,7 +22,7 @@
 // see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
 // .
 
-/** @file experimental/bits/fs_fwd.h
+/** @file experimental/bits/fs_ops.h
  *  This is an internal header file, included by other library headers.
  *  Do not attempt to use it directly. @headername{experimental/filesystem}
  */
-- 
2.34.1



[PATCH] tree-optimization/100810 - avoid undefs in IVOPT rewrites

2022-04-01 Thread Richard Biener via Gcc-patches
The following attempts to avoid IVOPTs rewriting uses using
IV candidates that involve undefined behavior by using uninitialized
SSA names.  First, we restrict the set of candidates we produce
for such IVs to the original ones and mark them as not important.
Second, we only allow expressing a use with such an IV if the use
originally used it.  That is to avoid rewriting all such uses
in terms of other IVs.  Since cand->iv and use->iv seem to never
exactly match up, we resort to comparing the IV bases.

The approach ends up similar to the one posted by Roger at
https://gcc.gnu.org/pipermail/gcc-patches/2021-August/578441.html
but it marks IV candidates rather than use groups and the cases
we allow in determine_group_iv_cost_generic are slightly different.
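
The candidate-filtering step can be sketched outside of GCC as follows.
This is a minimal model under stated assumptions: struct expr,
find_undef and may_record_candidate are hypothetical stand-ins for
GCC's tree, the walk_tree callback and the check in add_candidate_1,
not the actual code.

```c
#include <stddef.h>
#include <stdbool.h>

/* Hypothetical expression node; GCC's real representation is 'tree'.  */
struct expr
{
  bool is_name;            /* Leaf naming a value (stands in for SSA_NAME).  */
  bool undefined;          /* Only meaningful for names.  */
  const struct expr *lhs;  /* Operands, NULL for leaves.  */
  const struct expr *rhs;
};

/* Return the first undefined name in E in pre-order, or NULL.  */
const struct expr *
find_undef (const struct expr *e)
{
  if (e == NULL)
    return NULL;
  if (e->is_name)
    return e->undefined ? e : NULL;
  const struct expr *found = find_undef (e->lhs);
  return found ? found : find_undef (e->rhs);
}

/* Decide whether a candidate with base BASE may be recorded.  Candidates
   whose base involves an undefined name are kept only when they are the
   original IV, and are then demoted to not important.  */
bool
may_record_candidate (const struct expr *base, bool is_original,
                      bool *important, bool *involves_undefs)
{
  *involves_undefs = find_undef (base) != NULL;
  if (*involves_undefs)
    {
      if (!is_original)
        return false;
      *important = false;
    }
  return true;
}
```

A derived candidate built on an undefined name is rejected outright,
while the original IV survives but is no longer offered to every use.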

Bootstrapped and tested on x86_64-unknown-linux-gnu.

OK for trunk?

Thanks,
Richard.

2022-01-04  Richard Biener  

PR tree-optimization/100810
* tree-ssa-loop-ivopts.cc (struct iv_cand): Add involves_undefs flag.
(find_ssa_undef): New function.
(add_candidate_1): Avoid adding derived candidates with
undefined SSA names and mark the original ones.
(determine_group_iv_cost_generic): Reject rewriting
uses with a different IV when that involves undefined SSA names.

* gcc.dg/torture/pr100810.c: New testcase.
---
 gcc/testsuite/gcc.dg/torture/pr100810.c | 34 +
 gcc/tree-ssa-loop-ivopts.cc | 31 ++
 2 files changed, 65 insertions(+)
 create mode 100644 gcc/testsuite/gcc.dg/torture/pr100810.c

diff --git a/gcc/testsuite/gcc.dg/torture/pr100810.c 
b/gcc/testsuite/gcc.dg/torture/pr100810.c
new file mode 100644
index 000..63566f530f7
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/torture/pr100810.c
@@ -0,0 +1,34 @@
+/* { dg-do run } */
+
+int a, b = 1, c = 1, e, f = 1, g, h, j;
+volatile int d;
+static void k()
+{
+  int i;
+  h = b;
+  if (c && a >= 0) {
+  while (a) {
+ i++;
+ h--;
+  }
+  if (g)
+   for (h = 0; h < 2; h++)
+ ;
+  if (!b)
+   i &&d;
+  }
+}
+static void l()
+{
+  for (; j < 1; j++)
+if (!e && c && f)
+  k();
+}
+int main()
+{
+  if (f)
+l();
+  if (h != 1)
+__builtin_abort();
+  return 0;
+}
diff --git a/gcc/tree-ssa-loop-ivopts.cc b/gcc/tree-ssa-loop-ivopts.cc
index 935d2d4d8f3..b0305c494cd 100644
--- a/gcc/tree-ssa-loop-ivopts.cc
+++ b/gcc/tree-ssa-loop-ivopts.cc
@@ -452,6 +452,7 @@ struct iv_cand
   unsigned id; /* The number of the candidate.  */
   bool important;  /* Whether this is an "important" candidate, i.e. such
   that it should be considered by all uses.  */
+  bool involves_undefs; /* Whether the IV involves undefined values.  */
   ENUM_BITFIELD(iv_position) pos : 8;  /* Where it is computed.  */
   gimple *incremented_at;/* For original biv, the statement where it is
   incremented.  */
@@ -3068,6 +3069,19 @@ get_loop_invariant_expr (struct ivopts_data *data, tree 
inv_expr)
   return *slot;
 }
 
+/* Find the first undefined SSA name in *TP.  */
+
+static tree
+find_ssa_undef (tree *tp, int *walk_subtrees, void *)
+{
+  if (TREE_CODE (*tp) == SSA_NAME
+  && ssa_undefined_value_p (*tp, false))
+return *tp;
+  if (!EXPR_P (*tp))
+*walk_subtrees = 0;
+  return NULL;
+}
+
 /* Adds a candidate BASE + STEP * i.  Important field is set to IMPORTANT and
position to POS.  If USE is not NULL, the candidate is set as related to
it.  If both BASE and STEP are NULL, we add a pseudocandidate for the
@@ -3095,6 +3109,17 @@ add_candidate_1 (struct ivopts_data *data, tree base, 
tree step, bool important,
   if (flag_keep_gc_roots_live && POINTER_TYPE_P (TREE_TYPE (base)))
 return NULL;
 
+  /* If BASE contains undefined SSA names make sure we only record
+ the original IV.  */
+  bool involves_undefs = false;
+  if (walk_tree (&base, find_ssa_undef, NULL, NULL))
+{
+  if (pos != IP_ORIGINAL)
+   return NULL;
+  important = false;
+  involves_undefs = true;
+}
+
   /* For non-original variables, make sure their values are computed in a type
  that does not invoke undefined behavior on overflows (since in general,
  we cannot prove that these induction variables are non-wrapping).  */
@@ -3143,6 +3168,7 @@ add_candidate_1 (struct ivopts_data *data, tree base, 
tree step, bool important,
  cand->var_after = cand->var_before;
}
   cand->important = important;
+  cand->involves_undefs = involves_undefs;
   cand->incremented_at = incremented_at;
   cand->doloop_p = doloop;
   data->vcands.safe_push (cand);
@@ -4956,6 +4982,11 @@ determine_group_iv_cost_generic (struct ivopts_data 
*data,
  the candidate.  */
   if (cand->pos == IP_ORIGINAL && cand->incremented_at == use->stmt)
 cost = no_cost;
+  /* If the IV candidate involves undefined SSA values and is not the
+ same IV as on the USE avoid using that c

[committed][libgomp, testsuite, nvptx] Fix dg-output test in vector-length-128-7.c

2022-04-01 Thread Tom de Vries via Gcc-patches
Hi,

When running test-case libgomp.oacc-c-c++-common/vector-length-128-7.c on an
RTX A2000 (sm_86) with driver 510.60.02 I run into:
...
FAIL: libgomp.oacc-c/../libgomp.oacc-c-c++-common/vector-length-128-7.c \
  -DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 -foffload=nvptx-none  -O0  \
  output pattern test
...

The failing check verifies the launch dimensions:
...
/* { dg-output "nvptx_exec: kernel main\\\$_omp_fn\\\$0: \
launch gangs=1, workers=8, vectors=128" } */
...
which fails because (as we can see with GOMP_DEBUG=1) the actual num_workers
is 6:
...
  nvptx_exec: kernel main$_omp_fn$0: launch gangs=1, workers=6, vectors=128
...

This is due to the result of cuOccupancyMaxPotentialBlockSize (which suggests
'a launch configuration with reasonable occupancy') printed just before:
...
cuOccupancyMaxPotentialBlockSize: grid = 52, block = 768
...
[ Note: 6 * 128 == 768. ]

Fix this by updating the check to allow num_workers in the range 1 to 8.
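
The arithmetic behind the note above can be sketched as follows,
assuming the worker count is obtained by dividing the suggested block
size by the vector length and capping it at the requested maximum.
workers_for_block is a hypothetical helper; the real logic in the nvptx
plugin may differ.

```c
/* Hypothetical helper: number of workers that fit into a block of
   BLOCK_SIZE threads when each worker uses VECTOR_LENGTH threads,
   limited to the range 1..MAX_WORKERS.  */
int
workers_for_block (int block_size, int vector_length, int max_workers)
{
  int workers = block_size / vector_length;
  if (workers > max_workers)
    workers = max_workers;
  if (workers < 1)
    workers = 1;
  return workers;
}
```

For block = 768 and vectors = 128 this gives 6 workers, and for
block = 1024 it gives 8, which is why a pattern accepting 1 to 8 covers
both devices.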

Tested on x86_64 with nvptx accelerator.

Committed to trunk.

Thanks,
- Tom

[libgomp, testsuite, nvptx] Fix dg-output test in vector-length-128-7.c

libgomp/ChangeLog:

2022-04-01  Tom de Vries  

* testsuite/libgomp.oacc-c-c++-common/vector-length-128-7.c: Fix
num_workers check.

---
 libgomp/testsuite/libgomp.oacc-c-c++-common/vector-length-128-7.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/vector-length-128-7.c 
b/libgomp/testsuite/libgomp.oacc-c-c++-common/vector-length-128-7.c
index 4a8c1bf549e..92b3de03636 100644
--- a/libgomp/testsuite/libgomp.oacc-c-c++-common/vector-length-128-7.c
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/vector-length-128-7.c
@@ -37,4 +37,4 @@ main (void)
 }
 
 /* { dg-final { scan-offload-tree-dump "__attribute__\\(\\(oacc function \\(1, 
0, 128\\)" "oaccloops" } } */
-/* { dg-output "nvptx_exec: kernel main\\\$_omp_fn\\\$0: launch gangs=1, 
workers=8, vectors=128" } */
+/* { dg-output "nvptx_exec: kernel main\\\$_omp_fn\\\$0: launch gangs=1, 
workers=\[1-8\], vectors=128" } */


[PATCH][libgomp, testsuite, nvptx] Limit recursion in declare_target-{1,2}.f90

2022-04-01 Thread Tom de Vries via Gcc-patches
Hi,

When running testcases libgomp.fortran/examples-4/declare_target-{1,2}.f90 on
an RTX A2000 (sm_86) with driver 510.60.02 and with GOMP_NVPTX_JIT=-O0 I run
into:
...
FAIL: libgomp.fortran/examples-4/declare_target-1.f90 -O0 \
  -DGOMP_NVPTX_JIT=-O0 execution test
FAIL: libgomp.fortran/examples-4/declare_target-2.f90 -O0 \
  -DGOMP_NVPTX_JIT=-O0 execution test
...

Fix this by further limiting recursion depth in the test-cases for nvptx.

Furthermore, make the recursion depth limiting nvptx-specific.
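
The overridable-default pattern used for REC_DEPTH looks like this in a
C analogue.  The sketch is not the Fortran test itself; fib and fib_iter
are simplified stand-ins for the recursive function and its reference.

```c
/* Default recursion depth; a build can override it with -DREC_DEPTH=N,
   as the patch does for nvptx offload targets.  */
#ifndef REC_DEPTH
#define REC_DEPTH 25
#endif

/* Naive recursive Fibonacci: recursion depth grows linearly with n.  */
int
fib (int n)
{
  return n < 2 ? n : fib (n - 1) + fib (n - 2);
}

/* Iterative reference used to check the recursive result.  */
int
fib_iter (int n)
{
  int a = 0, b = 1;
  for (int i = 0; i < n; i++)
    {
      int next = a + b;
      a = b;
      b = next;
    }
  return a;
}
```

Only targets that pass -DREC_DEPTH=20 run the shallower recursion;
everything else keeps the original depth of 25.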

Tested on x86_64 with nvptx accelerator.

Any comments?

Thanks,
- Tom

[libgomp, testsuite, nvptx] Limit recursion in declare_target-{1,2}.f90

libgomp/ChangeLog:

2022-04-01  Tom de Vries  

* testsuite/libgomp.fortran/examples-4/declare_target-1.f90: Define
and use REC_DEPTH.
* testsuite/libgomp.fortran/examples-4/declare_target-2.f90: Same.

---
 .../libgomp.fortran/examples-4/declare_target-1.f90  | 18 +-
 .../libgomp.fortran/examples-4/declare_target-2.f90  | 20 ++--
 2 files changed, 27 insertions(+), 11 deletions(-)

diff --git a/libgomp/testsuite/libgomp.fortran/examples-4/declare_target-1.f90 
b/libgomp/testsuite/libgomp.fortran/examples-4/declare_target-1.f90
index b761979ecde..03c5c53ed67 100644
--- a/libgomp/testsuite/libgomp.fortran/examples-4/declare_target-1.f90
+++ b/libgomp/testsuite/libgomp.fortran/examples-4/declare_target-1.f90
@@ -1,4 +1,16 @@
 ! { dg-do run }
+! { dg-additional-options "-cpp" }
+! Reduced from 25 to 23, otherwise execution runs out of thread stack on
+! Nvidia Titan V.
+! Reduced from 23 to 22, otherwise execution runs out of thread stack on
+! Nvidia T400 (2GB variant), when run with GOMP_NVPTX_JIT=-O0.
+! Reduced from 22 to 20, otherwise execution runs out of thread stack on
+! Nvidia RTX A2000 (6GB variant), when run with GOMP_NVPTX_JIT=-O0.
+! { dg-additional-options "-DREC_DEPTH=20" { target { offload_target_nvptx } } 
} */
+
+#ifndef REC_DEPTH
+#define REC_DEPTH 25
+#endif
 
 module e_53_1_mod
   integer :: THRESHOLD = 20
@@ -27,9 +39,5 @@ end module
 program e_53_1
   use e_53_1_mod, only : fib, fib_wrapper
   if (fib (15) /= fib_wrapper (15)) stop 1
-  ! Reduced from 25 to 23, otherwise execution runs out of thread stack on
-  ! Nvidia Titan V.
-  ! Reduced from 23 to 22, otherwise execution runs out of thread stack on
-  ! Nvidia T400 (2GB variant), when run with GOMP_NVPTX_JIT=-O0.
-  if (fib (22) /= fib_wrapper (22)) stop 2
+  if (fib (REC_DEPTH) /= fib_wrapper (REC_DEPTH)) stop 2
 end program
diff --git a/libgomp/testsuite/libgomp.fortran/examples-4/declare_target-2.f90 
b/libgomp/testsuite/libgomp.fortran/examples-4/declare_target-2.f90
index f576c25ba39..0e8bea578a8 100644
--- a/libgomp/testsuite/libgomp.fortran/examples-4/declare_target-2.f90
+++ b/libgomp/testsuite/libgomp.fortran/examples-4/declare_target-2.f90
@@ -1,16 +1,24 @@
 ! { dg-do run }
+! { dg-additional-options "-cpp" }
+! Reduced from 25 to 23, otherwise execution runs out of thread stack on
+! Nvidia Titan V.
+! Reduced from 23 to 22, otherwise execution runs out of thread stack on
+! Nvidia T400 (2GB variant), when run with GOMP_NVPTX_JIT=-O0.
+! Reduced from 22 to 18, otherwise execution runs out of thread stack on
+! Nvidia RTX A2000 (6GB variant), when run with GOMP_NVPTX_JIT=-O0.
+! { dg-additional-options "-DREC_DEPTH=18" { target { offload_target_nvptx } } 
} */
+
+#ifndef REC_DEPTH
+#define REC_DEPTH 25
+#endif
 
 program e_53_2
   !$omp declare target (fib)
   integer :: x, fib
   !$omp target map(from: x)
-! Reduced from 25 to 23, otherwise execution runs out of thread stack on
-! Nvidia Titan V.
-! Reduced from 23 to 22, otherwise execution runs out of thread stack on
-! Nvidia T400 (2GB variant), when run with GOMP_NVPTX_JIT=-O0.
-x = fib (22)
+x = fib (REC_DEPTH)
   !$omp end target
-  if (x /= fib (22)) stop 1
+  if (x /= fib (REC_DEPTH)) stop 1
 end program
 
 integer recursive function fib (n) result (f)


Re: [PATCH][libgomp, testsuite, nvptx] Limit recursion in declare_target-{1,2}.f90

2022-04-01 Thread Jakub Jelinek via Gcc-patches
On Fri, Apr 01, 2022 at 01:24:40PM +0200, Tom de Vries wrote:
> Hi,
> 
> When running testcases libgomp.fortran/examples-4/declare_target-{1,2}.f90 on
> an RTX A2000 (sm_86) with driver 510.60.02 and with GOMP_NVPTX_JIT=-O0 I run
> into:
> ...
> FAIL: libgomp.fortran/examples-4/declare_target-1.f90 -O0 \
>   -DGOMP_NVPTX_JIT=-O0 execution test
> FAIL: libgomp.fortran/examples-4/declare_target-2.f90 -O0 \
>   -DGOMP_NVPTX_JIT=-O0 execution test
> ...
> 
> Fix this by further limiting recursion depth in the test-cases for nvptx.
> 
> Furthermore, make the recursion depth limiting nvptx-specific.
> 
> Tested on x86_64 with nvptx accelerator.
> 
> Any comments?
> 
> Thanks,
> - Tom
> 
> [libgomp, testsuite, nvptx] Limit recursion in declare_target-{1,2}.f90
> 
> libgomp/ChangeLog:
> 
> 2022-04-01  Tom de Vries  
> 
>   * testsuite/libgomp.fortran/examples-4/declare_target-1.f90: Define
>   and use REC_DEPTH.
>   * testsuite/libgomp.fortran/examples-4/declare_target-2.f90: Same.

Ok.

Jakub



Re: [PATCH] libstdc++: Implement std::unreachable() for C++23 (P0627R6)

2022-04-01 Thread Jonathan Wakely via Gcc-patches
On Thu, 31 Mar 2022 at 19:21, Marc Glisse wrote:
>
> On Thu, 31 Mar 2022, Jonathan Wakely wrote:
>
> > On Thu, 31 Mar 2022 at 17:03, Marc Glisse via Libstdc++
> >  wrote:
> >>
> >> On Thu, 31 Mar 2022, Matthias Kretz via Gcc-patches wrote:
> >>
> >>> I like it. But I'd like it even more if we could have
> >>>
> >>> #elif defined _UBSAN
> >>>__ubsan_invoke_ub("reached std::unreachable()");
> >>>
> >>> But to my knowledge UBSAN has no hooks for the library like this (yet).
> >>
> >> -fsanitize=undefined already replaces __builtin_unreachable with its own
> >> thing, so I was indeed going to ask if the assertion / trap provide a
> >> better debugging experience compared to plain __builtin_unreachable, with
> >> the possibility to get a stack trace (UBSAN_OPTIONS=print_stacktrace=1),
> >> etc? Detecting if (the right subset of) ubsan is enabled sounds like a
> >> good idea.
> >
> > Does UBsan define a macro that we can use to detect it?
>
> https://github.com/google/sanitizers/issues/765 seems to say no (it could
> be outdated though), but they were asking for use cases to motivate adding
> one. Apparently there is a macro for clang, although I don't think it is
> fine-grained.
>
> Adding one to cppbuiltin.cc testing SANITIZE_UNREACHABLE looks easy, maybe
> we can do just this one, we don't need to go overboard and define macros
> for all possible suboptions of ubsan right now.

Yes, we should only add what there's a use case for.

> I don't think any of that prevents from pushing your patch as is for
> gcc-12.

Matthias didn't like my Princess Bride easter egg :-)
Would the attached be better?
commit e2b2cf6319406bc9cb9361962cf7c31b1848ebe8
Author: Jonathan Wakely 
Date:   Fri Apr 1 12:25:02 2022

libstdc++: Implement std::unreachable() for C++23 (P0627R6)

This defines std::unreachable as an assertion for debug mode, a trap
when _GLIBCXX_ASSERTIONS is defined, and __builtin_unreachable()
otherwise.

The reason for only using __builtin_trap() in the second case is to
avoid the overhead of setting up a call to __glibcxx_assert_fail that
should never happen.

UBsan can detect if __builtin_unreachable() is executed, so if a feature
test macro for that sanitizer is added, we could change to just use
__builtin_unreachable() when the sanitizer is enabled.

While thinking about what the debug assertion failure should print, I
noticed that the __glibcxx_assert_fail function doesn't check for null
pointers. This adds a check so we don't try to print them if null.

libstdc++-v3/ChangeLog:

* include/std/utility (unreachable): Define for C++23.
* include/std/version (__cpp_lib_unreachable): Define.
* src/c++11/debug.cc (__glibcxx_assert_fail): Check for valid
arguments. Handle only the function being given.
* testsuite/20_util/unreachable/1.cc: New test.
* testsuite/20_util/unreachable/version.cc: New test.

diff --git a/libstdc++-v3/include/std/utility b/libstdc++-v3/include/std/utility
index 0d7f8954c5a..ad5faa50f57 100644
--- a/libstdc++-v3/include/std/utility
+++ b/libstdc++-v3/include/std/utility
@@ -186,6 +186,32 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 constexpr underlying_type_t<_Tp>
 to_underlying(_Tp __value) noexcept
 { return static_cast<underlying_type_t<_Tp>>(__value); }
+
+#define __cpp_lib_unreachable 202202L
+  /// Informs the compiler that program control flow never reaches this point.
+  /**
+   * Evaluating a call to this function results in undefined behaviour.
+   * This can be used as an assertion informing the compiler that certain
+   * conditions are impossible, for when the compiler is unable to determine
+   * that by itself.
+   *
+   * For example, it can be used to prevent warnings about reaching the
+   * end of a non-void function without returning.
+   *
+   * @since C++23
+   */
+  [[noreturn,__gnu__::__always_inline__]]
+  inline void
+  unreachable()
+  {
+#ifdef _GLIBCXX_DEBUG
+std::__glibcxx_assert_fail(nullptr, 0, "std::unreachable()", nullptr);
+#elif defined _GLIBCXX_ASSERTIONS
+__builtin_trap();
+#else
+__builtin_unreachable();
+#endif
+  }
 #endif // C++23
 #endif // C++20
 #endif // C++17
diff --git a/libstdc++-v3/include/std/version b/libstdc++-v3/include/std/version
index 44b8a9f88b5..51f2110b68e 100644
--- a/libstdc++-v3/include/std/version
+++ b/libstdc++-v3/include/std/version
@@ -326,6 +326,7 @@
 # define __cpp_lib_string_resize_and_overwrite 202110L
 #endif
 #define __cpp_lib_to_underlying 202102L
+#define __cpp_lib_unreachable 202202L
 #endif
 #endif // C++2b
 #endif // C++20
diff --git a/libstdc++-v3/src/c++11/debug.cc b/libstdc++-v3/src/c++11/debug.cc
index 98fe2dcc153..4706defedf1 100644
--- a/libstdc++-v3/src/c++11/debug.cc
+++ b/libstdc++-v3/src/c++11/debug.cc
@@ -52,8 +52,11 @@ namespace std
   __glibcxx_assert_fail(const char* file, int line,
const char* function, const char* condition) noe

[PATCH v2] mips: Emit psabi diagnostic for return values affected by C++ zero-width bit-field ABI change [PR 102024]

2022-04-01 Thread Xi Ruoyao via Gcc-patches
v1 -> v2:

* "int has_zero_width_bf" -> "bool has_cxx_zero_width_bf".  "int" to
"bool" because the value is 0/1 only.  Add "cxx" because it only
indicates C++ zero-width bit-fields (not those bit-fields from C).

* Coding style fix.

* Rewrite mips_return_in_msb so mips_fpr_return_fields is not called
unnecessarily.

* "retcon" -> "change".

gcc/
PR target/102024
* mips.cc (mips_fpr_return_fields): Detect C++ zero-width
bit-fields and set up an indicator.
(mips_return_in_msb): Adapt for mips_fpr_return_fields change.
(mips_function_value_1): Diagnose when the presence of a C++
zero-width bit-field changes function returning in GCC 12.

gcc/testsuite/
PR target/102024
* g++.target/mips/mips.exp: New test supporting file.
* g++.target/mips/pr102024-1.C: New test.
---
 gcc/config/mips/mips.cc  | 58 
 gcc/testsuite/g++.target/mips/mips.exp   | 34 ++
 gcc/testsuite/g++.target/mips/pr102024.C | 20 
 3 files changed, 104 insertions(+), 8 deletions(-)
 create mode 100644 gcc/testsuite/g++.target/mips/mips.exp
 create mode 100644 gcc/testsuite/g++.target/mips/pr102024.C

diff --git a/gcc/config/mips/mips.cc b/gcc/config/mips/mips.cc
index 91e1e964f94..83860b5d4b7 100644
--- a/gcc/config/mips/mips.cc
+++ b/gcc/config/mips/mips.cc
@@ -6274,10 +6274,17 @@ mips_callee_copies (cumulative_args_t, const 
function_arg_info &arg)
 
For n32 & n64, a structure with one or two fields is returned in
floating-point registers as long as every field has a floating-point
-   type.  */
+   type.
+
+   The C++ FE used to remove zero-width bit-fields in GCC 11 and earlier.
+   To make a proper diagnostic, this function will set
+   HAS_CXX_ZERO_WIDTH_BF to true once a C++ zero-width bit-field shows up,
+   and then ignore it. Then the caller can determine if this zero-width
+   bit-field will make a difference and emit a -Wpsabi inform.  */
 
 static int
-mips_fpr_return_fields (const_tree valtype, tree *fields)
+mips_fpr_return_fields (const_tree valtype, tree *fields,
+   bool *has_cxx_zero_width_bf)
 {
   tree field;
   int i;
@@ -6294,6 +6301,12 @@ mips_fpr_return_fields (const_tree valtype, tree *fields)
   if (TREE_CODE (field) != FIELD_DECL)
continue;
 
+  if (DECL_FIELD_CXX_ZERO_WIDTH_BIT_FIELD (field))
+   {
+ *has_cxx_zero_width_bf = true;
+ continue;
+   }
+
   if (!SCALAR_FLOAT_TYPE_P (TREE_TYPE (field)))
return 0;
 
@@ -6318,12 +6331,14 @@ mips_fpr_return_fields (const_tree valtype, tree 
*fields)
 static bool
 mips_return_in_msb (const_tree valtype)
 {
-  tree fields[2];
+  if (!TARGET_NEWABI || !TARGET_BIG_ENDIAN || !AGGREGATE_TYPE_P (valtype))
+return false;
 
-  return (TARGET_NEWABI
- && TARGET_BIG_ENDIAN
- && AGGREGATE_TYPE_P (valtype)
- && mips_fpr_return_fields (valtype, fields) == 0);
+  tree fields[2];
+  bool has_cxx_zero_width_bf = false;
+  return (mips_fpr_return_fields (valtype, fields,
+ &has_cxx_zero_width_bf) == 0
+ || has_cxx_zero_width_bf);
 }
 
 /* Return true if the function return value MODE will get returned in a
@@ -6418,8 +6433,35 @@ mips_function_value_1 (const_tree valtype, const_tree 
fn_decl_or_type,
 return values, promote the mode here too.  */
   mode = promote_function_mode (valtype, mode, &unsigned_p, func, 1);
 
+  bool has_cxx_zero_width_bf = false;
+  int use_fpr = mips_fpr_return_fields (valtype, fields,
+   &has_cxx_zero_width_bf);
+  if (TARGET_HARD_FLOAT
+ && warn_psabi
+ && has_cxx_zero_width_bf
+ && use_fpr != 0)
+   {
+ static unsigned last_reported_type_uid;
+ unsigned uid = TYPE_UID (TYPE_MAIN_VARIANT (valtype));
+ if (uid != last_reported_type_uid)
+   {
+ static const char *url
+   = CHANGES_ROOT_URL
+ "gcc-12/changes.html#zero_width_bitfields";
+ inform (input_location,
+ "the ABI for returning a value containing "
+ "zero-width bit-fields but otherwise an aggregate "
+ "with only one or two floating-point fields was "
+ "changed in GCC %{12.1%}", url);
+ last_reported_type_uid = uid;
+   }
+   }
+
+  if (has_cxx_zero_width_bf)
+   use_fpr = 0;
+
   /* Handle structures whose fields are returned in $f0/$f2.  */
-  switch (mips_fpr_return_fields (valtype, fields))
+  switch (use_fpr)
{
case 1:
  return mips_return_fpr_single (mode,
diff --git a/gcc/testsuite/g++.target/mips/mips.exp 
b/gcc/testsuite/g++.target/mips/mips.exp
new file mode 100644
index 000..9fa7e771b4d
--- /dev/null
+++ b/gcc/testsuite/g++.target/mips/mips.exp
@@ -0,0 +1,34 @@
+# Copyright (C) 2019-2022 Free 

Re: [PATCH v2] mips: Emit psabi diagnostic for return values affected by C++ zero-width bit-field ABI change [PR 102024]

2022-04-01 Thread Jakub Jelinek via Gcc-patches
On Fri, Apr 01, 2022 at 07:38:59PM +0800, Xi Ruoyao wrote:
> v1 -> v2:
> 
> * "int has_zero_width_bf" -> "bool has_cxx_zero_width_bf".  "int" to
> "bool" because the value is 0/1 only.  Add "cxx" because it only
> indicates C++ zero-width bit-fields (not those bit-fields from C).
> 
> * Coding style fix.
> 
> * Rewrite mips_return_in_msb so mips_fpr_return_fields is not called
> unnecessarily.
> 
> * "retcon" -> "change".
> 
> gcc/
>   PR target/102024
>   * mips.cc (mips_fpr_return_fields): Detect C++ zero-width
>   bit-fields and set up an indicator.
>   (mips_return_in_msb): Adapt for mips_fpr_return_fields change.
>   (mips_function_value_1): Diagnose when the presence of a C++
>   zero-width bit-field changes function returning in GCC 12.
> 
> gcc/testsuite/
>   PR target/102024
>   * g++.target/mips/mips.exp: New test supporting file.
>   * g++.target/mips/pr102024-1.C: New test.

LGTM, thanks.

Jakub



Re: [RFC] ipa-cp: Feed results of IPA-CP into SCCVN

2022-04-01 Thread Richard Biener via Gcc-patches
On Fri, 1 Apr 2022, Martin Jambor wrote:

> Hi,
> 
> PRs 68930 and 92497 show that when IPA-CP figures out constants in
> aggregate parameters or when passed by reference but the loads happen
> in an inlined function the information is lost.  This happens even
> when the inlined function itself was known to have - or even cloned to
> have - such constants in incoming parameters because the transform
> phase of IPA passes is not run on them.  See discussion in the bugs
> for reasons why.
> 
> Honza suggested that we can plug the results of IPA-CP analysis into
> value numbering, so that FRE can figure out that some loads fetch
> known constants.  This is what this patch attempts to do.
> 
> Although I spent quite some time reading tree-sccvn.c, it is complex
> enough that I am sure I am not aware of various caveats and so I would
> not be terribly surprised if there were some issues with my approach
> that I am not aware of.  Nevertheless, it seems to work well for simple
> cases and even passes bootstrap and testing (and LTO bootstrap) on
> x86_64-linux.
> 
> I have experimented a little with using this approach instead of the
> function walking parts of the IPA-CP transformation phase.  This would
> mean that the known-constants would not participate in the passes after
> IPA but before FRE - which are not many but there is a ccp and fwprop
> pass among others.  For simple testcases like
> gcc/testsuite/gcc.dg/ipa/ipcp-agg-*.c, it makes no assembly difference
> at all.
> 
> What do you think?

Comments below

> Martin
> 
> 
> gcc/ChangeLog:
> 
> 2022-03-30  Martin Jambor  
> 
>   PR ipa/68930
>   PR ipa/92497
>   * ipa-prop.cc (ipcp_get_aggregate_const): New function.
>   (ipcp_transform_function): Do not deallocate transformation info.
>   * ipa-prop.h (ipcp_get_aggregate_const): Declare.
>   * tree-ssa-sccvn.cc: Include alloc-pool.h, symbol-summary.h and
>   ipa-prop.h.
>   (vn_reference_lookup_2): When hitting default-def vuse, query
>   IPA-CP transformation info for any known constants.
> 
> gcc/testsuite/ChangeLog:
> 
> 2022-03-30  Martin Jambor  
> 
>   PR ipa/68930
>   PR ipa/92497
>   * gcc.dg/ipa/pr92497-1.c: New test.
>   * gcc.dg/ipa/pr92497-2.c: Likewise.
> ---
>  gcc/ipa-prop.cc  | 43 
>  gcc/ipa-prop.h   |  2 ++
>  gcc/testsuite/gcc.dg/ipa/pr92497-1.c | 26 +
>  gcc/testsuite/gcc.dg/ipa/pr92497-2.c | 26 +
>  gcc/tree-ssa-sccvn.cc| 35 +-
>  5 files changed, 126 insertions(+), 6 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.dg/ipa/pr92497-1.c
>  create mode 100644 gcc/testsuite/gcc.dg/ipa/pr92497-2.c
> 
> diff --git a/gcc/ipa-prop.cc b/gcc/ipa-prop.cc
> index e55fe2776f2..a73a5d9ec1d 100644
> --- a/gcc/ipa-prop.cc
> +++ b/gcc/ipa-prop.cc
> @@ -5748,6 +5748,44 @@ ipcp_modif_dom_walker::before_dom_children 
> (basic_block bb)
>return NULL;
>  }
>  
> +/* If IPA-CP discovered a constant in parameter PARM at OFFSET of a given 
> SIZE
> +   - whether passed by reference or not is given by BY_REF - return that
> +   constant.  Otherwise return NULL_TREE.  */
> +
> +tree
> +ipcp_get_aggregate_const (tree parm, bool by_ref,
> +   HOST_WIDE_INT offset, HOST_WIDE_INT size)

I'd prefer to pass in the function decl or struct function or
cgraph node.

> +{
> +  cgraph_node *cnode = cgraph_node::get (current_function_decl);
> +
> +  ipa_agg_replacement_value *aggval = ipa_get_agg_replacements_for_node 
> (cnode);
> +  if (!aggval)
> +return NULL_TREE;
> +
> +  int index = 0;
> +  for (tree p = DECL_ARGUMENTS (current_function_decl);
> +   p != parm; p = DECL_CHAIN (p))
> +{
> +  index++;
> +  if (!p)
> + return NULL_TREE;
> +}
> +
> +  ipa_agg_replacement_value *v;
> +  for (v = aggval; v; v = v->next)
> +if (v->index == index
> + && v->offset == offset)
> +  break;
> +  if (!v
> +  || v->by_ref != by_ref
> +  || maybe_ne (tree_to_poly_int64 (TYPE_SIZE (TREE_TYPE (v->value))),
> +size))
> +return NULL_TREE;

two linear searches here - ugh.  I wonder if we should instead
pre-fill a hash-map from PARM_DECL to a ipa_agg_replacement_value *
vector sorted by offset which we can binary search?  That could be
done once when starting value-numbering (not on regions).  Is
there any reason the data structure is as it is?  It seems that
even ipcp_modif_dom_walker::before_dom_children will do a linear
walk and ipa_get_param_decl_index_1 also linearly walks parameters.

That looks highly sub-optimal to me, it's also done for each
mention of a parameter.
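
For illustration only, the pre-filled lookup suggested above could look
roughly like the sketch below.  It uses plain standard-library containers
rather than GCC's own hash_map/vec, and AggConst, AggConstIndex and the
integer parameter index are hypothetical stand-ins for
ipa_agg_replacement_value and PARM_DECL; none of these names exist in GCC.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <optional>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for one known aggregate constant of a parameter.
struct AggConst
{
  std::int64_t offset;  // bit offset inside the aggregate
  std::int64_t size;    // size of the constant in bits
  bool by_ref;          // aggregate passed by reference?
  long value;           // the constant itself (a tree in GCC)
};

// Built once per function, before value numbering starts.
class AggConstIndex
{
public:
  void add (int parm_index, const AggConst &c)
  {
    table_[parm_index].push_back (c);
    finalized_ = false;
  }

  // Sort every per-parameter vector by offset so lookups can bisect.
  void finalize ()
  {
    for (auto &entry : table_)
      std::sort (entry.second.begin (), entry.second.end (),
                 [] (const AggConst &a, const AggConst &b)
                 { return a.offset < b.offset; });
    finalized_ = true;
  }

  // One hash lookup plus one binary search instead of two linear walks.
  std::optional<long> lookup (int parm_index, bool by_ref,
                              std::int64_t offset, std::int64_t size) const
  {
    assert (finalized_);
    auto it = table_.find (parm_index);
    if (it == table_.end ())
      return std::nullopt;
    const std::vector<AggConst> &v = it->second;
    auto pos = std::lower_bound (v.begin (), v.end (), offset,
                                 [] (const AggConst &c, std::int64_t off)
                                 { return c.offset < off; });
    if (pos == v.end () || pos->offset != offset
        || pos->by_ref != by_ref || pos->size != size)
      return std::nullopt;
    return pos->value;
  }

private:
  std::unordered_map<int, std::vector<AggConst>> table_;
  bool finalized_ = false;
};
```

The point of the sketch is only the shape of the data structure: the
cost per parameter mention drops from linear in the number of parameters
and replacement values to a hash lookup plus a logarithmic search.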

> +  return v->value;
> +}
> +
> +
>  /* Return true if we have recorded VALUE and MASK about PARM.
> Set VALUE and MASk accordingly.  */
>  
> @@ -6055,11 +6093,6 @@ ipcp_transform_function (struct cgraph_node *node)
>  free_ipa_bb_info (bi);
>fbi.bb_infos.release ();
>  
> 

Re: [PATCH] libstdc++: Implement std::unreachable() for C++23 (P0627R6)

2022-04-01 Thread Matthias Kretz via Gcc-patches
On Friday, 1 April 2022 13:33:42 CEST Jonathan Wakely wrote:
> Matthias didn't like my Princess Bride easter egg :-)
> Would the attached be better?

LGTM.

-- 
──
 Dr. Matthias Kretz   https://mattkretz.github.io
 GSI Helmholtz Centre for Heavy Ion Research   https://gsi.de
 stdₓ::simd
──


[PATCH v2] Ignore zero width fields in arguments and issue -Wpsabi warning about C zero-width field ABI changes [PR102024]

2022-04-01 Thread Xi Ruoyao via Gcc-patches
v1 -> v2:

* "int has_zero_width_bf_abi_change" -> "bool
zero_width_field_abi_change".  "int" -> "bool" because it's only 0/1,
"bf" -> "field" because the change also affects zero-length arrays and
empty structs/unions, etc.

* Add tests with zero-length array and empty struct.

* Coding style fix.

* "#zero_width_bitfields" -> "#mips_zero_width_fields" because this is
not exactly the same change documented by #zero_width_bitfields.  I'll
send a wwwdoc patch after this is approved.

gcc/
PR target/102024
* mips.cc (mips_function_arg): Ignore zero-width fields, and
inform if it causes a psABI change.

gcc/testsuite/
PR target/102024
* gcc.target/mips/pr102024-1.c: New test.
* gcc.target/mips/pr102024-2.c: New test.
* gcc.target/mips/pr102024-3.c: New test.
---
 gcc/config/mips/mips.cc| 46 --
 gcc/testsuite/gcc.target/mips/pr102024-1.c | 20 ++
 gcc/testsuite/gcc.target/mips/pr102024-2.c | 20 ++
 gcc/testsuite/gcc.target/mips/pr102024-3.c | 20 ++
 4 files changed, 102 insertions(+), 4 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/mips/pr102024-1.c
 create mode 100644 gcc/testsuite/gcc.target/mips/pr102024-2.c
 create mode 100644 gcc/testsuite/gcc.target/mips/pr102024-3.c

diff --git a/gcc/config/mips/mips.cc b/gcc/config/mips/mips.cc
index 83860b5d4b7..7681983186c 100644
--- a/gcc/config/mips/mips.cc
+++ b/gcc/config/mips/mips.cc
@@ -6042,11 +6042,27 @@ mips_function_arg (cumulative_args_t cum_v, const 
function_arg_info &arg)
  for (i = 0; i < info.reg_words; i++)
{
  rtx reg;
+ bool zero_width_field_abi_change = false;
 
  for (; field; field = DECL_CHAIN (field))
-   if (TREE_CODE (field) == FIELD_DECL
-   && int_bit_position (field) >= bitpos)
- break;
+   {
+ if (TREE_CODE (field) != FIELD_DECL)
+   continue;
+
+ /* Ignore zero-width fields.  And, if the ignored
+field is not a C++ zero-width bit-field, it may be
+an ABI change.  */
+ if (DECL_FIELD_CXX_ZERO_WIDTH_BIT_FIELD (field))
+   continue;
+ if (integer_zerop (DECL_SIZE (field)))
+   {
+ zero_width_field_abi_change = true;
+ continue;
+   }
+
+ if (int_bit_position (field) >= bitpos)
+   break;
+   }
 
  if (field
  && int_bit_position (field) == bitpos
@@ -6054,7 +6070,29 @@ mips_function_arg (cumulative_args_t cum_v, const 
function_arg_info &arg)
  && TYPE_PRECISION (TREE_TYPE (field)) == BITS_PER_WORD)
reg = gen_rtx_REG (DFmode, FP_ARG_FIRST + info.reg_offset + i);
  else
-   reg = gen_rtx_REG (DImode, GP_ARG_FIRST + info.reg_offset + i);
+   {
+ reg = gen_rtx_REG (DImode,
+GP_ARG_FIRST + info.reg_offset + i);
+ zero_width_field_abi_change = false;
+   }
+
+ if (zero_width_field_abi_change && warn_psabi)
+   {
+ static unsigned last_reported_type_uid;
+ unsigned uid = TYPE_UID (TYPE_MAIN_VARIANT (arg.type));
+ if (uid != last_reported_type_uid)
+   {
+ static const char *url
+   = CHANGES_ROOT_URL
+ "gcc-12/changes.html#mips_zero_width_fields";
+ inform (input_location,
+ "the ABI for passing a value containing "
+ "zero-width fields before an adjacent "
+ "64-bit floating-point field was changed "
+ "in GCC %{12.1%}", url);
+ last_reported_type_uid = uid;
+   }
+   }
 
  XVECEXP (ret, 0, i)
= gen_rtx_EXPR_LIST (VOIDmode, reg,
diff --git a/gcc/testsuite/gcc.target/mips/pr102024-1.c 
b/gcc/testsuite/gcc.target/mips/pr102024-1.c
new file mode 100644
index 000..cf442863fc2
--- /dev/null
+++ b/gcc/testsuite/gcc.target/mips/pr102024-1.c
@@ -0,0 +1,20 @@
+// PR target/102024
+// { dg-do compile }
+// { dg-options "-mabi=64 -mhard-float" }
+// { dg-final { scan-assembler "\\\$f12" } }
+
+struct foo
+{
+  int : 0;
+  double a;
+};
+
+extern void func(struct foo);
+
+void
+pass_foo(void)
+{
+  struct foo test;
+  test.a = 114;
+  func(test); // { dg-message "the ABI for passing a value containing 
zero-width fields before an adjacent 64-bit floating-point field was changed in 
GCC 12.1" }
+}
diff --git a/gcc/testsuite/gcc.target/mips/pr102024-2.c 
b/gcc/testsuite/gcc.target/mips/pr102024-2.c
new file mode 100644
index 000..8

[committed] libstdc++: Fix mismatched noexcept-specifiers in Filesystem TS

2022-04-01 Thread Jonathan Wakely via Gcc-patches
Tested powerpc64le-linux, pushed to trunk.

-- >8 --

The copy_file fix should have been part of r12-7063-gda72e0fd20f87b.

The path::begin() fix should have been part of r12-3930-gf2b7f56a15d9cb.
Thanks to Timm Bäder for reporting this one.

libstdc++-v3/ChangeLog:

* include/experimental/bits/fs_fwd.h (copy_file): Remove
incorrect noexcept from declaration.
* include/experimental/bits/fs_path.h (path::begin, path::end):
Add noexcept to declarations, to match definitions.
---
 libstdc++-v3/include/experimental/bits/fs_fwd.h  | 2 +-
 libstdc++-v3/include/experimental/bits/fs_path.h | 4 ++--
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/libstdc++-v3/include/experimental/bits/fs_fwd.h 
b/libstdc++-v3/include/experimental/bits/fs_fwd.h
index d568e9e3a73..c8fbcbc1679 100644
--- a/libstdc++-v3/include/experimental/bits/fs_fwd.h
+++ b/libstdc++-v3/include/experimental/bits/fs_fwd.h
@@ -280,7 +280,7 @@ _GLIBCXX_END_NAMESPACE_CXX11
 
   bool copy_file(const path& __from, const path& __to, copy_options __option);
   bool copy_file(const path& __from, const path& __to, copy_options __option,
-error_code&) noexcept;
+error_code&);
 
   path current_path();
 
diff --git a/libstdc++-v3/include/experimental/bits/fs_path.h 
b/libstdc++-v3/include/experimental/bits/fs_path.h
index a050749676d..803df424664 100644
--- a/libstdc++-v3/include/experimental/bits/fs_path.h
+++ b/libstdc++-v3/include/experimental/bits/fs_path.h
@@ -425,8 +425,8 @@ namespace __detail
 class iterator;
 typedef iterator const_iterator;
 
-iterator begin() const;
-iterator end() const;
+iterator begin() const noexcept;
+iterator end() const noexcept;
 
 /// @cond undocumented
 // Create a basic_string by reading until a null character.
-- 
2.34.1



Re: [PATCH][libgomp, testsuite, nvptx] Limit recursion in declare_target-{1,2}.f90

2022-04-01 Thread Thomas Schwinge
Hi Tom!

On 2022-04-01T13:24:40+0200, Tom de Vries  wrote:
> When running testcases libgomp.fortran/examples-4/declare_target-{1,2}.f90 on
> an RTX A2000 (sm_86) with driver 510.60.02 and with GOMP_NVPTX_JIT=-O0 I run
> into:
> ...
> FAIL: libgomp.fortran/examples-4/declare_target-1.f90 -O0 \
>   -DGOMP_NVPTX_JIT=-O0 execution test
> FAIL: libgomp.fortran/examples-4/declare_target-2.f90 -O0 \
>   -DGOMP_NVPTX_JIT=-O0 execution test
> ...
>
> Fix this by further limiting recursion depth in the test-cases for nvptx.
>
> Furthermore, make the recursion depth limiting nvptx-specific.

Careful:

> --- a/libgomp/testsuite/libgomp.fortran/examples-4/declare_target-1.f90
> +++ b/libgomp/testsuite/libgomp.fortran/examples-4/declare_target-1.f90
> @@ -1,4 +1,16 @@
>  ! { dg-do run }
> +! { dg-additional-options "-cpp" }
> +! Reduced from 25 to 23, otherwise execution runs out of thread stack on
> +! Nvidia Titan V.
> +! Reduced from 23 to 22, otherwise execution runs out of thread stack on
> +! Nvidia T400 (2GB variant), when run with GOMP_NVPTX_JIT=-O0.
> +! Reduced from 22 to 20, otherwise execution runs out of thread stack on
> +! Nvidia RTX A2000 (6GB variant), when run with GOMP_NVPTX_JIT=-O0.
> +! { dg-additional-options "-DREC_DEPTH=20" { target { offload_target_nvptx } 
> } } */

'offload_target_nvptx' doesn't mean that offloading execution is done on
nvptx, but rather that we're "*compiling* for offload target nvptx"
(emphasis mine).  That means, with such a change we're now getting
different behavior in a system with an AMD GPU, when using a toolchain
that only has GCN offloading configured vs. a toolchain that has GCN and
nvptx offloading configured.  This isn't going to cause any real
problems, of course, but it's confusing, and a bad example of
'offload_target_nvptx'.

'offload_device_nvptx' ought to work: "using nvptx offload device".

But to keep things simple, I again suggest unconditionally reducing
the recursion depth for all configurations, unless there is an
actual rationale for the original value.
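
As a side note, the overridable-default idiom that the patch relies on
(a macro with a fallback value, optionally replaced by -DREC_DEPTH=N on
the command line) can be sketched in C++ as below.  This is only an
illustration: fib and fib_iterative are hypothetical stand-ins, not the
functions from the Fortran test-cases.

```cpp
#include <cassert>

// Default depth; a build can override it with e.g. -DREC_DEPTH=20.
#ifndef REC_DEPTH
#define REC_DEPTH 25
#endif

// Naive recursive Fibonacci; its stack use grows with the depth.
constexpr long fib (int n)
{
  return n < 2 ? n : fib (n - 1) + fib (n - 2);
}

// Iterative reference used to cross-check the recursive result.
inline long fib_iterative (int n)
{
  assert (n >= 0);
  long a = 0, b = 1;
  for (int i = 0; i < n; ++i)
    {
      long next = a + b;
      a = b;
      b = next;
    }
  return a;
}

// Mirrors the shape of the test: compare both results at REC_DEPTH.
inline bool results_match ()
{
  return fib (REC_DEPTH) == fib_iterative (REC_DEPTH);
}
```

With a single unconditional value there would be no macro at all, which
is the simplification suggested above.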


Grüße
 Thomas


> +
> +#ifndef REC_DEPTH
> +#define REC_DEPTH 25
> +#endif
>
>  module e_53_1_mod
>integer :: THRESHOLD = 20
> @@ -27,9 +39,5 @@ end module
>  program e_53_1
>use e_53_1_mod, only : fib, fib_wrapper
>if (fib (15) /= fib_wrapper (15)) stop 1
> -  ! Reduced from 25 to 23, otherwise execution runs out of thread stack on
> -  ! Nvidia Titan V.
> -  ! Reduced from 23 to 22, otherwise execution runs out of thread stack on
> -  ! Nvidia T400 (2GB variant), when run with GOMP_NVPTX_JIT=-O0.
> -  if (fib (22) /= fib_wrapper (22)) stop 2
> +  if (fib (REC_DEPTH) /= fib_wrapper (REC_DEPTH)) stop 2
>  end program

> --- a/libgomp/testsuite/libgomp.fortran/examples-4/declare_target-2.f90
> +++ b/libgomp/testsuite/libgomp.fortran/examples-4/declare_target-2.f90
> @@ -1,16 +1,24 @@
>  ! { dg-do run }
> +! { dg-additional-options "-cpp" }
> +! Reduced from 25 to 23, otherwise execution runs out of thread stack on
> +! Nvidia Titan V.
> +! Reduced from 23 to 22, otherwise execution runs out of thread stack on
> +! Nvidia T400 (2GB variant), when run with GOMP_NVPTX_JIT=-O0.
> +! Reduced from 22 to 18, otherwise execution runs out of thread stack on
> +! Nvidia RTX A2000 (6GB variant), when run with GOMP_NVPTX_JIT=-O0.
> +! { dg-additional-options "-DREC_DEPTH=18" { target { offload_target_nvptx } 
> } } */
> +
> +#ifndef REC_DEPTH
> +#define REC_DEPTH 25
> +#endif
>
>  program e_53_2
>!$omp declare target (fib)
>integer :: x, fib
>!$omp target map(from: x)
> -! Reduced from 25 to 23, otherwise execution runs out of thread stack on
> -! Nvidia Titan V.
> -! Reduced from 23 to 22, otherwise execution runs out of thread stack on
> -! Nvidia T400 (2GB variant), when run with GOMP_NVPTX_JIT=-O0.
> -x = fib (22)
> +x = fib (REC_DEPTH)
>!$omp end target
> -  if (x /= fib (22)) stop 1
> +  if (x /= fib (REC_DEPTH)) stop 1
>  end program
>
>  integer recursive function fib (n) result (f)


Re: [PATCH v2] Ignore zero width fields in arguments and issue -Wpsabi warning about C zero-width field ABI changes [PR102024]

2022-04-01 Thread Jakub Jelinek via Gcc-patches
On Fri, Apr 01, 2022 at 08:11:43PM +0800, Xi Ruoyao wrote:
> v1 -> v2:
> 
> * "int has_zero_width_bf_abi_change" -> "bool
> zero_width_field_abi_change".  "int" -> "bool" because it's only 0/1,
> "bf" -> "field" because the change also affects zero-length arrays and
> empty structs/unions, etc.
> 
> * Add tests with zero-length array and empty struct.
> 
> * Coding style fix.
> 
> * "#zero_width_bitfields" -> "#mips_zero_width_fields" because this is
> not exactly the same change documented by #zero_width_bitfields.  I'll
> send a wwwdoc patch after this is approved.
> 
> gcc/
>   PR target/102024
>   * mips.cc (mips_function_arg): Ignore zero-width fields, and
>   inform if it causes a psABI change.
> 
> gcc/testsuite/
>   PR target/102024
>   * gcc.target/mips/pr102024-1.c: New test.
>   * gcc.target/mips/pr102024-2.c: New test.
>   * gcc.target/mips/pr102024-3.c: New test.

LGTM, thanks.

Jakub



Re: [PATCH] libstdc++: Implement std::unreachable() for C++23 (P0627R6)

2022-04-01 Thread Jonathan Wakely via Gcc-patches
On Fri, 1 Apr 2022 at 12:56, Matthias Kretz wrote:
>
> On Friday, 1 April 2022 13:33:42 CEST Jonathan Wakely wrote:
> > Matthias didn't like my Princess Bride easter egg :-)
> > Would the attached be better?
>
> LGTM.

OK, thanks to everybody who commented. I've pushed that to trunk now.
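
For readers who want to see the pattern in isolation, here is a minimal
sketch of the three-way selection the commit message describes, together
with the typical use after an exhaustive switch.  It is not the libstdc++
code: my_unreachable, MY_DEBUG and MY_ASSERTIONS are hypothetical names
standing in for std::unreachable, _GLIBCXX_DEBUG and _GLIBCXX_ASSERTIONS,
and the builtins assume GCC or Clang.

```cpp
#include <cassert>
#include <cstdlib>
#include <string_view>

// Hypothetical analogue of std::unreachable():
//  - debug build: fail loudly with a message,
//  - assertions build: trap without setting up a diagnostic call,
//  - otherwise: give the optimizer the "never reached" hint.
[[noreturn]] inline void my_unreachable ()
{
#if defined(MY_DEBUG)
  assert (!"my_unreachable() was reached");
  std::abort ();  // still terminate if NDEBUG disabled the assert
#elif defined(MY_ASSERTIONS)
  __builtin_trap ();
#else
  __builtin_unreachable ();
#endif
}

enum class Colour { red, green, blue };

// Every enumerator returns, so control never falls off the switch;
// the call documents that and avoids a missing-return warning.
inline std::string_view colour_name (Colour c)
{
  switch (c)
    {
    case Colour::red:
      return "red";
    case Colour::green:
      return "green";
    case Colour::blue:
      return "blue";
    }
  my_unreachable ();
}
```

Calling colour_name with a value outside the enumerators is undefined
behaviour in the default configuration, which is exactly the contract
std::unreachable() expresses.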



Re: [PATCH] JIT: Update docs v2

2022-04-01 Thread David Malcolm via Gcc-patches
On Sat, 2022-02-19 at 15:34 +, Petter Tomner via Gcc-patches wrote:
> Please disregard the prior patch I sent a minute ago. I spotted a
> copy-paste error, corrected below:
> 
> From 7f1d849319318a4cfd304279840899f928f9b86d Mon Sep 17 00:00:00 2001
> From: Petter Tomner 
> Date: Sat, 19 Feb 2022 16:01:54 +0100
> Subject: [PATCH] jit: Update docs
> 
> Update docs concerning linking and fix formatting errors. 'make html'
> looks fine.
> 
> /gcc/jit/docs/topics:
> * compatibility.rst: Add 19 tag
> * compilation.rst: Linking
> * contexts.rst: Linking example
> * expressions.rst: Fix formatting and dropped 's'
> 
> Signed-off-by:
> Petter Tomner   2022-02-19  
> ---
>  gcc/jit/docs/topics/compatibility.rst | 12 
>  gcc/jit/docs/topics/compilation.rst   |  8 ++--
>  gcc/jit/docs/topics/contexts.rst  |  5 +
>  gcc/jit/docs/topics/expressions.rst   | 15 ++-
>  4 files changed, 29 insertions(+), 11 deletions(-)


Thanks for these fixes, and sorry for the delay in reviewing them.

I doublechecked them, and have pushed them to trunk as
r12-7958-gaed0f014781ee3.

Dave



Re: [RFC] ipa-cp: Feed results of IPA-CP into SCCVN

2022-04-01 Thread Martin Jambor
Hi,

thanks for a very quick reply.

On Fri, Apr 01 2022, Richard Biener wrote:
> On Fri, 1 Apr 2022, Martin Jambor wrote:
>
>> Hi,
>> 
>> PRs 68930 and 92497 show that when IPA-CP figures out constants in
>> aggregate parameters or when passed by reference but the loads happen
>> in an inlined function the information is lost.  This happens even
>> when the inlined function itself was known to have - or even cloned to
>> have - such constants in incoming parameters because the transform
>> phase of IPA passes is not run on them.  See discussion in the bugs
>> for reasons why.
>> 
>> Honza suggested that we can plug the results of IPA-CP analysis into
>> value numbering, so that FRE can figure out that some loads fetch
>> known constants.  This is what this patch attempts to do.
>> 
>> Although I spent quite some time reading tree-sccvn.c, it is complex
>> enough that I am sure I am not aware of various caveats and so I would
>> not be terribly surprised if there were some issues with my approach
>> that I am not aware of.  Nevertheless, it seems to work well for simple
>> cases and even passes bootstrap and testing (and LTO bootstrap) on
>> x86_64-linux.
>> 
>> I have experimented a little with using this approach instead of the
>> function walking parts of the IPA-CP transformation phase.  This would
>> mean that the known-constants would not participate in the passes after
>> IPA but before FRE - which are not many but there is a ccp and fwprop
>> pass among others.  For simple testcases like
>> gcc/testsuite/gcc.dg/ipa/ipcp-agg-*.c, it makes no assembly difference
>> at all.
>> 
>> What do you think?
>
> Comments below
>
>> Martin
>> 
>> 
>> gcc/ChangeLog:
>> 
>> 2022-03-30  Martin Jambor  
>> 
>>  PR ipa/68930
>>  PR ipa/92497
>>  * ipa-prop.cc (ipcp_get_aggregate_const): New function.
>>  (ipcp_transform_function): Do not deallocate transformation info.
>>  * ipa-prop.h (ipcp_get_aggregate_const): Declare.
>>  * tree-ssa-sccvn.cc: Include alloc-pool.h, symbol-summary.h and
>>  ipa-prop.h.
>>  (vn_reference_lookup_2): When hitting default-def vuse, query
>>  IPA-CP transformation info for any known constants.
>> 
>> gcc/testsuite/ChangeLog:
>> 
>> 2022-03-30  Martin Jambor  
>> 
>>  PR ipa/68930
>>  PR ipa/92497
>>  * gcc.dg/ipa/pr92497-1.c: New test.
>>  * gcc.dg/ipa/pr92497-2.c: Likewise.
>> ---
>>  gcc/ipa-prop.cc  | 43 
>>  gcc/ipa-prop.h   |  2 ++
>>  gcc/testsuite/gcc.dg/ipa/pr92497-1.c | 26 +
>>  gcc/testsuite/gcc.dg/ipa/pr92497-2.c | 26 +
>>  gcc/tree-ssa-sccvn.cc| 35 +-
>>  5 files changed, 126 insertions(+), 6 deletions(-)
>>  create mode 100644 gcc/testsuite/gcc.dg/ipa/pr92497-1.c
>>  create mode 100644 gcc/testsuite/gcc.dg/ipa/pr92497-2.c
>> 
>> diff --git a/gcc/ipa-prop.cc b/gcc/ipa-prop.cc
>> index e55fe2776f2..a73a5d9ec1d 100644
>> --- a/gcc/ipa-prop.cc
>> +++ b/gcc/ipa-prop.cc
>> @@ -5748,6 +5748,44 @@ ipcp_modif_dom_walker::before_dom_children (basic_block bb)
>>return NULL;
>>  }
>>  
>> +/* If IPA-CP discovered a constant in parameter PARM at OFFSET of a given SIZE
>> +   - whether passed by reference or not is given by BY_REF - return that
>> +   constant.  Otherwise return NULL_TREE.  */
>> +
>> +tree
>> +ipcp_get_aggregate_const (tree parm, bool by_ref,
>> +  HOST_WIDE_INT offset, HOST_WIDE_INT size)
>
> I'd prefer to pass in the function decl or struct function or
> cgraph node.

OK.

>
>> +{
>> +  cgraph_node *cnode = cgraph_node::get (current_function_decl);
>> +
>> +  ipa_agg_replacement_value *aggval = ipa_get_agg_replacements_for_node (cnode);
>> +  if (!aggval)
>> +return NULL_TREE;
>> +
>> +  int index = 0;
>> +  for (tree p = DECL_ARGUMENTS (current_function_decl);
>> +   p != parm; p = DECL_CHAIN (p))
>> +{
>> +  index++;
>> +  if (!p)
>> +return NULL_TREE;
>> +}
>> +
>> +  ipa_agg_replacement_value *v;
>> +  for (v = aggval; v; v = v->next)
>> +if (v->index == index
>> +&& v->offset == offset)
>> +  break;
>> +  if (!v
>> +  || v->by_ref != by_ref
>> +  || maybe_ne (tree_to_poly_int64 (TYPE_SIZE (TREE_TYPE (v->value))),
>> +   size))
>> +return NULL_TREE;
>
> two linear searches here - ugh.  I wonder if we should instead
> pre-fill a hash-map from PARM_DECL to a ipa_agg_replacement_value *
> vector sorted by offset which we can binary search?  That could be
> done once when starting value-numbering (not on regions).  Is
> there any reason the data structure is as it is?

Only that it is usually a very short list.  It is bounded by
param_ipa_cp_value_list_size (8 by default) times the number of
arguments and of course only few usually have any constants in them.
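
A minimal sketch of the structure Richard describes, for illustration only:
a map from parameter to a vector of known constants, sorted by offset once
and then binary searched.  Everything here is hypothetical: AggConst is a
stand-in for ipa_agg_replacement_value, the parameter is identified by its
index rather than by PARM_DECL, and the constant is a plain long rather than
a tree.  It is not GCC code and not what the patch does.

```cpp
#include <algorithm>
#include <cstdint>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for ipa_agg_replacement_value (not the GCC type).
struct AggConst {
  std::int64_t offset;  // offset of the constant within the aggregate
  std::int64_t size;    // size of the constant
  bool by_ref;          // whether the aggregate is passed by reference
  long value;           // stand-in for the constant itself
};

// Known constants per parameter, each vector sorted by offset.
class AggConstIndex {
 public:
  void add(int parm_index, const AggConst &c) {
    by_parm_[parm_index].push_back(c);
  }

  // Sort every vector once, e.g. when value numbering starts.
  void finalize() {
    for (auto &kv : by_parm_)
      std::sort(kv.second.begin(), kv.second.end(),
                [](const AggConst &a, const AggConst &b) {
                  return a.offset < b.offset;
                });
  }

  // Return the entry matching all four keys, or nullptr if there is none.
  const AggConst *lookup(int parm_index, bool by_ref,
                         std::int64_t offset, std::int64_t size) const {
    auto it = by_parm_.find(parm_index);
    if (it == by_parm_.end())
      return nullptr;
    const std::vector<AggConst> &v = it->second;
    // Binary search for the first entry whose offset is >= OFFSET.
    auto pos = std::lower_bound(v.begin(), v.end(), offset,
                                [](const AggConst &c, std::int64_t off) {
                                  return c.offset < off;
                                });
    if (pos == v.end() || pos->offset != offset
        || pos->by_ref != by_ref || pos->size != size)
      return nullptr;
    return &*pos;
  }

 private:
  std::unordered_map<int, std::vector<AggConst>> by_parm_;
};
```

With lists as short as those described above, the two linear walks may well
be just as fast in practice; the sketch only shows the shape of the
alternative.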

Having said that, changing the structure is something I am looking into
also for other reasons and I am very much open to n

[committed] jit: further doc fixes

2022-04-01 Thread David Malcolm via Gcc-patches
Further jit doc fixes, which fix links to
gcc_jit_function_type_get_param_type and gcc_jit_struct_get_field.

I also regenerated libgccjit.texi (not included in the diff below).

Tested with "make html" and with a bootstrap.
Committed to trunk as r12-7959-g1a172da8a3f362.

gcc/jit/ChangeLog:
* docs/topics/expressions.rst: Fix formatting.
* docs/topics/types.rst: Likewise.
* docs/_build/texinfo/libgccjit.texi: Regenerate

Signed-off-by: David Malcolm 
---
 gcc/jit/docs/topics/expressions.rst | 8 
 gcc/jit/docs/topics/types.rst   | 6 +++---
 2 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/gcc/jit/docs/topics/expressions.rst b/gcc/jit/docs/topics/expressions.rst
index 9267b6d2ad6..d51264af73f 100644
--- a/gcc/jit/docs/topics/expressions.rst
+++ b/gcc/jit/docs/topics/expressions.rst
@@ -24,7 +24,7 @@ Rvalues
 ---
 .. type:: gcc_jit_rvalue
 
-A :c:type:`gcc_jit_rvalue *` is an expression that can be computed.
+A :c:type:`gcc_jit_rvalue` is an expression that can be computed.
 
 It can be simple, e.g.:
 
@@ -602,7 +602,7 @@ Function calls
   gcc_jit_rvalue_set_bool_require_tail_call (gcc_jit_rvalue *call,\
  int require_tail_call)
 
-   Given an :c:type:`gcc_jit_rvalue *` for a call created through
+   Given an :c:type:`gcc_jit_rvalue` for a call created through
:c:func:`gcc_jit_context_new_call` or
:c:func:`gcc_jit_context_new_call_through_ptr`, mark/clear the
call as needing tail-call optimization.  The optimizer will
@@ -721,8 +721,8 @@ where the rvalue is computed by reading from the storage area.
 
   #ifdef LIBGCCJIT_HAVE_gcc_jit_lvalue_set_tls_model
 
-.. function:: void
-  gcc_jit_lvalue_set_link_section (gcc_jit_lvalue *lvalue,
+.. function:: void\
+  gcc_jit_lvalue_set_link_section (gcc_jit_lvalue *lvalue,\
const char *section_name)
 
Set the link section of a variable.
diff --git a/gcc/jit/docs/topics/types.rst b/gcc/jit/docs/topics/types.rst
index 9779ad26b6f..c2082c0ef3e 100644
--- a/gcc/jit/docs/topics/types.rst
+++ b/gcc/jit/docs/topics/types.rst
@@ -192,7 +192,7 @@ A compound type analagous to a C `struct`.
 
 A field within a :c:type:`gcc_jit_struct`.
 
-You can model C `struct` types by creating :c:type:`gcc_jit_struct *` and
+You can model C `struct` types by creating :c:type:`gcc_jit_struct` and
 :c:type:`gcc_jit_field` instances, in either order:
 
 * by creating the fields, then the structure.  For example, to model:
@@ -375,7 +375,7 @@ Reflection API
  Given a function type, return its number of parameters.
 
 .. function::  gcc_jit_type *\
-   gcc_jit_function_type_get_param_type (gcc_jit_function_type *function_type,
+   gcc_jit_function_type_get_param_type (gcc_jit_function_type *function_type,\
  size_t index)
 
  Given a function type, return the type of the specified parameter.
@@ -417,7 +417,7 @@ Reflection API
  alignment qualifiers.
 
 .. function::  gcc_jit_field *\
-   gcc_jit_struct_get_field (gcc_jit_struct *struct_type,
+   gcc_jit_struct_get_field (gcc_jit_struct *struct_type,\
  size_t index)
 
  Get a struct field by index.
-- 
2.26.3



[wwwdocs] gcc-12: jit changes

2022-04-01 Thread David Malcolm via Gcc-patches
I've gone ahead and committed the following change to the GCC 12
release notes.

---
 htdocs/gcc-12/changes.html | 35 ++-
 1 file changed, 34 insertions(+), 1 deletion(-)

diff --git a/htdocs/gcc-12/changes.html b/htdocs/gcc-12/changes.html
index 689feeba..f1c36258 100644
--- a/htdocs/gcc-12/changes.html
+++ b/htdocs/gcc-12/changes.html
@@ -380,7 +380,40 @@ a work-in-progress.
 
 
 
-
+libgccjit
+
+
+  The libgccjit API gained 23 new entry points:
+
+  17 new "reflection" entrypoints for querying functions and types (https://gcc.gnu.org/onlinedocs/jit/topics/compatibility.html#libgccjit-abi-16";>LIBGCCJIT_ABI_16)
+  
+  
+   https://gcc.gnu.org/onlinedocs/jit/topics/expressions.html#c.gcc_jit_lvalue_set_tls_model";>gcc_jit_lvalue_set_tls_model
+   for supporting thread-local variables
+   (https://gcc.gnu.org/onlinedocs/jit/topics/compatibility.html#libgccjit-abi-17";>LIBGCCJIT_ABI_17)
+  
+  
+   https://gcc.gnu.org/onlinedocs/jit/topics/expressions.html#c.gcc_jit_lvalue_set_link_section";>gcc_jit_lvalue_set_link_section
+   for setting the link section of global variables, analogous to
+   https://gcc.gnu.org/onlinedocs/gcc/Common-Variable-Attributes.html#index-section-variable-attribute";>__attribute__((section(".section")))
+   (https://gcc.gnu.org/onlinedocs/jit/topics/compatibility.html#libgccjit-abi-18";>LIBGCCJIT_ABI_18)
+  
+  4 new entrypoints for initializing global variables and creating
+   constructors for rvalues
+   (https://gcc.gnu.org/onlinedocs/jit/topics/compatibility.html#libgccjit-abi-19";>LIBGCCJIT_ABI_19)
+  
+
+  
+  libgccjit has gained support for the use of various atomic builtins
+(https://gcc.gnu.org/PR96066";>PR96066,
+https://gcc.gnu.org/PR96067";>PR96067)
+  
+  https://gcc.gnu.org/onlinedocs/jit/topics/expressions.html#c.gcc_jit_context_new_cast";>gcc_jit_context_new_cast
+is now able to handle truncation and extension between different
+integer types
+(https://gcc.gnu.org/PR95498";>PR95498)
+  
+
 
 
 New Targets and Target Specific Improvements
-- 
2.30.2



Re: [PATCH RFC] mips: add TARGET_ZERO_CALL_USED_REGS hook [PR104817, PR104820]

2022-04-01 Thread Maciej W. Rozycki
On Sat, 12 Mar 2022, Xi Ruoyao via Gcc-patches wrote:

> I'm now thinking: is there always at least one *GPR* which needs to be
> cleared?  If so, say it is GPR $12 and fcc0 & fcc2 need to be
> cleared; then we can use something like:
> 
> cfc1 $12, $25
> andi $25, 5
> ctc1 $12, $25
> move $12, $0

 There's always $1 ($at) available and we're in a function's epilogue, so 
there should be plenty of dead temporaries available as well.  For legacy 
ISAs you'd need to use the FCSR instead ($31) and two temporaries would be 
required as the condition code bits are located in the upper half.

 FWIW,

  Maciej


[wwwdocs] gcc-12: linkify various options

2022-04-01 Thread David Malcolm via Gcc-patches
I've committed the following patch to the GCC 12 release notes.

---
 htdocs/gcc-12/changes.html | 47 --
 1 file changed, 30 insertions(+), 17 deletions(-)

diff --git a/htdocs/gcc-12/changes.html b/htdocs/gcc-12/changes.html
index f1c36258..5619acff 100644
--- a/htdocs/gcc-12/changes.html
+++ b/htdocs/gcc-12/changes.html
@@ -172,17 +172,20 @@ a work-in-progress.
   the clang language extension was added.
   New warnings:
 
-  -Wbidi-chars warns about potentially misleading UTF-8
+  https://gcc.gnu.org/onlinedocs/gcc/Warning-Options.html#index-Wbidi-chars";>-Wbidi-chars
+   warns about potentially misleading UTF-8
bidirectional control characters.  The default is
-Wbidi-chars=unpaired
(https://gcc.gnu.org/PR103026";>PR103026)
-  -Warray-compare warns about comparisons between two operands of
- array type (https://gcc.gnu.org/PR97573";>PR97573)
+  https://gcc.gnu.org/onlinedocs/gcc/Warning-Options.html#index-Warray-compare";>-Warray-compare
+   warns about comparisons between two operands of
+   array type (https://gcc.gnu.org/PR97573";>PR97573)
 
   
   Enhancements to existing warnings:
 
-  -Wattributes has been extended so that it's
+  https://gcc.gnu.org/onlinedocs/gcc/Warning-Options.html#index-Wattributes";>-Wattributes
+   has been extended so that it's
possible to use -Wno-attributes=ns::attr or
 -Wno-attributes=ns:: to suppress warnings about unknown scoped
attributes (in C++11 and C2X).  Similarly,
@@ -208,7 +211,8 @@ a work-in-progress.
 The #elifdef and #elifndef
 preprocessing directives are now supported.
 The printf and scanf format checking
-with -Wformat now supports the %b format
+  with https://gcc.gnu.org/onlinedocs/gcc/Warning-Options.html#index-Wformat";>-Wformat
+  now supports the %b format
 specified by C2X for binary integers, and the %B
 format recommended by C2X for printf.
   
@@ -263,10 +267,12 @@ a work-in-progress.
   (https://gcc.gnu.org/git/?p=gcc.git;a=commitdiff;h=87c2080b";>git)
   Deduction guides can be declared at class scope
   (https://gcc.gnu.org/PR79501";>PR79501)
-  -Wuninitialized warns about using uninitialized variables in
+  https://gcc.gnu.org/onlinedocs/gcc/Warning-Options.html#index-Wuninitialized";>-Wuninitialized
+warns about using uninitialized variables in
   member initializer lists (https://gcc.gnu.org/PR19808";>PR19808)
   
-  -Wint-in-bool-context is now disabled when instantiating
+  https://gcc.gnu.org/onlinedocs/gcc/Warning-Options.html#index-Wint-in-bool-context";>-Wint-in-bool-context
+is now disabled when instantiating
   a template (https://gcc.gnu.org/git/?p=gcc.git;a=commitdiff;h=3a2b12bc";>git)
   Stricter checking of attributes on friend declarations: if a friend
   declaration has an attribute, that declaration must be a definition.
@@ -279,13 +285,15 @@ a work-in-progress.
   and -Wc++23-extensions.  They are enabled by default
   and can be used to control existing pedwarns about occurences of
   new C++ constructs in code using an old C++ standard dialect.
-  New warning -Wmissing-requires warns about missing
-  requires
+  New warning
+  https://gcc.gnu.org/onlinedocs/gcc/Warning-Options.html#index-Wmissing-requires";>-Wmissing-requires
+  warns about missing requires
   (https://gcc.gnu.org/git/?p=gcc.git;a=commitdiff;h=e18e56c7";>git)
   The existing std::is_constant_evaluated in if
   warning was extended to warn in more cases
   (https://gcc.gnu.org/PR100995";>PR100995)
-  -Waddress has been enhanced so that it now warns about, for
+  https://gcc.gnu.org/onlinedocs/gcc/Warning-Options.html#index-Waddress";>-Waddress
+  has been enhanced so that it now warns about, for
   instance, comparing the address of a nonstatic member function to null
   (https://gcc.gnu.org/PR102103";>PR102103)
   Errors about narrowing are no longer hidden if they occur in system
@@ -307,7 +315,9 @@ a work-in-progress.
   constinit thread_local variables are optimized better
   (https://gcc.gnu.org/PR101786";>PR101786)
   Support for C++17 std::hardware_destructive_interference_size
-  was added, along with the -Winterference-size warning
+  was added, along with the
+  https://gcc.gnu.org/onlinedocs/gcc/Warning-Options.html#index-Winterference-size";>-Winterference-size
+  warning
   (https://gcc.gnu.org/git/?p=gcc.git;a=commitdiff;h=76b75018";>git)
   Many bugs in the CTAD handling have been fixed
   (https://gcc.gnu.org/PR101344";>PR101344,
@@ -618,16 +628,19 @@ a work-in-progress.
 Eliminating uninitialized variables
 
 
-  GCC can now initialize all stack variables implicitly, including
+  GCC can now https://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html#index-ftrivial-auto-var-init";>initialize
 all stack variables implicitly, including
   padding. This is

Re: [PATCH v2] rs6000: Support UN[GL][ET] in rs6000_maybe_emit_maxc_minc [PR105002]

2022-04-01 Thread Segher Boessenkool
Hi!

On Fri, Apr 01, 2022 at 02:27:14PM +0800, Kewen.Lin wrote:
> Commit r12-7687 exposed a missed optimization opportunity in function
> rs6000_maybe_emit_maxc_minc: for now it only considers comparison
> codes GE/GT/LE/LT, but it can support more variants with codes
> UNLT/UNLE/UNGT/UNGE by reversing them into the equivalent ones
> with GE/GT/LE/LT.

You may want to add somewhere (in the comment in the code perhaps?)
that if we see e.g. UNLT it guarantees that we have 4-way condition
codes (LT/GT/EQ/UN), so we do not have to check for fast-math or the
like.  This is always true of course, but it doesn't hurt to remind
the reader :-)
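
A small sketch of the equivalence being relied on, written as plain C++
rather than RTL.  The helper names (ge, unlt, select_unlt,
select_ge_swapped) are made up for this illustration and the code assumes
IEEE semantics, i.e. it is not built with -ffast-math.  UNLT is true when
the operands are unordered or a < b, which is exactly the negation of the
ordered GE, so a select on UNLT can be rewritten as a select on GE with the
two arms swapped.

```cpp
#include <cmath>

// Ordered comparison: false whenever either operand is a NaN.
inline bool ge(double a, double b) { return a >= b; }

// UNLT, "unordered or less than": true if either operand is a NaN or a < b.
inline bool unlt(double a, double b) {
  return std::isunordered(a, b) || a < b;
}

// A select written with UNLT ...
inline double select_unlt(double a, double b) { return unlt(a, b) ? a : b; }

// ... and the same select written with the reversed code GE and the arms
// swapped.  Both pick the same operand for every input, NaNs included.
inline double select_ge_swapped(double a, double b) {
  return ge(a, b) ? b : a;
}
```

The same reasoning pairs UNLE with GT, UNGT with LE and UNGE with LT.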

The PR marker goes here:

PR target/105002
>   * config/rs6000/rs6000.cc (rs6000_maybe_emit_maxc_minc): Support more
>   comparison codes UNLT/UNLE/UNGT/UNGE.

> -  bool max_p = false;
> +  bool max_p;

Please move this to later, since you touch it anyway:

  bool max_p;
>if (code == GE || code == GT)
>  max_p = true;
>else if (code == LE || code == LT)

Okay for trunk with those finishing touches.  Thanks!


Segher


[PATCH] c++: implicit guides should inherit class constraints [PR104873]

2022-04-01 Thread Patrick Palka via Gcc-patches
An implicit guide already inherits the (rewritten) constraints of the
constructor.  Thus it seems natural that the guide must also inherit
the constraints of the class template, since a constructor's constraints
might assume the class's constraints are satisfied, and therefore
checking these two sets of constraints "out of order" may result in hard
errors as in the first testcase below.

This patch makes implicit guides inherit the constraints of the class
template (even for unconstrained constructors, and even for the copy
deduction candidate).

In passing, this patch gives implicit guides a trailing return type
since that's how they're depicted in the standard (e.g.
[over.match.class.deduct]/6); this changes the order of substitution
into implicit guides in a probably negligible way, especially now that
they inherit the class constraints.

The parameter_mapping_equivalent_p change is to avoid an ICE in the last
testcase below (described within), reduced from a cmcstl2 testsuite ICE.

Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look like
the right approach?

PR c++/104873

gcc/cp/ChangeLog:

* constraint.cc (parameter_mapping_equivalent_p): Relax assert
to expect equivalence not identity of template parameters.
* pt.cc (build_deduction_guide): Propagate the class's
constraints to the deduction guide.  Set TYPE_HAS_LATE_RETURN_TYPE
on the function type.

gcc/testsuite/ChangeLog:

* g++.dg/cpp2a/concepts-ctad5.C: New test.
* g++.dg/cpp2a/concepts-ctad6.C: New test.
* g++.dg/cpp2a/concepts-ctad6a.C: New test.
* g++.dg/cpp2a/concepts-ctad7.C: New test.
---
 gcc/cp/constraint.cc |  2 +-
 gcc/cp/pt.cc | 26 ++
 gcc/testsuite/g++.dg/cpp2a/concepts-ctad5.C  | 29 
 gcc/testsuite/g++.dg/cpp2a/concepts-ctad6.C  | 19 +
 gcc/testsuite/g++.dg/cpp2a/concepts-ctad6a.C | 19 +
 gcc/testsuite/g++.dg/cpp2a/concepts-ctad7.C  | 26 ++
 6 files changed, 120 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/g++.dg/cpp2a/concepts-ctad5.C
 create mode 100644 gcc/testsuite/g++.dg/cpp2a/concepts-ctad6.C
 create mode 100644 gcc/testsuite/g++.dg/cpp2a/concepts-ctad6a.C
 create mode 100644 gcc/testsuite/g++.dg/cpp2a/concepts-ctad7.C

diff --git a/gcc/cp/constraint.cc b/gcc/cp/constraint.cc
index 94f6222b436..6cbb182dda2 100644
--- a/gcc/cp/constraint.cc
+++ b/gcc/cp/constraint.cc
@@ -604,7 +604,7 @@ parameter_mapping_equivalent_p (tree t1, tree t2)
   tree map2 = ATOMIC_CONSTR_MAP (t2);
   while (map1 && map2)
 {
-  gcc_checking_assert (TREE_VALUE (map1) == TREE_VALUE (map2));
+  gcc_checking_assert (cp_tree_equal (TREE_VALUE (map1), TREE_VALUE (map2)));
   tree arg1 = TREE_PURPOSE (map1);
   tree arg2 = TREE_PURPOSE (map2);
   if (!template_args_equal (arg1, arg2))
diff --git a/gcc/cp/pt.cc b/gcc/cp/pt.cc
index 75ed9a34018..966e6d90d3a 100644
--- a/gcc/cp/pt.cc
+++ b/gcc/cp/pt.cc
@@ -29261,6 +29261,10 @@ build_deduction_guide (tree type, tree ctor, tree outer_args, tsubst_flags_t com
   /* Discard the 'this' parameter.  */
   fparms = FUNCTION_ARG_CHAIN (ctor);
   fargs = TREE_CHAIN (DECL_ARGUMENTS (ctor));
+  /* The guide's constraints consist of the class template's constraints
+followed by the constructor's rewritten constraints.  We start
+with the constructor's constraints (since we need to rewrite them),
+and prepend the class template's constraints later.  */
   ci = get_constraints (ctor);
   loc = DECL_SOURCE_LOCATION (ctor);
   explicit_p = DECL_NONCONVERTING_P (ctor);
@@ -29362,6 +29366,27 @@ build_deduction_guide (tree type, tree ctor, tree outer_args, tsubst_flags_t com
return error_mark_node;
 }
 
+  /* Prepend the class template's constraints to the constructor's rewritten
+ constraints (if any).  */
+  if (tree class_ci = get_constraints (CLASSTYPE_TI_TEMPLATE (type)))
+{
+  if (outer_args)
+   {
+ /* FIXME: As above.  */
+ ++processing_template_decl;
+ class_ci = tsubst_constraint_info (class_ci, outer_args,
+complain, ctor);
+ --processing_template_decl;
+   }
+  if (ci)
+   ci = build_constraints (combine_constraint_expressions
+   (CI_TEMPLATE_REQS (class_ci),
+CI_TEMPLATE_REQS (ci)),
+   CI_DECLARATOR_REQS (ci));
+  else
+   ci = copy_node (class_ci);
+}
+
   if (!memtmpl)
 {
   /* Copy the parms so we can set DECL_PRIMARY_TEMPLATE.  */
@@ -29371,6 +29396,7 @@ build_deduction_guide (tree type, tree ctor, tree outer_args, tsubst_flags_t com
 }
 
   tree fntype = build_function_type (type, fparms);
+  TYPE_HAS_LATE_RETURN_TYPE (fntype) = true;
   tree ded_fn = build_lang_decl_lo

[PATCH 0/2] avr: Add support AVR-DA and DB series devices

2022-04-01 Thread Joel Holdsworth via Gcc-patches
In 2021, Microchip launched two new series of AVR microcontrollers:
AVR-DA and AVR-DB. This patch-set contains patches to add support for
the full set of both series of devices, by listing the memory layouts in
avr-mcus.def.

There is an open GitHub Pull Request to add support for these devices to
avr-libc here: https://github.com/avrdudes/avr-libc/pull/881

In addition, this patch-set includes a patch to remove non-printable
characters from avr-devices.cc.

Joel Holdsworth (2):
  avr: Added AVR-DA and DB MCU series
  avr: Removed errant control characters

 gcc/config/avr/avr-devices.cc|  2 --
 gcc/config/avr/avr-mcus.def  | 22 ++
 gcc/config/avr/gen-avr-mmcu-specs.cc |  2 +-
 gcc/config/avr/gen-avr-mmcu-texi.cc  |  2 +-
 gcc/doc/avr-mmcu.texi|  6 +++---
 5 files changed, 27 insertions(+), 7 deletions(-)

-- 
2.35.GIT



[PATCH 1/2] avr: Added AVR-DA and DB MCU series

2022-04-01 Thread Joel Holdsworth via Gcc-patches
gcc/
* config/avr/avr-mcus.def: Add device definitions.
* doc/avr-mmcu.texi: Corresponding changes.
* config/avr/gen-avr-mmcu-texi.cc: Add support for avr
  device prefix.
* config/avr/gen-avr-mmcu-specs.cc: Prevent -mmcu=avr* flags
  from leaking into cc1.

Signed-off-by: Joel Holdsworth 
---
 gcc/config/avr/avr-mcus.def  | 22 ++
 gcc/config/avr/gen-avr-mmcu-specs.cc |  2 +-
 gcc/config/avr/gen-avr-mmcu-texi.cc  |  2 +-
 gcc/doc/avr-mmcu.texi|  6 +++---
 4 files changed, 27 insertions(+), 5 deletions(-)

diff --git a/gcc/config/avr/avr-mcus.def b/gcc/config/avr/avr-mcus.def
index 1e12ab30170..fa5e6685227 100644
--- a/gcc/config/avr/avr-mcus.def
+++ b/gcc/config/avr/avr-mcus.def
@@ -306,6 +306,14 @@ AVR_MCU ("atxmega16c4",  ARCH_AVRXMEGA2, AVR_ISA_RMW,  "__AVR_ATxmega16C4__"
 AVR_MCU ("atxmega32a4u", ARCH_AVRXMEGA2, AVR_ISA_RMW,  "__AVR_ATxmega32A4U__", 0x2000, 0x0, 0x9000, 0)
 AVR_MCU ("atxmega32c4",  ARCH_AVRXMEGA2, AVR_ISA_RMW,  "__AVR_ATxmega32C4__",  0x2000, 0x0, 0x9000, 0)
 AVR_MCU ("atxmega32e5",  ARCH_AVRXMEGA2, AVR_ISA_NONE, "__AVR_ATxmega32E5__",  0x2000, 0x0, 0x9000, 0)
+AVR_MCU ("avr64da28",ARCH_AVRXMEGA2, AVR_ISA_NONE, "__AVR_AVR64DA28__",0x6000, 0x0, 0x8000, 0x1)
+AVR_MCU ("avr64da32",ARCH_AVRXMEGA2, AVR_ISA_NONE, "__AVR_AVR64DA32__",0x6000, 0x0, 0x8000, 0x1)
+AVR_MCU ("avr64da48",ARCH_AVRXMEGA2, AVR_ISA_NONE, "__AVR_AVR64DA48__",0x6000, 0x0, 0x8000, 0x1)
+AVR_MCU ("avr64da64",ARCH_AVRXMEGA2, AVR_ISA_NONE, "__AVR_AVR64DA64__",0x6000, 0x0, 0x8000, 0x1)
+AVR_MCU ("avr64db28",ARCH_AVRXMEGA2, AVR_ISA_NONE, "__AVR_AVR64DB28__",0x6000, 0x0, 0x8000, 0x1)
+AVR_MCU ("avr64db32",ARCH_AVRXMEGA2, AVR_ISA_NONE, "__AVR_AVR64DB32__",0x6000, 0x0, 0x8000, 0x1)
+AVR_MCU ("avr64db48",ARCH_AVRXMEGA2, AVR_ISA_NONE, "__AVR_AVR64DB48__",0x6000, 0x0, 0x8000, 0x1)
+AVR_MCU ("avr64db64",ARCH_AVRXMEGA2, AVR_ISA_NONE, "__AVR_AVR64DB64__",0x6000, 0x0, 0x8000, 0x1)
 /* Xmega, Flash + RAM < 64K, flash visible in RAM address space */
 AVR_MCU ("avrxmega3",ARCH_AVRXMEGA3, AVR_ISA_NONE,  NULL,  0x3f00, 0x0, 0x8000, 0)
 AVR_MCU ("attiny202",ARCH_AVRXMEGA3, AVR_ISA_RCALL, "__AVR_ATtiny202__",   0x3f80, 0x0, 0x800,  0x8000)
@@ -342,6 +350,12 @@ AVR_MCU ("atmega3208",   ARCH_AVRXMEGA3, AVR_ISA_NONE,  "__AVR_ATmega3208__"
 AVR_MCU ("atmega3209",   ARCH_AVRXMEGA3, AVR_ISA_NONE,  "__AVR_ATmega3209__",  0x3800, 0x0, 0x8000, 0x4000)
 AVR_MCU ("atmega4808",   ARCH_AVRXMEGA3, AVR_ISA_NONE,  "__AVR_ATmega4808__",  0x2800, 0x0, 0xc000, 0x4000)
 AVR_MCU ("atmega4809",   ARCH_AVRXMEGA3, AVR_ISA_NONE,  "__AVR_ATmega4809__",  0x2800, 0x0, 0xc000, 0x4000)
+AVR_MCU ("avr32da28",ARCH_AVRXMEGA3, AVR_ISA_NONE,  "__AVR_AVR32DA28__",   0x7000, 0x0, 0x8000, 0x8000)
+AVR_MCU ("avr32da32",ARCH_AVRXMEGA3, AVR_ISA_NONE,  "__AVR_AVR32DA32__",   0x7000, 0x0, 0x8000, 0x8000)
+AVR_MCU ("avr32da48",ARCH_AVRXMEGA3, AVR_ISA_NONE,  "__AVR_AVR32DA48__",   0x7000, 0x0, 0x8000, 0x8000)
+AVR_MCU ("avr32db28",ARCH_AVRXMEGA3, AVR_ISA_NONE,  "__AVR_AVR32DB28__",   0x7000, 0x0, 0x8000, 0x8000)
+AVR_MCU ("avr32db32",ARCH_AVRXMEGA3, AVR_ISA_NONE,  "__AVR_AVR32DB32__",   0x7000, 0x0, 0x8000, 0x8000)
+AVR_MCU ("avr32db48",ARCH_AVRXMEGA3, AVR_ISA_NONE,  "__AVR_AVR32DB48__",   0x7000, 0x0, 0x8000, 0x8000)
 /* Xmega, 64K < Flash <= 128K, RAM <= 64K */
 AVR_MCU ("avrxmega4",ARCH_AVRXMEGA4, AVR_ISA_NONE, NULL,   0x2000, 0x0, 0x11000, 0)
 AVR_MCU ("atxmega64a3",  ARCH_AVRXMEGA4, AVR_ISA_NONE, "__AVR_ATxmega64A3__",  0x2000, 0x0, 0x11000, 0)
@@ -352,6 +366,14 @@ AVR_MCU ("atxmega64b1",  ARCH_AVRXMEGA4, AVR_ISA_RMW,  "__AVR_ATxmega64B1__"
 AVR_MCU ("atxmega64b3",  ARCH_AVRXMEGA4, AVR_ISA_RMW,  "__AVR_ATxmega64B3__",  0x2000, 0x0, 0x11000, 0)
 AVR_MCU ("atxmega64c3",  ARCH_AVRXMEGA4, AVR_ISA_RMW,  "__AVR_ATxmega64C3__",  0x2000, 0x0, 0x11000, 0)
 AVR_MCU ("atxmega64d4",  ARCH_AVRXMEGA4, AVR_ISA_NONE, "__AVR_ATxmega64D4__",  0x2000, 0x0, 0x11000, 0)
+AVR_MCU ("avr128da28",   ARCH_AVRXMEGA4, AVR_ISA_NONE, "__AVR_AVR128DA28__",   0x4000, 0x0, 0x8000,  0x2)
+AVR_MCU ("avr128da32",   ARCH_AVRXMEGA4, AVR_ISA_NONE, "__AVR_AVR128DA32__",   0x4000, 0x0, 0x8000,  0x2)
+AVR_MCU ("avr128da48",   ARCH_AVRXMEGA4, AVR_ISA_NONE, "__AVR_AVR128DA48__",   0x4000, 0x0, 0x8000,  0x2)
+AVR_MCU ("avr128da64",   ARCH_AVRXMEGA4, AVR_ISA_NONE, "__AVR_AVR128DA64__",   0x4000, 0x0, 0x8000,  0x2)
+AVR_MCU ("avr128db28",   ARCH_AVRXMEGA4, AVR_ISA_NONE, "__AVR_AVR128DB28__",   0x4000, 0x0, 0x8000,  0x2)
+AVR_MCU ("avr128db32",   ARCH_AVRXMEGA4, AVR_ISA_NONE, "__AVR_AVR128DB32__",   0x4000, 0x0, 0x8000,  0x2)
+AVR_MCU ("avr128db48",   ARCH_AVRXMEGA4, 

[PATCH 2/2] avr: Removed errant control characters

2022-04-01 Thread Joel Holdsworth via Gcc-patches
Signed-off-by: Joel Holdsworth 
---
 gcc/config/avr/avr-devices.cc | 2 --
 1 file changed, 2 deletions(-)

diff --git a/gcc/config/avr/avr-devices.cc b/gcc/config/avr/avr-devices.cc
index aa284217f50..ff6a5441b77 100644
--- a/gcc/config/avr/avr-devices.cc
+++ b/gcc/config/avr/avr-devices.cc
@@ -126,8 +126,6 @@ avr_mcu_types[] =
 };
 
 
-
-
 #ifndef IN_GEN_AVR_MMCU_TEXI
 
 static char*
-- 
2.35.GIT



Re: [PATCH][libgomp, testsuite, nvptx] Limit recursion in declare_target-{1,2}.f90

2022-04-01 Thread Tom de Vries via Gcc-patches

On 4/1/22 14:28, Thomas Schwinge wrote:

> Hi Tom!
>
> On 2022-04-01T13:24:40+0200, Tom de Vries  wrote:
>> When running testcases libgomp.fortran/examples-4/declare_target-{1,2}.f90 on
>> an RTX A2000 (sm_86) with driver 510.60.02 and with GOMP_NVPTX_JIT=-O0 I run
>> into:
>> ...
>> FAIL: libgomp.fortran/examples-4/declare_target-1.f90 -O0 \
>>    -DGOMP_NVPTX_JIT=-O0 execution test
>> FAIL: libgomp.fortran/examples-4/declare_target-2.f90 -O0 \
>>    -DGOMP_NVPTX_JIT=-O0 execution test
>> ...
>>
>> Fix this by further limiting recursion depth in the test-cases for nvptx.
>>
>> Furthermore, make the recursion depth limiting nvptx-specific.
>
> Careful:
>
>> --- a/libgomp/testsuite/libgomp.fortran/examples-4/declare_target-1.f90
>> +++ b/libgomp/testsuite/libgomp.fortran/examples-4/declare_target-1.f90
>> @@ -1,4 +1,16 @@
>>   ! { dg-do run }
>> +! { dg-additional-options "-cpp" }
>> +! Reduced from 25 to 23, otherwise execution runs out of thread stack on
>> +! Nvidia Titan V.
>> +! Reduced from 23 to 22, otherwise execution runs out of thread stack on
>> +! Nvidia T400 (2GB variant), when run with GOMP_NVPTX_JIT=-O0.
>> +! Reduced from 22 to 20, otherwise execution runs out of thread stack on
>> +! Nvidia RTX A2000 (6GB variant), when run with GOMP_NVPTX_JIT=-O0.
>> +! { dg-additional-options "-DREC_DEPTH=20" { target { offload_target_nvptx } } } */
>
> 'offload_target_nvptx' doesn't mean that offloading execution is done on
> nvptx, but rather that we're "*compiling* for offload target nvptx"
> (emphasis mine).  That means, with such a change we're now getting
> different behavior in a system with an AMD GPU, when using a toolchain
> that only has GCN offloading configured vs. a toolchain that has GCN and
> nvptx offloading configured.  This isn't going to cause any real
> problems, of course, but it's confusing, and a bad example of
> 'offload_target_nvptx'.
>
> 'offload_device_nvptx' ought to work: "using nvptx offload device".



Thanks for pointing that out.

I tried to understand this multiple offloading configuration a bit, and 
came up with the following mental model: it's possible to have a host 
with say an nvptx and amd offloading device, and then configure and 
build a toolchain that can generate a single executable that can offload 
to either device, depending on the value of appropriate openacc/openmp 
environment variables.


So, in principle the libgomp testsuite could have a mode in which it 
does that: run the same executable twice, once for each offloading 
device.  In that case, even using offload_device_nvptx would not be 
accurate enough, and we'd need to test for offload device type at 
runtime, as used to be done in 
libgomp/testsuite/libgomp.fortran/task-detach-6.f90.


I've tried to copy that setup to 
libgomp/testsuite/libgomp.fortran/examples-4/declare_target-1.f90, but 
that doesn't seem to work anymore.  I've also tried copying that 
test-case to 
libgomp/testsuite/libgomp.fortran/copy-of-declare_target-1.f90 to rule 
out any subdir-related problems, but no luck there either.


Attached is that copy approach, could you try it out and see if it works 
for you?


Do you perhaps have an idea why it's failing?

I can make a patch using offload_device_nvptx, but I'd prefer to 
understand first why the approach above isn't working.


Thanks,
- Tom

[libgomp/testsuite] Add libgomp.fortran/copy-of-declare_target-1.f90

---
 .../libgomp.fortran/copy-of-declare_target-1.f90   | 49 ++
 1 file changed, 49 insertions(+)

diff --git a/libgomp/testsuite/libgomp.fortran/copy-of-declare_target-1.f90 b/libgomp/testsuite/libgomp.fortran/copy-of-declare_target-1.f90
new file mode 100644
index 000..6dcf5312070
--- /dev/null
+++ b/libgomp/testsuite/libgomp.fortran/copy-of-declare_target-1.f90
@@ -0,0 +1,49 @@
+! { dg-do run }
+! { dg-additional-sources on_device_arch.c }
+
+module e_53_1_mod
+  integer :: THRESHOLD = 20
+contains
+  integer recursive function fib (n) result (f)
+!$omp declare target
+integer :: n
+if (n <= 0) then
+  f = 0
+else if (n == 1) then
+  f = 1
+else
+  f = fib (n - 1) + fib (n - 2)
+end if
+  end function
+
+  integer function fib_wrapper (n)
+integer :: x
+!$omp target map(to: n) map(from: x) if(n > THRESHOLD)
+  x = fib (n)
+!$omp end target
+fib_wrapper = x
+  end function
+end module
+
+program e_53_1
+  use e_53_1_mod, only : fib, fib_wrapper
+  integer :: REC_DEPTH = 25
+
+  interface
+integer function on_device_arch_nvptx() bind(C)
+end function on_device_arch_nvptx
+  end interface
+
+  if (on_device_arch_nvptx () /= 0) then
+ ! Reduced from 25 to 23, otherwise execution runs out of thread stack on
+ ! Nvidia Titan V.
+ ! Reduced from 23 to 22, otherwise execution runs out of thread stack on
+ ! Nvidia T400 (2GB variant), when run with GOMP_NVPTX_JIT=-O0.
+ ! Reduced from 22 to 20, otherwise execution runs out of thread stack on
+ ! Nvidia RTX A2000 (6GB variant), when run with GOMP_NVPTX_JIT=-O0.
+ REC_DEPTH = 20
+  end if
+
+  if (fib (15

Re: [PATCH][libgomp, testsuite, nvptx] Limit recursion in declare_target-{1,2}.f90

2022-04-01 Thread Jakub Jelinek via Gcc-patches
On Fri, Apr 01, 2022 at 05:34:50PM +0200, Tom de Vries wrote:
> Do you perhaps have an idea why it's failing?

Because you call on_device_arch_nvptx () outside of
!$omp target region, so unless the host device is NVPTX,
it will not be true.

> +program e_53_1
> +  use e_53_1_mod, only : fib, fib_wrapper
> +  integer :: REC_DEPTH = 25
> +
> +  interface
> +integer function on_device_arch_nvptx() bind(C)
> +end function on_device_arch_nvptx
> +  end interface
> +
> +  if (on_device_arch_nvptx () /= 0) then
> + ! Reduced from 25 to 23, otherwise execution runs out of thread stack on
> + ! Nvidia Titan V.
> + ! Reduced from 23 to 22, otherwise execution runs out of thread stack on
> + ! Nvidia T400 (2GB variant), when run with GOMP_NVPTX_JIT=-O0.
> + ! Reduced from 22 to 20, otherwise execution runs out of thread stack on
> + ! Nvidia RTX A2000 (6GB variant), when run with GOMP_NVPTX_JIT=-O0.
> + REC_DEPTH = 20
> +  end if
> +
> +  if (fib (15) /= fib_wrapper (15)) stop 1
> +  if (fib (REC_DEPTH) /= fib_wrapper (REC_DEPTH)) stop 2
> +end program

Jakub



Re: [PATCH][libgomp, testsuite, nvptx] Limit recursion in declare_target-{1,2}.f90

2022-04-01 Thread Tom de Vries via Gcc-patches

On 4/1/22 17:38, Jakub Jelinek wrote:

On Fri, Apr 01, 2022 at 05:34:50PM +0200, Tom de Vries wrote:

Do you perhaps have an idea why it's failing?


Because you call on_device_arch_nvptx () outside of
!$omp target region, so unless the host device is NVPTX,
it will not be true.



That bit does work because on_device_arch_nvptx calls on_device_arch 
which contains the omp target bit:

...
static int
on_device_arch (int d)
{
  int d_cur;
  #pragma omp target map(from:d_cur)
  d_cur = device_arch ();

  return d_cur == d;
}

int
on_device_arch_nvptx ()
{
  return on_device_arch (GOMP_DEVICE_NVIDIA_PTX);
}
...

So I realized that I didn't do a good job of specifying the problem I 
encountered, and went looking at it, at which point I realized the error 
message had changed, and knew how to fix it ... So, my apologies, some 
confusion on my part.


Anyway, attached patch avoids any nvptx-related tcl directives (just for 
one test-case for now).  To me, this seems the most robust solution.
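
For readers who want the shape of the run-time approach without the Fortran/C
split, here is a rough C++ sketch.  It is not the contents of
on_device_arch.h: DeviceArch, device_arch, on_device_arch and pick_rec_depth
are names made up for illustration, and the device_arch stub always reports
the host, so the depth stays at 25 unless a device-specific variant is
supplied.

```cpp
#include <cassert>

// Hypothetical device kinds; the real helper compares GOMP_DEVICE_* values.
enum class DeviceArch { Host, Nvptx, Gcn };

// Stand-in for device_arch (): this stub always reports the host.  In the
// testsuite an nvptx-specific variant reports the NVPTX device instead.
static DeviceArch device_arch() { return DeviceArch::Host; }

// Ask which architecture the target region actually executes on.
// The query itself sits inside the target region, so the caller does not
// need to be in one.  Without -fopenmp the pragma is compiled out.
static bool on_device_arch(DeviceArch wanted)
{
  DeviceArch cur = DeviceArch::Host;
#ifdef _OPENMP
  #pragma omp target map(from: cur)
#endif
  cur = device_arch();
  return cur == wanted;
}

// Pick the recursion depth at run time instead of via a dg- directive.
static int pick_rec_depth()
{
  int rec_depth = 25;
  if (on_device_arch(DeviceArch::Nvptx))
    rec_depth = 20;  // smaller depth to stay within the device thread stack
  assert(rec_depth == 20 || rec_depth == 25);
  return rec_depth;
}
```

The point of the design is that the decision follows the device the region
really ran on, not the set of offload targets the toolchain was configured
with.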


Is this approach acceptable?

Thanks,
- Tom
[libgomp/testsuite] Fix libgomp.fortran/examples-4/declare_target-1.f90

---
 .../examples-4/declare_target-1.f90| 31 +-
 .../libgomp.fortran/examples-4/on_device_arch.c|  3 +++
 2 files changed, 22 insertions(+), 12 deletions(-)

diff --git a/libgomp/testsuite/libgomp.fortran/examples-4/declare_target-1.f90 b/libgomp/testsuite/libgomp.fortran/examples-4/declare_target-1.f90
index 03c5c53ed67..acded20f756 100644
--- a/libgomp/testsuite/libgomp.fortran/examples-4/declare_target-1.f90
+++ b/libgomp/testsuite/libgomp.fortran/examples-4/declare_target-1.f90
@@ -1,16 +1,6 @@
 ! { dg-do run }
-! { dg-additional-options "-cpp" }
-! Reduced from 25 to 23, otherwise execution runs out of thread stack on
-! Nvidia Titan V.
-! Reduced from 23 to 22, otherwise execution runs out of thread stack on
-! Nvidia T400 (2GB variant), when run with GOMP_NVPTX_JIT=-O0.
-! Reduced from 22 to 20, otherwise execution runs out of thread stack on
-! Nvidia RTX A2000 (6GB variant), when run with GOMP_NVPTX_JIT=-O0.
-! { dg-additional-options "-DREC_DEPTH=20" { target { offload_target_nvptx } } } */
-
-#ifndef REC_DEPTH
-#define REC_DEPTH 25
-#endif
+! { dg-additional-sources on_device_arch.c }
+! { dg-prune-output "command-line option '-fintrinsic-modules-path=.*' is valid for Fortran but not for C" }
 
 module e_53_1_mod
   integer :: THRESHOLD = 20
@@ -38,6 +28,23 @@ end module
 
 program e_53_1
   use e_53_1_mod, only : fib, fib_wrapper
+  integer :: REC_DEPTH = 25
+
+  interface
+integer function on_device_arch_nvptx() bind(C)
+end function on_device_arch_nvptx
+  end interface
+
+  if (on_device_arch_nvptx () /= 0) then
+ ! Reduced from 25 to 23, otherwise execution runs out of thread stack on
+ ! Nvidia Titan V.
+ ! Reduced from 23 to 22, otherwise execution runs out of thread stack on
+ ! Nvidia T400 (2GB variant), when run with GOMP_NVPTX_JIT=-O0.
+ ! Reduced from 22 to 20, otherwise execution runs out of thread stack on
+ ! Nvidia RTX A2000 (6GB variant), when run with GOMP_NVPTX_JIT=-O0.
+ REC_DEPTH = 20
+  end if
+
   if (fib (15) /= fib_wrapper (15)) stop 1
   if (fib (REC_DEPTH) /= fib_wrapper (REC_DEPTH)) stop 2
 end program
diff --git a/libgomp/testsuite/libgomp.fortran/examples-4/on_device_arch.c b/libgomp/testsuite/libgomp.fortran/examples-4/on_device_arch.c
new file mode 100644
index 000..f8bef19e021
--- /dev/null
+++ b/libgomp/testsuite/libgomp.fortran/examples-4/on_device_arch.c
@@ -0,0 +1,3 @@
+/* Auxiliar file.  */
+/* { dg-do compile  { target skip-all-targets } } */
+#include "../../libgomp.c-c++-common/on_device_arch.h"


Re: [committed] jit: further doc fixes

2022-04-01 Thread Eric Gallager via Gcc-patches
On Fri, Apr 1, 2022 at 9:28 AM David Malcolm via Gcc-patches
 wrote:
>
> Further jit doc fixes, which fix links to
> gcc_jit_function_type_get_param_type and gcc_jit_struct_get_field.
>
> I also regenerated libgccjit.texi (not included in the diff below).
>
> Tested with "make html" and with a bootstrap.

 Could you test with `make pdf` and `make dvi` too, to see if this fixes 102824?
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102824

> Committed to trunk as r12-7959-g1a172da8a3f362.
>
> gcc/jit/ChangeLog:
> * docs/topics/expressions.rst: Fix formatting.
> * docs/topics/types.rst: Likewise.
> * docs/_build/texinfo/libgccjit.texi: Regenerate
>
> Signed-off-by: David Malcolm 
> ---
>  gcc/jit/docs/topics/expressions.rst | 8 
>  gcc/jit/docs/topics/types.rst   | 6 +++---
>  2 files changed, 7 insertions(+), 7 deletions(-)
>
> diff --git a/gcc/jit/docs/topics/expressions.rst 
> b/gcc/jit/docs/topics/expressions.rst
> index 9267b6d2ad6..d51264af73f 100644
> --- a/gcc/jit/docs/topics/expressions.rst
> +++ b/gcc/jit/docs/topics/expressions.rst
> @@ -24,7 +24,7 @@ Rvalues
>  ---
>  .. type:: gcc_jit_rvalue
>
> -A :c:type:`gcc_jit_rvalue *` is an expression that can be computed.
> +A :c:type:`gcc_jit_rvalue` is an expression that can be computed.
>
>  It can be simple, e.g.:
>
> @@ -602,7 +602,7 @@ Function calls
>gcc_jit_rvalue_set_bool_require_tail_call (gcc_jit_rvalue 
> *call,\
>   int 
> require_tail_call)
>
> -   Given an :c:type:`gcc_jit_rvalue *` for a call created through
> +   Given an :c:type:`gcc_jit_rvalue` for a call created through
> :c:func:`gcc_jit_context_new_call` or
> :c:func:`gcc_jit_context_new_call_through_ptr`, mark/clear the
> call as needing tail-call optimization.  The optimizer will
> @@ -721,8 +721,8 @@ where the rvalue is computed by reading from the storage 
> area.
>
>#ifdef LIBGCCJIT_HAVE_gcc_jit_lvalue_set_tls_model
>
> -.. function:: void
> -  gcc_jit_lvalue_set_link_section (gcc_jit_lvalue *lvalue,
> +.. function:: void\
> +  gcc_jit_lvalue_set_link_section (gcc_jit_lvalue *lvalue,\
> const char *section_name)
>
> Set the link section of a variable.
> diff --git a/gcc/jit/docs/topics/types.rst b/gcc/jit/docs/topics/types.rst
> index 9779ad26b6f..c2082c0ef3e 100644
> --- a/gcc/jit/docs/topics/types.rst
> +++ b/gcc/jit/docs/topics/types.rst
> @@ -192,7 +192,7 @@ A compound type analagous to a C `struct`.
>
>  A field within a :c:type:`gcc_jit_struct`.
>
> -You can model C `struct` types by creating :c:type:`gcc_jit_struct *` and
> +You can model C `struct` types by creating :c:type:`gcc_jit_struct` and
>  :c:type:`gcc_jit_field` instances, in either order:
>
>  * by creating the fields, then the structure.  For example, to model:
> @@ -375,7 +375,7 @@ Reflection API
>   Given a function type, return its number of parameters.
>
>  .. function::  gcc_jit_type *\
> -   gcc_jit_function_type_get_param_type (gcc_jit_function_type 
> *function_type,
> +   gcc_jit_function_type_get_param_type (gcc_jit_function_type 
> *function_type,\
>   size_t index)
>
>   Given a function type, return the type of the specified parameter.
> @@ -417,7 +417,7 @@ Reflection API
>   alignment qualifiers.
>
>  .. function::  gcc_jit_field *\
> -   gcc_jit_struct_get_field (gcc_jit_struct *struct_type,
> +   gcc_jit_struct_get_field (gcc_jit_struct *struct_type,\
>   size_t index)
>
>   Get a struct field by index.
> --
> 2.26.3
>


[wwwdocs PATCH] document zero-width field ABI changes on MIPS

2022-04-01 Thread Xi Ruoyao via Gcc-patches
Document PR102024 change (r12-7961 and 7962) for MIPS.  Ok for wwwdocs?

--

 htdocs/gcc-12/changes.html | 17 -
 1 file changed, 16 insertions(+), 1 deletion(-)

diff --git a/htdocs/gcc-12/changes.html b/htdocs/gcc-12/changes.html
index 4e1f6b0f..a2d8156f 100644
--- a/htdocs/gcc-12/changes.html
+++ b/htdocs/gcc-12/changes.html
@@ -50,6 +50,10 @@ a work-in-progress.
 (so there is a C++ ABI incompatibility, GCC 4.4 and earlier compatible
 with GCC 12 or later, incompatible with GCC 4.5 through GCC 11).
 RISC-V has changed the handling of these already starting with GCC 10.
+As the ABI requires, MIPS takes them into account handling function
+return values so there is a C++ ABI incompatibility with GCC 4.5
+through 11.  For function arguments on MIPS, refer to
+the MIPS specific entry.
 GCC 12 on the above targets will report such incompatibilities as
 warnings or other diagnostics unless -Wno-psabi is used.
   
@@ -549,7 +553,18 @@ a work-in-progress.
   
 
 
-
+MIPS
+
+  The ABI passing arguments
+  containing zero-width fields (for example, C/C++ zero-width
+  bit-fields, GNU C/C++ zero-length arrays, and GNU C empty structs)
+  has changed.  Now a zero-width field will not prevent an aligned
+  64-bit floating-point field next to it from being passed through
+  FPR.  This is compatible with LLVM, but incompatible with previous
+  GCC releases. GCC 12 on MIPS will report such incompatibilities as
+  an inform unless -Wno-psabi is used.
+  
+
 
 
 
-- 
2.35.1




[PATCH] Replace UNSPEC with RTL code for extendditi2.

2022-04-01 Thread Michael Meissner via Gcc-patches
Replace UNSPEC with RTL code for extendditi2.

When I submitted my patch on March 12th for extendditi2, Segher wished I
had removed the use of the UNSPEC for the vextsd2q instruction.  This
patch rewrites extendditi2_vector to use VEC_SELECT rather than UNSPEC.

I have built a power10 little endian toolchain, power9 little endian toolchain,
and a power8 big endian toolchain.  There were no regressions with this
change.  Is it ok to commit to the master branch?  I don't see the need to back
port the change, but I can certainly do so if desired.

2022-03-31   Michael Meissner  

gcc/
* config/rs6000/vsx.md (UNSPEC_EXTENDDITI2): Delete.
(extendditi2_vector): Rewrite to use VEC_SELECT as a
define_expand.
(extendditi2_vector2): New insn.
---
 gcc/config/rs6000/vsx.md | 22 ++
 1 file changed, 18 insertions(+), 4 deletions(-)

diff --git a/gcc/config/rs6000/vsx.md b/gcc/config/rs6000/vsx.md
index a1a1ce95195..c091e5e2f47 100644
--- a/gcc/config/rs6000/vsx.md
+++ b/gcc/config/rs6000/vsx.md
@@ -358,7 +358,6 @@ (define_c_enum "unspec"
UNSPEC_VSX_FIRST_MISMATCH_EOS_INDEX
UNSPEC_XXGENPCV
UNSPEC_MTVSBM
-   UNSPEC_EXTENDDITI2
UNSPEC_VCNTMB
UNSPEC_VEXPAND
UNSPEC_VEXTRACT
@@ -5083,10 +5082,25 @@ (define_insn_and_split "extendditi2"
(set_attr "type" "shift,load,vecmove,vecperm,load")])
 
 ;; Sign extend 64-bit value in TI reg, word 1, to 128-bit value in TI reg
-(define_insn "extendditi2_vector"
+(define_expand "extendditi2_vector"
+  [(use (match_operand:TI 0 "gpc_reg_operand"))
+   (use (match_operand:TI 1 "gpc_reg_operand"))]
+  "TARGET_POWER10"
+{
+  rtx dest = operands[0];
+  rtx src_v2di = gen_lowpart (V2DImode, operands[1]);
+  rtx element = GEN_INT (VECTOR_ELEMENT_SCALAR_64BIT);
+
+  emit_insn (gen_extendditi2_vector2 (dest, src_v2di, element));
+  DONE;
+})
+
+(define_insn "extendditi2_vector2"
   [(set (match_operand:TI 0 "gpc_reg_operand" "=v")
-   (unspec:TI [(match_operand:TI 1 "gpc_reg_operand" "v")]
-UNSPEC_EXTENDDITI2))]
+   (sign_extend:TI
+(vec_select:DI
+ (match_operand:V2DI 1 "gpc_reg_operand" "v")
+ (parallel [(match_operand 2 "vsx_scalar_64bit" "wD")])))]
   "TARGET_POWER10"
   "vextsd2q %0,%1"
   [(set_attr "type" "vecexts")])
-- 
2.35.1


-- 
Michael Meissner, IBM
PO Box 98, Ayer, Massachusetts, USA, 01432
email: meiss...@linux.ibm.com


Re: [PATCH 2/4] Make vsx_splat__reg use correct insn attributes, PR target/99293

2022-04-01 Thread Segher Boessenkool
On Wed, Mar 30, 2022 at 06:41:59PM -0400, Michael Meissner wrote:
> On Mon, Mar 28, 2022 at 03:28:39PM -0500, Segher Boessenkool wrote:
> > On Mon, Mar 28, 2022 at 12:27:05PM -0400, Michael Meissner wrote:
> > > In looking at PR target/99293, I noticed that the code in the insn
> > > vsx_splat__reg used "vecmove" as the "type" insn attribute when the
> > > "mtvsrdd" is generated.  It should use "mfvsr".  I also added a "p9v" isa
> > > attribute for that alternative.
> > 
> > s/mfvsr/mtvsr/
> > 
> > But, mtvsrd and mtvsrdd have very different scheduling properties (like,
> > on p10 it is 1 cycle vs. 3 cycles).
> 
> I must admit, I assumed vecmove was a stand-in for XXMR (i.e. XXLOR).

That is "veclogical".  I don't think there is any core where this is
optimised specially?

> Since
> its use is used for other cases (mtvsrdd, xxsel/vsel, x{s,v}abs*, x{s,v}nabs*,
> xsiexpq*), it is probably better to just let things lie, and perhaps relook at
> it in the GCC 13 time frame.

Yes, we need to make better categories.  The problem is to come up with
something that is close enough to what the relevant cores actually do,
but in such a way that we do not end up with gazillions of nonsensical
separate instruction types.

What we care about most for p9 and p10 vector insns is whether something
is a 3-cycle op or not.  But this differs per core, and in ways that
are a little ad-hoc (looked at from far away anyway).

For the integer insns we ended up with extra attributes (not just
"type"), which is both compact and expressive.  We should try to do
something like that for vector ops as well.  We now have both p9 and
p10, with two implementations it should be clearer what a good direction
to take will be here.

> > Also, there are two insn patterns for mtvsrdd, and you are only touching
> > one here.
> 
> I think you meant that comment about the third patch (to vsx_extract_)
> and not to this patch (to vsx_splat__reg) where there are only two
> alternatives (the first being xxpermdi and the second being mtvsrdd).

I mean vsx_concat_ and vsx_splat__reg.  Both have mtvsrdd
(both as alternative 1), but you only update the "type" of the latter
here.

> > > --- a/gcc/config/rs6000/vsx.md
> > > +++ b/gcc/config/rs6000/vsx.md
> > > @@ -4580,7 +4580,8 @@ (define_insn "vsx_splat__reg"
> > >"@
> > > xxpermdi %x0,%x1,%x1,0
> > > mtvsrdd %x0,%1,%1"
> > > -  [(set_attr "type" "vecperm,vecmove")])
> > > +  [(set_attr "type" "vecperm,mtvsr")
> > > +   (set_attr "isa" "*,p9v")])
> > 
> > "we" requires "p9v".  Please do a full conversion when getting rid of
> > this?  That includes requiring TARGET_POWERPC64 for it (not -m64 as its
> > documentation says; the existing implementation of "we" is correct).
> 
> That is more complex, and likely it should be a GCC 13 thing.

Yes.

> Off the top of
> my head, we would need a new "isa" variant (p9v64) that combines p9v and
> 64-bit.

Not at all no.  Things that *use* the "isa" attribute can use other
attributes as well, if they want.  The reason we have "p9v" is because
it is so common that a shorthand helps (and *all* p9 vector insns need
either it or separate stuff).

> Originally, I had changed the "we" to "wa", but then I realized it
> wouldn't work for 32-bit, but I left in setting the alternative.

Yeah, when I got rid of many of the w* things I left mostly the harder
ones for later.  Sorry!


Segher


Re: [PATCH] c++: implicit guides should inherit class constraints [PR104873]

2022-04-01 Thread Jason Merrill via Gcc-patches

On 4/1/22 11:17, Patrick Palka wrote:

An implicit guide already inherits the (rewritten) constraints of the
constructor.  Thus it seems natural that the guide must also inherit
the constraints of the class template, since a constructor's constraints
might assume the class's constraints are satisfied, and therefore
checking these two sets of constraints "out of order" may result in hard
errors as in the first testcase below.



This patch makes implicit guides inherit the constraints of the class
template (even for unconstrained constructors, and even for the copy
deduction candidate).

In passing, this patch gives implicit guides a trailing return type
since that's how they're depicted in the standard (e.g.
[over.match.class.deduct]/6); this changes the order of substitution
into implicit guides in a probably negligible way, especially now that
they inherit the class constraints.

The parameter_mapping_equivalent_p change is to avoid an ICE in the last
testcase below (described within), reduced from a cmcstl2 testsuite ICE.

Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look like
the right approach?


I don't think so, given the testcases below.

Maybe fn_type_unification should check formation of the return type of a 
deduction guide before constraints?  In general, whichever order you do 
things in, it'll be wrong for some testcase or other.


The broader subject of constraints and deduction guides should be raised 
with CWG in general (https://github.com/cplusplus/CWG/issues/new/choose)



PR c++/104873

gcc/cp/ChangeLog:

* constraint.cc (parameter_mapping_equivalent_p): Relax assert
to expect equivalence not identity of template parameters.
* pt.cc (build_deduction_guide): Propagate the class's
constraints to the deduction guide.  Set TYPE_HAS_LATE_RETURN_TYPE
on the function type.

gcc/testsuite/ChangeLog:

* g++.dg/cpp2a/concepts-ctad5.C: New test.
* g++.dg/cpp2a/concepts-ctad6.C: New test.
* g++.dg/cpp2a/concepts-ctad6a.C: New test.
* g++.dg/cpp2a/concepts-ctad7.C: New test.
---
  gcc/cp/constraint.cc |  2 +-
  gcc/cp/pt.cc | 26 ++
  gcc/testsuite/g++.dg/cpp2a/concepts-ctad5.C  | 29 
  gcc/testsuite/g++.dg/cpp2a/concepts-ctad6.C  | 19 +
  gcc/testsuite/g++.dg/cpp2a/concepts-ctad6a.C | 19 +
  gcc/testsuite/g++.dg/cpp2a/concepts-ctad7.C  | 26 ++
  6 files changed, 120 insertions(+), 1 deletion(-)
  create mode 100644 gcc/testsuite/g++.dg/cpp2a/concepts-ctad5.C
  create mode 100644 gcc/testsuite/g++.dg/cpp2a/concepts-ctad6.C
  create mode 100644 gcc/testsuite/g++.dg/cpp2a/concepts-ctad6a.C
  create mode 100644 gcc/testsuite/g++.dg/cpp2a/concepts-ctad7.C

diff --git a/gcc/cp/constraint.cc b/gcc/cp/constraint.cc
index 94f6222b436..6cbb182dda2 100644
--- a/gcc/cp/constraint.cc
+++ b/gcc/cp/constraint.cc
@@ -604,7 +604,7 @@ parameter_mapping_equivalent_p (tree t1, tree t2)
tree map2 = ATOMIC_CONSTR_MAP (t2);
while (map1 && map2)
  {
-  gcc_checking_assert (TREE_VALUE (map1) == TREE_VALUE (map2));
+  gcc_checking_assert (cp_tree_equal (TREE_VALUE (map1), TREE_VALUE 
(map2)));
tree arg1 = TREE_PURPOSE (map1);
tree arg2 = TREE_PURPOSE (map2);
if (!template_args_equal (arg1, arg2))
diff --git a/gcc/cp/pt.cc b/gcc/cp/pt.cc
index 75ed9a34018..966e6d90d3a 100644
--- a/gcc/cp/pt.cc
+++ b/gcc/cp/pt.cc
@@ -29261,6 +29261,10 @@ build_deduction_guide (tree type, tree ctor, tree 
outer_args, tsubst_flags_t com
/* Discard the 'this' parameter.  */
fparms = FUNCTION_ARG_CHAIN (ctor);
fargs = TREE_CHAIN (DECL_ARGUMENTS (ctor));
+  /* The guide's constraints consist of the class template's constraints
+followed by the constructor's rewritten constraints.  We start
+with the constructor's constraints (since we need to rewrite them),
+and prepend the class template's constraints later.  */
ci = get_constraints (ctor);
loc = DECL_SOURCE_LOCATION (ctor);
explicit_p = DECL_NONCONVERTING_P (ctor);
@@ -29362,6 +29366,27 @@ build_deduction_guide (tree type, tree ctor, tree 
outer_args, tsubst_flags_t com
return error_mark_node;
  }
  
+  /* Prepend the class template's constraints to the constructor's rewritten

+ constraints (if any).  */
+  if (tree class_ci = get_constraints (CLASSTYPE_TI_TEMPLATE (type)))
+{
+  if (outer_args)
+   {
+ /* FIXME: As above.  */
+ ++processing_template_decl;
+ class_ci = tsubst_constraint_info (class_ci, outer_args,
+complain, ctor);
+ --processing_template_decl;
+   }
+  if (ci)
+   ci = build_constraints (combine_constraint_expressions
+   (CI_TEMPLATE_REQS (class_ci),
+CI_TEMPLA

Re: [PATCH] c-family: Tweak -Woverflow diagnostic

2022-04-01 Thread Jason Merrill via Gcc-patches

On 3/30/22 18:28, Marek Polacek wrote:

When g++ emits

warning: overflow in conversion from 'int' to 'char' changes value from '300' 
to '',''

for code like "char c = 300;" it might raise a few eyebrows.  With this
warning we're not interested in the ASCII representation of the char, only
the numerical value, so convert constants of type char to int.  It looks
like this conversion only needs to be done for char_type_node.
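
To make the "numerical value vs. ASCII representation" point concrete, here
is a small C++ sketch.  It only illustrates the arithmetic and the two ways
of printing the result; it is not the c-warn.cc change.  The helper names
(wrapped_value, as_char_literal, as_number) are invented for this example,
and it assumes 8-bit chars and an ASCII execution character set.

```cpp
#include <cassert>
#include <string>

// Value an 8-bit unsigned char holds after converting 300 (300 mod 256).
static int wrapped_value()
{
  unsigned char uc = static_cast<unsigned char>(300);
  return uc;  // integer promotion yields the numeric value
}

// Render a char constant the way a character literal reads, e.g. ','.
static std::string as_char_literal(int value)
{
  assert(value >= 0 && value < 128);
  return std::string("'") + static_cast<char>(value) + "'";
}

// Render the same constant numerically, which is what the warning is after.
static std::string as_number(int value)
{
  return std::to_string(value);
}
```

Here wrapped_value () is 44, which reads as ',' when shown as a character
constant and as 44 when shown as an integer; the latter is the more useful
form in a message about a changed value.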

Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?  I'm also happy
to defer this to GCC 13.


OK for stage 1.


gcc/c-family/ChangeLog:

* c-warn.cc (warnings_for_convert_and_check): Convert constants of type
char to int.

gcc/testsuite/ChangeLog:

* c-c++-common/Wconversion-1.c: New test.
---
  gcc/c-family/c-warn.cc | 16 +++-
  gcc/testsuite/c-c++-common/Wconversion-1.c | 14 ++
  2 files changed, 25 insertions(+), 5 deletions(-)
  create mode 100644 gcc/testsuite/c-c++-common/Wconversion-1.c

diff --git a/gcc/c-family/c-warn.cc b/gcc/c-family/c-warn.cc
index f24ac5d0539..cae89294aea 100644
--- a/gcc/c-family/c-warn.cc
+++ b/gcc/c-family/c-warn.cc
@@ -1404,8 +1404,14 @@ warnings_for_convert_and_check (location_t loc, tree 
type, tree expr,
  result = TREE_OPERAND (result, 1);
  
bool cst = TREE_CODE_CLASS (TREE_CODE (result)) == tcc_constant;

-
tree exprtype = TREE_TYPE (expr);
+  tree result_diag;
+  /* We're interested in the actual numerical value here, not its ASCII
+ representation.  */
+  if (cst && TYPE_MAIN_VARIANT (TREE_TYPE (result)) == char_type_node)
+result_diag = fold_convert (integer_type_node, result);
+  else
+result_diag = result;
  
if (TREE_CODE (expr) == INTEGER_CST

&& (TREE_CODE (type) == INTEGER_TYPE
@@ -1430,7 +1436,7 @@ warnings_for_convert_and_check (location_t loc, tree 
type, tree expr,
  "changes value from %qE to %qE")
 : G_("unsigned conversion from %qT to %qT "
  "changes value from %qE to %qE")),
-   exprtype, type, expr, result);
+   exprtype, type, expr, result_diag);
  else
warning_at (loc, OPT_Woverflow,
(TYPE_UNSIGNED (exprtype)
@@ -1449,7 +1455,7 @@ warnings_for_convert_and_check (location_t loc, tree 
type, tree expr,
warning_at (loc, OPT_Woverflow,
"overflow in conversion from %qT to %qT "
"changes value from %qE to %qE",
-   exprtype, type, expr, result);
+   exprtype, type, expr, result_diag);
  else
warning_at (loc, OPT_Woverflow,
"overflow in conversion from %qT to %qT "
@@ -1466,7 +1472,7 @@ warnings_for_convert_and_check (location_t loc, tree 
type, tree expr,
warning_at (loc, OPT_Woverflow,
"overflow in conversion from %qT to %qT "
"changes value from %qE to %qE",
-   exprtype, type, expr, result);
+   exprtype, type, expr, result_diag);
  else
warning_at (loc, OPT_Woverflow,
"overflow in conversion from %qT to %qT "
@@ -1483,7 +1489,7 @@ warnings_for_convert_and_check (location_t loc, tree 
type, tree expr,
warning_at (loc, OPT_Woverflow,
"overflow in conversion from %qT to %qT "
"changes value from %qE to %qE",
-   exprtype, type, expr, result);
+   exprtype, type, expr, result_diag);
else
warning_at (loc, OPT_Woverflow,
"overflow in conversion from %qT to %qT "
diff --git a/gcc/testsuite/c-c++-common/Wconversion-1.c 
b/gcc/testsuite/c-c++-common/Wconversion-1.c
new file mode 100644
index 000..ed65918c70f
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/Wconversion-1.c
@@ -0,0 +1,14 @@
+/* { dg-do compile } */
+/* { dg-options "-Wconversion" } */
+
+typedef char T;
+
+void g()
+{
+  char c = 300; /* { dg-warning "conversion from .int. to .char. changes value from 
.300. to .44." } */
+  T t = 300; /* { dg-warning "conversion from .int. to .T. {aka .char.} changes 
value from .300. to .44." } */
+  signed char sc = 300; /* { dg-warning "conversion from .int. to .signed char. 
changes value from .300. to .44." } */
+  unsigned char uc = 300; /* { dg-warning "conversion from .int. to .unsigned char. 
changes value from .300. to .44." } */
+  unsigned char uc2 = 300u; /* { dg-warning "conversion from .unsigned int. to 
.unsigned char. changes value from .300. to .44." } */
+  char c2 = (double)1.0 + 200; /* { dg-warning "overflow in conversion from 
.double. to .char. changes value from .2.01e\\+2. to .127." } */
+}

base-commit: b4e4b35f4ebe561826489bed971324efc99c5423




Re: [PATCH] c++: deduction for dependent class type of NTTP [PR105110]

2022-04-01 Thread Jason Merrill via Gcc-patches

On 3/30/22 17:51, Patrick Palka wrote:

Here deduction for the P/A pair V/a spuriously fails with

   types ‘A’ and ‘const A’ have incompatible cv-qualifiers

because the argument type is const, whereas the parameter type is
non-const.

Since the type of an NTTP is always cv-unqualified, it seems natural to
ignore cv-qualifiers on the argument type before attempting to unify the
two types.
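
As a small illustration of the "type of an NTTP is always cv-unqualified"
premise, here is a C++17 sketch using a scalar `auto` parameter.  It is
deliberately not the class-type case from the PR (that needs C++20), and
`holder` and `answer` are names invented for the example.

```cpp
#include <type_traits>

// A template whose non-type parameter has its type deduced from the argument.
template<auto V>
struct holder
{
  using value_type = decltype(V);
};

// The argument expression below has type 'const int'.
constexpr int answer = 42;

// The argument is const-qualified...
static_assert(std::is_same_v<decltype(answer), const int>);
// ...but the template parameter itself has the unqualified type 'int'.
static_assert(std::is_same_v<holder<answer>::value_type, int>);
```

Both static_asserts hold, so a unifier that compares the parameter's type
against the still-qualified argument type would see a mismatch that the
language does not intend.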

Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK for
trunk?


OK.


PR c++/105110

gcc/cp/ChangeLog:

* pt.cc (unify) : Ignore cv-quals on
the argument type of an NTTP before deducing from it.

gcc/testsuite/ChangeLog:

* g++.dg/cpp2a/nontype-class52.C: New test.
---
  gcc/cp/pt.cc |  5 +++--
  gcc/testsuite/g++.dg/cpp2a/nontype-class52.C | 13 +
  2 files changed, 16 insertions(+), 2 deletions(-)
  create mode 100644 gcc/testsuite/g++.dg/cpp2a/nontype-class52.C

diff --git a/gcc/cp/pt.cc b/gcc/cp/pt.cc
index 1acb5990c5c..cdd75d3b6ac 100644
--- a/gcc/cp/pt.cc
+++ b/gcc/cp/pt.cc
@@ -24271,8 +24271,9 @@ unify (tree tparms, tree targs, tree parm, tree arg, 
int strict,
  && !(strict & UNIFY_ALLOW_INTEGER)
  && TEMPLATE_PARM_LEVEL (parm) <= TMPL_ARGS_DEPTH (targs))
{
- /* Deduce it from the non-type argument.  */
- tree atype = TREE_TYPE (arg);
+ /* Deduce it from the non-type argument.  As above, ignore
+top-level quals here too.  */
+ tree atype = cv_unqualified (TREE_TYPE (arg));
  RECUR_AND_CHECK_FAILURE (tparms, targs,
   tparm, atype,
   UNIFY_ALLOW_NONE, explain_p);
diff --git a/gcc/testsuite/g++.dg/cpp2a/nontype-class52.C 
b/gcc/testsuite/g++.dg/cpp2a/nontype-class52.C
new file mode 100644
index 000..56163376afb
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp2a/nontype-class52.C
@@ -0,0 +1,13 @@
+// PR c++/105110
+// { dg-do compile { target c++20 } }
+
+template struct A { };
+
+template struct B { };
+
+template V> void f(B);
+
+int main() {
+  constexpr A a;
+  f(B{});
+}




Re: [patch]update the documentation for TARGET_ZERO_CALL_USED_REGS hook and add an assertion

2022-04-01 Thread Qing Zhao via Gcc-patches
FYI. 

I have committed the change to upstream as:

31933f4f788b6cd64cbb7ee42076997f6d0fe212

Qing
> On Mar 31, 2022, at 8:10 AM, Richard Sandiford  
> wrote:
> 
> Qing Zhao  writes:
>> Hi, 
>> 
>> Per our discussion on: 
>> https://gcc.gnu.org/pipermail/gcc-patches/2022-March/592002.html
>> 
>> I come up with the following patch to:
>> 
>> 1. Update the documentation for TARGET_ZERO_CALL_USED_REGS hook;
>> 2. Add an assertion in function.cc to make sure the actually zeroed_regs is 
>> a subset of all call used regs;
>>   (The reason I didn’t add a new parameter to TARGET_ZERO_CALL_USED_REGS is, 
>> I think adding the 
>>assertion in the common place function.cc is simpler to be implemented).
> 
> Yeah, that's fair.  I guess in theory, passing the parameter would allow
> targets to choose between two versions of zeroing the register, one with
> a temporary and one without.  But that's a purely hypothetical situation
> and we could always add a parameter later if that turns out to be useful.
> 
> Perhaps more realistically, there might be other uses of the hook in
> future that want to zero registers for different reasons, with their
> own rules about which registers can be zeroed.  In other words, the
> hook is providing a general facility that happens to be useful for
> -fzero-call-used-regs.  But again, we can deal with that if it ever
> happens.
> 
> So I agree this is the right call, especially for stage 4.
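
The subset check under discussion can be sketched in C++ as follows.  This
is only an illustration under simplifying assumptions: GCC uses HARD_REG_SET
and its own helpers, not std::bitset, and kNumRegs, RegSet, is_subset,
zero_regs and checked_zero_regs are all made-up names.

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>

// Hypothetical register file size, chosen only for the example.
constexpr std::size_t kNumRegs = 16;
using RegSet = std::bitset<kNumRegs>;

// True when every register in 'zeroed' is also in 'call_used'.
static bool is_subset(const RegSet &zeroed, const RegSet &call_used)
{
  return (zeroed & ~call_used).none();
}

// Stand-in for a target hook: returns the registers it actually cleared.
// This stub clears exactly what it was asked to clear.
static RegSet zero_regs(const RegSet &need_zeroed)
{
  return need_zeroed;
}

// Common-code wrapper: call the hook, then verify what it reported.
static RegSet checked_zero_regs(const RegSet &need_zeroed,
                                const RegSet &call_used)
{
  RegSet zeroed = zero_regs(need_zeroed);
  assert(is_subset(zeroed, call_used));
  return zeroed;
}
```

Doing the check once in common code means a target that reports registers
outside the call-used set trips the assertion without each hook needing an
extra parameter.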
> 
>> 3. This new assertion identified a bug in i386 implementation. Fix this bug 
>> in i386.
>> 
>> This patch is bootstrapped on both x86 and aarch64, no regression.
>> 
>> Okay for commit?
> 
> OK for the non-x86 bits.
> 
> Thanks,
> Richard
> 
>> thanks.
>> 
>> Qing
>> 
>> ===
>> From 2e5bc1b25a707c6a17afbf03da2a8bec5b03454d Mon Sep 17 00:00:00 2001
>> From: Qing Zhao 
>> Date: Fri, 18 Mar 2022 20:49:56 +
>> Subject: [PATCH] Add an assertion: the zeroed_hardregs set is a subset of all
>> call used regs.
>> 
>> We should make sure that the hard register set that is actually cleared by
>> the target hook zero_call_used_regs should be a subset of all call used
>> registers.
>> 
>> At the same time, update documentation for the target hook
>> TARGET_ZERO_CALL_USED_REGS.
>> 
>> This new assertion identified a bug in the i386 implementation, which
>> incorrectly set the zeroed_hardregs for stack registers. Fixed this bug
>> in i386 implementation.
>> 
>> gcc/ChangeLog:
>> 
>> 2022-03-21  Qing Zhao  
>> 
>>  * config/i386/i386.cc (zero_all_st_registers): Return the value of
>>  num_of_st.
>>  (ix86_zero_call_used_regs): Update zeroed_hardregs set according to
>>  the return value of zero_all_st_registers.
>>  * doc/tm.texi: Update the documentation of TARGET_ZERO_CALL_USED_REGS.
>>  * function.cc (gen_call_used_regs_seq): Add an assertion.
>>  * target.def: Update the documentation of TARGET_ZERO_CALL_USED_REGS.
>> ---
>> gcc/config/i386/i386.cc | 27 ++-
>> gcc/doc/tm.texi |  7 +++
>> gcc/function.cc | 22 ++
>> gcc/target.def  |  7 +++
>> 4 files changed, 50 insertions(+), 13 deletions(-)
>> 
>> diff --git a/gcc/config/i386/i386.cc b/gcc/config/i386/i386.cc
>> index 5a561966eb44..d84047a4bc1b 100644
>> --- a/gcc/config/i386/i386.cc
>> +++ b/gcc/config/i386/i386.cc
>> @@ -3753,16 +3753,17 @@ zero_all_vector_registers (HARD_REG_SET 
>> need_zeroed_hardregs)
>>needs to be cleared, the whole stack should be cleared.  However,
>>x87 stack registers that hold the return value should be excluded.
>>x87 returns in the top (two for complex values) register, so
>> -   num_of_st should be 7/6 when x87 returns, otherwise it will be 8.  */
>> +   num_of_st should be 7/6 when x87 returns, otherwise it will be 8.
>> +   return the value of num_of_st.  */
>> 
>> 
>> -static bool
>> +static int
>> zero_all_st_registers (HARD_REG_SET need_zeroed_hardregs)
>> {
>> 
>>   /* If the FPU is disabled, no need to zero all st registers.  */
>>   if (! (TARGET_80387 || TARGET_FLOAT_RETURNS_IN_80387))
>> -return false;
>> +return 0;
>> 
>>   unsigned int num_of_st = 0;
>>   for (unsigned int regno = 0; regno < FIRST_PSEUDO_REGISTER; regno++)
>> @@ -3774,7 +3775,7 @@ zero_all_st_registers (HARD_REG_SET 
>> need_zeroed_hardregs)
>>   }
>> 
>>   if (num_of_st == 0)
>> -return false;
>> +return 0;
>> 
>>   bool return_with_x87 = false;
>>   return_with_x87 = (crtl->return_rtx
>> @@ -3802,7 +3803,7 @@ zero_all_st_registers (HARD_REG_SET 
>> need_zeroed_hardregs)
>>   insn = emit_insn (gen_rtx_SET (st_reg, st_reg));
>>   add_reg_note (insn, REG_DEAD, st_reg);
>> }
>> -  return true;
>> +  return num_of_st;
>> }
>> 
>> 
>> @@ -3851,7 +3852,7 @@ ix86_zero_call_used_regs (HARD_REG_SET 
>> need_zeroed_hardregs)
>> {
>>   HARD_REG_SET zeroed_hardregs;
>>   bool all_sse_zeroed = false;
>> -  bool all_st_zeroed = false;
>> +  int all_st_zeroed_num = 0;
>>   
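
As a rough illustration of the invariant the quoted commit message describes
(the set of registers a target reports as zeroed must be contained in the set
of call-used registers), here is a minimal standalone sketch. It does not use
GCC's HARD_REG_SET; RegSet, kNumHardRegs, is_subset and check_zeroed_regs are
hypothetical names chosen for the example.

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a hard register set: one bit per register.
// The size is an assumption made for this sketch only.
constexpr std::size_t kNumHardRegs = 64;
using RegSet = std::bitset<kNumHardRegs>;

// Return true when every register in SUB is also a member of SUPER.
bool is_subset(const RegSet &sub, const RegSet &super)
{
  return (sub & ~super).none();
}

// Model of the check: abort if the reported zeroed set contains a
// register that is not call-used.
void check_zeroed_regs(const RegSet &zeroed, const RegSet &call_used)
{
  assert(is_subset(zeroed, call_used));
}
```

A target hook that reports a register outside the call-used set would trip
the assertion, which is how a mismatch like the one in the x87 stack register
handling becomes visible.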

Re: [PATCH] c++: Fix ICE due to shared BLOCK node in coroutine generation [PR103328]

2022-04-01 Thread Jason Merrill via Gcc-patches

On 3/30/22 09:06, Benno Evers via Gcc-patches wrote:

From: Benno Evers 

When finishing a function that is a coroutine, the function is
transformed into a "ramp" function, and the original user-provided
function body gets moved into a newly created "actor" function.

In this case `current_function_decl` points to the ramp function,
but `current_binding_level->blocks` would still point to the
scope block of the user-provided function body in the actor function,
so when the ramp function was finished during `poplevel()` in decl.cc,
we could end up with that block being reused as the `DECL_INITIAL()` of
the ramp function:

 subblocks = functionbody >= 0 ? current_binding_level->blocks : 0;
 // [...]
 DECL_INITIAL (current_function_decl) = block ? block : subblocks;

This block would then be independently modified by subsequent passes
touching either the ramp or the actor function, potentially causing
an ICE depending on the order and function of these passes.

gcc/cp/ChangeLog:

 PR c++/103328
 * coroutines.cc (morph_fn_to_coro): Reset
   current_binding_level->blocks.

gcc/testsuite/ChangeLog:

 PR c++/103328
 * g++.dg/coroutines/pr103328.C: New test.

Co-Authored-By: Iain Sandoe 


Looks like you also need a DCO sign-off; see

https://gcc.gnu.org/contribute.html#legal

for more information.


---
  gcc/cp/coroutines.cc   |  3 ++
  gcc/testsuite/g++.dg/coroutines/pr103328.C | 32 ++
  2 files changed, 35 insertions(+)
  create mode 100644 gcc/testsuite/g++.dg/coroutines/pr103328.C

diff --git a/gcc/cp/coroutines.cc b/gcc/cp/coroutines.cc
index 23dc28271a4..ece30c905e8 100644
--- a/gcc/cp/coroutines.cc
+++ b/gcc/cp/coroutines.cc
@@ -4541,6 +4541,9 @@ morph_fn_to_coro (tree orig, tree *resumer, tree
*destroyer)


gmail is breaking your patch with word wrap; see

https://www.kernel.org/doc/html/v4.17/process/email-clients.html

for information about ways to work around this, or just use an attachment.


BLOCK_VARS (top_block) = BIND_EXPR_VARS (ramp_bind);
BLOCK_SUBBLOCKS (top_block) = NULL_TREE;

+  /* Reset the current binding level to the ramp function */
+  current_binding_level->blocks = top_block;
+
/* The decl_expr for the coro frame pointer, initialize to zero so that we
   can pass it to the IFN_CO_FRAME (since there's no way to pass a type,
   directly apparently).  This avoids a "used uninitialized" warning.  */
diff --git a/gcc/testsuite/g++.dg/coroutines/pr103328.C
b/gcc/testsuite/g++.dg/coroutines/pr103328.C
new file mode 100644
index 000..56fb54ab316
--- /dev/null
+++ b/gcc/testsuite/g++.dg/coroutines/pr103328.C
@@ -0,0 +1,32 @@
+// { dg-additional-options "-g" }
+
+#include <coroutine>
+
+struct task {
+  struct promise_type {
+task get_return_object() { return {}; }
+std::suspend_never initial_suspend() { return {}; }
+std::suspend_never final_suspend() noexcept { return {}; }
+void unhandled_exception() {}
+  };
+  bool await_ready() { return false; }
+  void await_suspend(std::coroutine_handle<> h) {}
+  void await_resume() {}
+};
+
+template <typename Func>
+void call(Func func) { func(); }
+
+class foo {
+  void f();
+  task g();
+};
+
+void foo::f() {
+  auto lambda = [this]() noexcept -> task {
+  co_await g();
+  };
+  (void)call<decltype(lambda)>;
+}
+
+int main() {}




[PATCH] c, c++: attribute format on a ctor with a vbase [PR101833, PR47634]

2022-04-01 Thread Marek Polacek via Gcc-patches
Attribute format takes three arguments: archetype, string-index, and
first-to-check.  The last two specify the position in the function
parameter list.  r63030 clarified that "Since non-static C++ methods have
an implicit this argument, the arguments of such methods should be counted
from two, not one, when giving values for string-index and first-to-check."
Therefore one has to write

  struct D {
D(const char *, ...) __attribute__((format(printf, 2, 3)));
  };
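
A minimal sketch of the counting rule quoted above, contrasting a non-static
member function with a free function.  Logger, msg, format_msg and vformat are
hypothetical names; only the attribute syntax comes from the text.

```cpp
#include <cassert>
#include <cstdarg>
#include <cstdio>
#include <string>

// Shared helper: format into a fixed buffer (longer output is truncated).
static std::string vformat(const char *fmt, va_list ap)
{
  char buf[128];
  std::vsnprintf(buf, sizeof buf, fmt, ap);
  return std::string(buf);
}

struct Logger {
  // Parameter 1 is the implicit `this`, so the format string is
  // parameter 2 and the first variadic argument is parameter 3.
  std::string msg(const char *fmt, ...)
    __attribute__((format(printf, 2, 3)));
};

std::string Logger::msg(const char *fmt, ...)
{
  va_list ap;
  va_start(ap, fmt);
  std::string s = vformat(fmt, ap);
  va_end(ap);
  return s;
}

// Free function: no implicit parameter, so counting starts at 1.
std::string format_msg(const char *fmt, ...)
  __attribute__((format(printf, 1, 2)));

std::string format_msg(const char *fmt, ...)
{
  va_list ap;
  va_start(ap, fmt);
  std::string s = vformat(fmt, ap);
  va_end(ap);
  return s;
}
```

Both declarations describe the same user-visible signature; only the
positions given to the attribute differ.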

However -- and this is the problem in this PR -- ctors with virtual
bases also get two additional parameters: the in-charge parameter and
the VTT parameter (added in maybe_retrofit_in_chrg).  In fact we'll end up
with two clones of the ctor: an in-charge and a not-in-charge version (see
build_cdtor_clones).  That means that the argument position the user
specified in the attribute argument will refer to different arguments,
depending on which constructor we're currently dealing with.  This can
cause a range of problems: wrong errors, confusing warnings, or crashes.

This patch corrects that; for C we don't have to do anything, and in C++
we can use num_artificial_parms_for.  It would be wrong to rewrite the
attributes the user supplied, so I've added an extra parameter called
adjust_pos.

Attribute format_arg is not affected, because it requires that the
function returns "const char *" which will never be the case for cdtors.

Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?

PR c++/101833
PR c++/47634

gcc/c-family/ChangeLog:

* c-attribs.cc (positional_argument): Add new argument adjust_pos,
use it.
* c-common.cc (check_function_arguments): Pass fndecl to
check_function_format.
* c-common.h (check_function_format): Adjust declaration.
(maybe_adjust_arg_pos_for_attribute): Add.
(positional_argument): Adjust declaration.
* c-format.cc (decode_format_attr): Add fndecl argument.  Pass it to
maybe_adjust_arg_pos_for_attribute.  Adjust calls to get_constant.
(handle_format_arg_attribute): Pass 0 to get_constant.
(get_constant): Add new argument adjust_pos, use it.
(check_function_format): Add fndecl argument.  Pass it to
decode_format_attr.
(handle_format_attribute): Get the fndecl from node[2].  Pass it to
decode_format_attr.

gcc/c/ChangeLog:

* c-objc-common.cc (maybe_adjust_arg_pos_for_attribute): New.

gcc/cp/ChangeLog:

* tree.cc (maybe_adjust_arg_pos_for_attribute): New.

gcc/testsuite/ChangeLog:

* g++.dg/ext/attr-format-arg1.C: New test.
* g++.dg/ext/attr-format1.C: New test.
* g++.dg/ext/attr-format2.C: New test.
* g++.dg/ext/attr-format3.C: New test.
---
 gcc/c-family/c-attribs.cc   | 14 ---
 gcc/c-family/c-common.cc|  4 +-
 gcc/c-family/c-common.h |  5 ++-
 gcc/c-family/c-format.cc| 46 +
 gcc/c/c-objc-common.cc  |  9 
 gcc/cp/tree.cc  | 19 +
 gcc/testsuite/g++.dg/ext/attr-format-arg1.C | 26 
 gcc/testsuite/g++.dg/ext/attr-format1.C | 32 ++
 gcc/testsuite/g++.dg/ext/attr-format2.C | 38 +
 gcc/testsuite/g++.dg/ext/attr-format3.C | 15 +++
 10 files changed, 182 insertions(+), 26 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/ext/attr-format-arg1.C
 create mode 100644 gcc/testsuite/g++.dg/ext/attr-format1.C
 create mode 100644 gcc/testsuite/g++.dg/ext/attr-format2.C
 create mode 100644 gcc/testsuite/g++.dg/ext/attr-format3.C

diff --git a/gcc/c-family/c-attribs.cc b/gcc/c-family/c-attribs.cc
index 111a33f405a..6e17847ec9e 100644
--- a/gcc/c-family/c-attribs.cc
+++ b/gcc/c-family/c-attribs.cc
@@ -599,12 +599,15 @@ attribute_takes_identifier_p (const_tree attr_id)
matching all C integral types except bool.  If successful, return
POS after default conversions, if any.  Otherwise, issue appropriate
warnings and return null.  A non-zero 1-based ARGNO should be passed
-   in by callers only for attributes with more than one argument.  */
+   in by callers only for attributes with more than one argument.
+   ADJUST_POS is used and non-zero in C++ when the function type has
+   invisible parameters generated by the compiler, such as the in-charge
+   or VTT parameters.  */
 
 tree
 positional_argument (const_tree fntype, const_tree atname, tree pos,
 tree_code code, int argno /* = 0 */,
-int flags /* = posargflags () */)
+int flags /* = posargflags () */, int adjust_pos /* = 0 */)
 {
   if (pos && TREE_CODE (pos) != IDENTIFIER_NODE
   && TREE_CODE (pos) != FUNCTION_DECL)
@@ -690,7 +693,7 @@ positional_argument (const_tree fntype, const_tree atname, 
tree pos,
   if (!nargs
   || !tree_fits_uhwi_p (pos)
   || ((flags & POSARG_ELLIPSIS) == 0
- && !

-Wformat-overflow handling for %b and %B directives in C2X standard

2022-04-01 Thread Frolov Daniil via Gcc-patches
Hello, I've noticed that -Wformat-overflow doesn't handle %b and %B
directives in the sprintf function. I've added a relevant issue in bugzilla
(bug #105129).
I have attached a patch with a possible solution to this message.
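
For context, the kind of bound the warning needs for a binary directive can
be sketched as follows.  This is a standalone illustration, not the code in
gimple-ssa-sprintf.cc: max_digits, binary_length and worst_case_bytes are
hypothetical names, the base-16 rule is my own rounding, and the output length
is counted by hand so the sketch does not depend on libc support for %b.

```cpp
#include <cassert>
#include <climits>

// Upper bound on the number of digits needed to print a PREC-bit
// unsigned value in BASE.  Only bases 2, 8 and 16 are handled here.
int max_digits(int prec, int base)
{
  switch (base)
    {
    case 2:
      return prec;            // one digit per bit
    case 8:
      return (prec + 2) / 3;  // three bits per digit, rounded up
    case 16:
      return (prec + 3) / 4;  // four bits per digit, rounded up
    default:
      return -1;              // not covered by this sketch
    }
}

// Number of characters "%b" (or "%#b" when ALT is set) would produce
// for V.  The "0b" prefix is only added for nonzero values.
int binary_length(unsigned v, bool alt)
{
  int digits = 1;
  for (unsigned t = v; t > 1; t >>= 1)
    ++digits;
  return digits + ((alt && v != 0) ? 2 : 0);
}

// Bytes a destination buffer needs in the worst case, including the
// terminating NUL.
int worst_case_bytes(int prec, bool alt)
{
  return max_digits(prec, 2) + (alt ? 2 : 0) + 1;
}
```

With a 32-bit unsigned argument, "%#b" can therefore need up to 35 bytes,
which is the sort of figure the warning would compare against the
destination size.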
From 2051344e9500651f6e94c44cbc7820715382b957 Mon Sep 17 00:00:00 2001
From: Frolov Daniil 
Date: Fri, 1 Apr 2022 00:47:03 +0500
Subject: [PATCH] Support %b, %B for -Wformat-overflow (sprintf, snprintf)

testsuite: add tests to check -Wformat-overflow on %b.
Wformat-overflow1.c is compiled with -std=c2x, so the warning must be
issued.

Wformat-overflow2.c doesn't trigger warnings because the c2x standard
isn't used.

gcc/ChangeLog:

	* gimple-ssa-sprintf.cc
(check_std_c2x): New function
	(fmtresult::type_max_digits): add base == 2 handling
	(tree_digits): add handle for base == 2
	(format_integer): now handle %b and %B using base = 2
	(parse_directive): add cases to handle %b and %B directives
	(compute_format_length): add handling for base = 2

gcc/testsuite/ChangeLog:

	* gcc.dg/Wformat-overflow1.c: New test. (using -std=c2x)
	* gcc.dg/Wformat-overflow2.c: New test. (-std=c11 no warning)
---
 gcc/gimple-ssa-sprintf.cc| 42 
 gcc/testsuite/gcc.dg/Wformat-overflow1.c | 28 
 gcc/testsuite/gcc.dg/Wformat-overflow2.c | 16 +
 3 files changed, 79 insertions(+), 7 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/Wformat-overflow1.c
 create mode 100644 gcc/testsuite/gcc.dg/Wformat-overflow2.c

diff --git a/gcc/gimple-ssa-sprintf.cc b/gcc/gimple-ssa-sprintf.cc
index c93f12f90b5..7f68c2b6e51 100644
--- a/gcc/gimple-ssa-sprintf.cc
+++ b/gcc/gimple-ssa-sprintf.cc
@@ -107,6 +107,15 @@ namespace {
 
 static int warn_level;
 
+/* b_overflow_flag depends on the current standard when using gcc */
+static bool b_overflow_flag;
+
+/* Check whether the current standard version is C2X.  */
+static bool check_std_c2x () 
+{
+  return !strcmp (lang_hooks.name, "GNU C2X");
+}
+
 /* The minimum, maximum, likely, and unlikely maximum number of bytes
of output either a formatting function or an individual directive
can result in.  */
@@ -535,6 +544,8 @@ fmtresult::type_max_digits (tree type, int base)
   unsigned prec = TYPE_PRECISION (type);
   switch (base)
 {
+case 2:
+  return prec;
 case 8:
   return (prec + 2) / 3;
 case 10:
@@ -857,11 +868,11 @@ tree_digits (tree x, int base, HOST_WIDE_INT prec, bool plus, bool prefix)
 
   /* Adjust a non-zero value for the base prefix, either hexadecimal,
  or, unless precision has resulted in a leading zero, also octal.  */
-  if (prefix && absval && (base == 16 || prec <= ndigs))
+  if (prefix && absval && (base == 2 || base == 16 || prec <= ndigs))
 {
   if (base == 8)
 	res += 1;
-  else if (base == 16)
+  else if (base == 16 || base == 2) /*0x...(0X...) and 0b...(0B...)*/
 	res += 2;
 }
 
@@ -1229,6 +1240,10 @@ format_integer (const directive &dir, tree arg, pointer_query &ptr_qry)
 case 'u':
   base = 10;
   break;
+case 'b':
+case 'B':
+  base = 2;
+  break;
 case 'o':
   base = 8;
   break;
@@ -1351,10 +1366,10 @@ format_integer (const directive &dir, tree arg, pointer_query &ptr_qry)
 
   /* Bump up the counters if WIDTH is greater than LEN.  */
   res.adjust_for_width_or_precision (dir.width, dirtype, base,
-	 (sign | maybebase) + (base == 16));
+	 (sign | maybebase) + (base == 2 || base == 16));
   /* Bump up the counters again if PRECision is greater still.  */
   res.adjust_for_width_or_precision (dir.prec, dirtype, base,
-	 (sign | maybebase) + (base == 16));
+	 (sign | maybebase) + (base == 2 || base == 16));
 
   return res;
 }
@@ -1503,7 +1518,7 @@ format_integer (const directive &dir, tree arg, pointer_query &ptr_qry)
 	  if (res.range.min == 1)
 	res.range.likely += base == 8 ? 1 : 2;
 	  else if (res.range.min == 2
-		   && base == 16
+		   && (base == 16 || base == 2)
 		   && (dir.width[0] == 2 || dir.prec[0] == 2))
 	++res.range.likely;
 	}
@@ -1511,9 +1526,9 @@ format_integer (const directive &dir, tree arg, pointer_query &ptr_qry)
 
   res.range.unlikely = res.range.max;
   res.adjust_for_width_or_precision (dir.width, dirtype, base,
- (sign | maybebase) + (base == 16));
+ (sign | maybebase) + (base == 2 || base == 16));
   res.adjust_for_width_or_precision (dir.prec, dirtype, base,
- (sign | maybebase) + (base == 16));
+ (sign | maybebase) + (base == 2 || base == 16));
 
   return res;
 }
@@ -3680,6 +3695,8 @@ parse_directive (call_info &info,
   ++pf;
   break;
 }
+  
+  
 
   switch (target_to_host (*pf))
 {
@@ -3713,6 +3730,14 @@ parse_directive (call_info &info,
 case 'X':
   dir.fmtfunc = format_integer;
   break;
+
+case 'b':
+case 'B':
+  if (b_overflow_flag) {
+dir.fmtfunc = format_integer;
+break;
+  }
+  return 0;
 
 case 'p':
   /* The %p output i

[PATCH 0/8][RFC] Support BTF decl_tag and type_tag annotations

2022-04-01 Thread David Faust via Gcc-patches
Hello,

This patch series is a first attempt at adding support for:

- Two new C-language-level attributes that make it possible to associate (to "tag")
  particular declarations and types with arbitrary strings. As explained below,
  this is intended to be used to, for example, characterize certain pointer
  types.

- The conveyance of that information in the DWARF output in the form of a new
  DIE: DW_TAG_GNU_annotation.

- The conveyance of that information in the BTF output in the form of two new
  kinds of BTF objects: BTF_KIND_DECL_TAG and BTF_KIND_TYPE_TAG.

All of these facilities are being added to the eBPF ecosystem, and support for
them exists in some form in LLVM. However, as we shall see, we have found some
problems implementing them so some discussion is in order.

Purpose
===

1)  Addition of C-family language constructs (attributes) to specify free-text
tags on certain language elements, such as struct fields.

The purpose of these annotations is to provide additional information about
types, variables, and function parameters of interest to the kernel. A
driving use case is to tag pointer types within the linux kernel and eBPF
programs with additional semantic information, such as '__user' or '__rcu'.

For example, consider the linux kernel function do_execve with the
following declaration:

  static int do_execve(struct filename *filename,
 const char __user *const __user *__argv,
 const char __user *const __user *__envp);

Here, __user could be defined with these annotations to record semantic
information about the pointer parameters (e.g., they are user-provided) in
DWARF and BTF information. Other kernel facilities such as the eBPF verifier
can read the tags and make use of the information.

2)  Conveying the tags in the generated DWARF debug info.

The main motivation for emitting the tags in DWARF is that the Linux kernel
generates its BTF information via pahole, using DWARF as a source:

    +--------+  BTF                  BTF   +----------+
    | pahole |---> vmlinux.btf ----------->| verifier |
    +--------+                             +----------+
        ^                                       ^
        |                                       |
  DWARF |                                   BTF |
        |                                       |
     vmlinux                             +-------------+
     module1.ko                          | BPF program |
     module2.ko                          +-------------+
       ...

This is because:

a)  Unlike GCC, LLVM will only generate BTF for BPF programs.

b)  GCC can generate BTF for whatever target with -gbtf, but there is no
support for linking/deduplicating BTF in the linker.

In the scenario above, the verifier needs access to the pointer tags of
both the kernel types/declarations (conveyed in the DWARF and translated
to BTF by pahole) and those of the BPF program (available directly in BTF).

Another motivation for having the tag information in DWARF, unrelated to
BPF and BTF, is that the drgn project (another DWARF consumer) also wants
to benefit from these tags in order to differentiate between different
kinds of pointers in the kernel.

3)  Conveying the tags in the generated BTF debug info.

This is easy: the main purpose of having this info in BTF is for the
compiled eBPF programs. The kernel verifier can then access the tags
of pointers used by the eBPF programs.


For more information about these tags and the motivation behind them, please
refer to the following linux kernel discussions:

  https://lore.kernel.org/bpf/20210914223004.244411-1-...@fb.com/
  https://lore.kernel.org/bpf/20211012164838.3345699-1-...@fb.com/
  https://lore.kernel.org/bpf/2022012604.1504583-1-...@fb.com/


What is in this patch series


This patch series adds support for these annotations in GCC. The implementation
is largely complete. However, in some cases the produced debug info (both DWARF
and BTF) differs significantly from that produced by LLVM. This issue is
discussed in detail below, along with a few specific questions for both GCC and
LLVM. Any input would be much appreciated.


Implementation Overview
===

To enable these annotations, two new C language attributes are added:
__attribute__((btf_decl_tag("foo"))) and __attribute__((btf_type_tag("bar"))).
Both attributes accept a single arbitrary string constant argument, which will
be recorded in the generated DWARF and/or BTF debugging information. They have
no effect on code generation.
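
A minimal usage sketch, assuming the attribute spellings given above.  Since
many compilers do not know these attributes, the macros fall back to nothing
when __has_attribute reports them as unsupported.  TAG_USER, TAG_DECL,
request and count_args are hypothetical names invented for this example.

```cpp
#include <cassert>
#include <cstddef>

// Expand to the attributes only when the compiler claims to know them.
#if defined(__has_attribute)
#  if __has_attribute(btf_type_tag)
#    define TAG_USER __attribute__((btf_type_tag("user")))
#  endif
#  if __has_attribute(btf_decl_tag)
#    define TAG_DECL(str) __attribute__((btf_decl_tag(str)))
#  endif
#endif
#ifndef TAG_USER
#  define TAG_USER
#endif
#ifndef TAG_DECL
#  define TAG_DECL(str)
#endif

// A declaration tag on a field and a type tag on a pointee type.
struct request {
  int fd TAG_DECL("fd_owner");
  const char TAG_USER *buf;
};

// Count the entries of a NULL-terminated argument vector whose strings
// are marked as user-provided.  The tag does not change the generated
// code; it is only meant to be recorded in the debug information.
std::size_t count_args(const char TAG_USER *const *argv)
{
  std::size_t n = 0;
  while (argv[n] != nullptr)
    ++n;
  return n;
}
```

Because the tags carry no semantics for code generation, the function behaves
identically whether or not the attributes are recognized.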

Note that we are using the same attribute names as LLVM, which include "btf"
in the name. This may be controversial, as these tags are not really
BTF-specific. A different name may be more appropriate. There was much
discussion about naming in the proposal for the functionali

[PATCH 1/8] dwarf: Add dw_get_die_parent function

2022-04-01 Thread David Faust via Gcc-patches
gcc/

* dwarf2out.cc (dw_get_die_parent): New function.
* dwarf2out.h (dw_get_die_parent): Declare it here.
---
 gcc/dwarf2out.cc | 8 
 gcc/dwarf2out.h  | 1 +
 2 files changed, 9 insertions(+)

diff --git a/gcc/dwarf2out.cc b/gcc/dwarf2out.cc
index 5681b01749a..35322fb5f6e 100644
--- a/gcc/dwarf2out.cc
+++ b/gcc/dwarf2out.cc
@@ -5235,6 +5235,14 @@ dw_get_die_sib (dw_die_ref die)
   return die->die_sib;
 }
 
+/* Return a reference to the parent of a given DIE.  */
+
+dw_die_ref
+dw_get_die_parent (dw_die_ref die)
+{
+  return die->die_parent;
+}
+
 /* Add an address constant attribute value to a DIE.  When using
dwarf_split_debug_info, address attributes in dies destined for the
final executable should be direct references--setting the parameter
diff --git a/gcc/dwarf2out.h b/gcc/dwarf2out.h
index 656ef94afde..e6962fb4848 100644
--- a/gcc/dwarf2out.h
+++ b/gcc/dwarf2out.h
@@ -455,6 +455,7 @@ extern dw_die_ref lookup_type_die (tree);
 
 extern dw_die_ref dw_get_die_child (dw_die_ref);
 extern dw_die_ref dw_get_die_sib (dw_die_ref);
+extern dw_die_ref dw_get_die_parent (dw_die_ref);
 extern enum dwarf_tag dw_get_die_tag (dw_die_ref);
 
 /* Data about a single source file.  */
-- 
2.35.1



[PATCH 2/8] include: Add BTF tag defines to dwarf2 and btf

2022-04-01 Thread David Faust via Gcc-patches
include/

* btf.h: Add BTF_KIND_DECL_TAG and BTF_KIND_TYPE_TAG defines. Update
comments.
(struct btf_decl_tag): New.
* dwarf2.def: Add new DWARF extension DW_TAG_GNU_annotation.
---
 include/btf.h  | 17 +++--
 include/dwarf2.def |  4 
 2 files changed, 19 insertions(+), 2 deletions(-)

diff --git a/include/btf.h b/include/btf.h
index 78b551ced23..37deaef8b48 100644
--- a/include/btf.h
+++ b/include/btf.h
@@ -69,7 +69,7 @@ struct btf_type
 
   /* SIZE is used by INT, ENUM, STRUCT, UNION, DATASEC kinds.
  TYPE is used by PTR, TYPEDEF, VOLATILE, CONST, RESTRICT, FUNC,
- FUNC_PROTO and VAR kinds.  */
+ FUNC_PROTO, VAR and DECL_TAG kinds.  */
   union
   {
 uint32_t size; /* Size of the entire type, in bytes.  */
@@ -109,7 +109,9 @@ struct btf_type
 #define BTF_KIND_VAR   14  /* Variable.  */
 #define BTF_KIND_DATASEC   15  /* Section such as .bss or .data.  */
 #define BTF_KIND_FLOAT 16  /* Floating point.  */
-#define BTF_KIND_MAX   BTF_KIND_FLOAT
+#define BTF_KIND_DECL_TAG  17  /* Decl Tag.  */
+#define BTF_KIND_TYPE_TAG  18  /* Type Tag.  */
+#define BTF_KIND_MAX   BTF_KIND_TYPE_TAG
 #define NR_BTF_KINDS   (BTF_KIND_MAX + 1)
 
 /* For some BTF_KINDs, struct btf_type is immediately followed by
@@ -190,6 +192,17 @@ struct btf_var_secinfo
   uint32_t size;   /* Size (in bytes) of variable.  */
 };
 
+/* BTF_KIND_DECL_TAG is followed by a single struct btf_decl_tag, which
+   describes the tag location:
+   - If component_idx == -1, then the tag is applied to a struct, union,
+ variable or function.
+   - Otherwise it is applied to a struct/union member or function argument
+ with the given index numbered 0..vlen-1.  */
+struct btf_decl_tag
+{
+  int32_t component_idx;
+};
+
 #ifdef __cplusplus
 }
 #endif
diff --git a/include/dwarf2.def b/include/dwarf2.def
index 4214c80907a..e054890130a 100644
--- a/include/dwarf2.def
+++ b/include/dwarf2.def
@@ -174,6 +174,10 @@ DW_TAG (DW_TAG_GNU_formal_parameter_pack, 0x4108)
are properly part of DWARF 5.  */
 DW_TAG (DW_TAG_GNU_call_site, 0x4109)
 DW_TAG (DW_TAG_GNU_call_site_parameter, 0x410a)
+
+/* Extension for BTF annotations.  */
+DW_TAG (DW_TAG_GNU_annotation, 0x6000)
+
 /* Extensions for UPC.  See: http://dwarfstd.org/doc/DWARF4.pdf.  */
 DW_TAG (DW_TAG_upc_shared_type, 0x8765)
 DW_TAG (DW_TAG_upc_strict_type, 0x8766)
-- 
2.35.1
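
The component_idx convention documented in the btf.h comment above can be
sketched as follows.  The struct mirrors the one added by the patch;
describe_decl_tag and its vlen parameter are hypothetical and exist only to
illustrate how a consumer might interpret the field.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Mirrors the struct added to include/btf.h by this patch.
struct btf_decl_tag
{
  std::int32_t component_idx;
};

// Describe what a decl tag applies to, given the number of members or
// arguments (VLEN) of the tagged struct, union or function.
std::string describe_decl_tag(const btf_decl_tag &tag, std::uint32_t vlen)
{
  if (tag.component_idx == -1)
    return "whole declaration";
  if (tag.component_idx < 0
      || static_cast<std::uint32_t>(tag.component_idx) >= vlen)
    return "invalid";
  return "component " + std::to_string(tag.component_idx);
}
```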



[PATCH 3/8] c-family: Add BTF tag attribute handlers

2022-04-01 Thread David Faust via Gcc-patches
This patch adds attribute handlers in GCC for two attributes already
supported in LLVM: "btf_decl_tag" and "btf_type_tag". Both attributes
accept a single string constant argument, and are used to add arbitrary
annotations to debug information generated for the types/decls to which
they apply.

gcc/c-family/

* c-attribs.cc (c_common_attribute_table): Add new attributes
btf_decl_tag and btf_type_tag.
(handle_btf_decl_tag_attribute): New.
(handle_btf_type_tag_attribute): Likewise.
---
 gcc/c-family/c-attribs.cc | 45 +++
 1 file changed, 45 insertions(+)

diff --git a/gcc/c-family/c-attribs.cc b/gcc/c-family/c-attribs.cc
index 111a33f405a..ec52f6defb4 100644
--- a/gcc/c-family/c-attribs.cc
+++ b/gcc/c-family/c-attribs.cc
@@ -174,6 +174,9 @@ static tree handle_signed_bool_precision_attribute (tree *, 
tree, tree, int,
bool *);
 static tree handle_retain_attribute (tree *, tree, tree, int, bool *);
 
+static tree handle_btf_decl_tag_attribute (tree *, tree, tree, int, bool *);
+static tree handle_btf_type_tag_attribute (tree *, tree, tree, int, bool *);
+
 /* Helper to define attribute exclusions.  */
 #define ATTR_EXCL(name, function, type, variable)  \
   { name, function, type, variable }
@@ -555,6 +558,12 @@ const struct attribute_spec c_common_attribute_table[] =
  handle_dealloc_attribute, NULL },
   { "tainted_args",  0, 0, true,  false, false, false,
  handle_tainted_args_attribute, NULL },
+
+  { "btf_type_tag",   1, 1, false, true, false, false,
+ handle_btf_type_tag_attribute, NULL },
+  { "btf_decl_tag",   1, 1, false, false, false, false,
+ handle_btf_decl_tag_attribute, NULL },
+
   { NULL, 0, 0, false, false, false, false, NULL, NULL }
 };
 
@@ -5854,6 +5863,42 @@ handle_tainted_args_attribute (tree *node, tree name, 
tree, int,
   return NULL_TREE;
 }
 
+/* Handle a "btf_decl_tag" attribute; arguments as in
+   struct attribute_spec.handler.   */
+
+static tree
+handle_btf_decl_tag_attribute (tree *, tree name, tree args, int,
+  bool *no_add_attrs)
+{
+  if (!args)
+*no_add_attrs = true;
+  else if (TREE_CODE (TREE_VALUE (args)) != STRING_CST)
+{
+  error ("%qE attribute requires a string", name);
+  *no_add_attrs = true;
+}
+
+  return NULL_TREE;
+}
+
+/* Handle a "btf_type_tag" attribute; arguments as in
+   struct attribute_spec.handler.   */
+
+static tree
+handle_btf_type_tag_attribute (tree *, tree name, tree args, int,
+  bool *no_add_attrs)
+{
+  if (!args)
+*no_add_attrs = true;
+  else if (TREE_CODE (TREE_VALUE (args)) != STRING_CST)
+{
+  error ("%qE attribute requires a string", name);
+  *no_add_attrs = true;
+}
+
+  return NULL_TREE;
+}
+
 /* Attempt to partially validate a single attribute ATTR as if
it were to be applied to an entity OPER.  */
 
-- 
2.35.1



[PATCH 4/8] dwarf: create BTF decl and type tag DIEs

2022-04-01 Thread David Faust via Gcc-patches
The "btf_decl_tag" and "btf_type_tag" attributes are handled by
constructing DW_TAG_GNU_annotation DIEs. The DIEs are children of the
declarations or types which they annotate, and convey the annotation via
a string constant.

Currently, all generation of these DIEs is gated behind
btf_debuginfo_p (). That is, they will not be generated nor output
unless BTF debug information is generated. The DIEs will be output in
DWARF if both -gbtf and -gdwarf are supplied by the user.

gcc/

* dwarf2out.cc (gen_btf_decl_tag_dies): New function.
(gen_btf_type_tag_dies): Likewise.
(modified_type_die): Call them here, if appropriate.
(gen_formal_parameter_die): Likewise.
(gen_typedef_die): Likewise.
(gen_type_die): Likewise.
(gen_decl_die): Likewise.
---
 gcc/dwarf2out.cc | 102 +++
 1 file changed, 102 insertions(+)

diff --git a/gcc/dwarf2out.cc b/gcc/dwarf2out.cc
index 35322fb5f6e..8f59213f96e 100644
--- a/gcc/dwarf2out.cc
+++ b/gcc/dwarf2out.cc
@@ -13612,6 +13612,78 @@ long_double_as_float128 (tree type)
   return NULL_TREE;
 }
 
+/* BTF support. Given a tree T, which may be a decl or a type, process any
+   "btf_decl_tag" attributes on T, provided in ATTR. Construct
+   DW_TAG_GNU_annotation DIEs appropriately as children of TARGET, usually
+   the DIE for T.  */
+
+static void
+gen_btf_decl_tag_dies (tree t, dw_die_ref target)
+{
+  dw_die_ref die;
+  tree attr;
+
+  if (t == NULL_TREE || !target)
+return;
+
+  if (TYPE_P (t))
+attr = lookup_attribute ("btf_decl_tag", TYPE_ATTRIBUTES (t));
+  else if (DECL_P (t))
+attr = lookup_attribute ("btf_decl_tag", DECL_ATTRIBUTES (t));
+  else
+/* This is an error.  */
+gcc_unreachable ();
+
+  while (attr != NULL_TREE)
+{
+  die = new_die (DW_TAG_GNU_annotation, target, t);
+  add_name_attribute (die, IDENTIFIER_POINTER (get_attribute_name (attr)));
+  add_AT_string (die, DW_AT_const_value,
+TREE_STRING_POINTER (TREE_VALUE (TREE_VALUE (attr))));
+  attr = TREE_CHAIN (attr);
+}
+
+  /* Strip the decl tag attribute to avoid creating multiple copies if we hit
+ this tree node again in some recursive call.  */
+  if (TYPE_P (t))
+TYPE_ATTRIBUTES (t) =
+  remove_attribute ("btf_decl_tag", TYPE_ATTRIBUTES (t));
+  else if (DECL_P (t))
+DECL_ATTRIBUTES (t) =
+  remove_attribute ("btf_decl_tag", DECL_ATTRIBUTES (t));
+}
+
+/* BTF support. Given a tree TYPE, process any "btf_type_tag" attributes on
+   TYPE. Construct DW_TAG_GNU_annotation DIEs appropriately as children of
+   TARGET, usually the DIE for TYPE.  */
+
+static void
+gen_btf_type_tag_dies (tree type, dw_die_ref target)
+{
+  dw_die_ref die;
+  tree attr;
+
+  if (type == NULL_TREE || !target)
+return;
+
+  gcc_assert (TYPE_P (type));
+
+  attr = lookup_attribute ("btf_type_tag", TYPE_ATTRIBUTES (type));
+  while (attr != NULL_TREE)
+{
+  die = new_die (DW_TAG_GNU_annotation, target, type);
+  add_name_attribute (die, IDENTIFIER_POINTER (get_attribute_name (attr)));
+  add_AT_string (die, DW_AT_const_value,
+TREE_STRING_POINTER (TREE_VALUE (TREE_VALUE (attr))));
+  attr = TREE_CHAIN (attr);
+}
+
+  /* Strip the type tag attribute to avoid creating multiple copies if we hit
+ this type again in some recursive call.  */
+  TYPE_ATTRIBUTES (type) =
+remove_attribute ("btf_type_tag", TYPE_ATTRIBUTES (type));
+}
+
 /* Given a pointer to an arbitrary ..._TYPE tree node, return a debugging
entry that chains the modifiers specified by CV_QUALS in front of the
given type.  REVERSE is true if the type is to be interpreted in the
@@ -14010,6 +14082,10 @@ modified_type_die (tree type, int cv_quals, bool 
reverse,
   if (TYPE_ARTIFICIAL (type))
 add_AT_flag (mod_type_die, DW_AT_artificial, 1);
 
+  /* BTF support. Handle any "btf_type_tag" attributes on the type.  */
+  if (btf_debuginfo_p ())
+gen_btf_type_tag_dies (type, mod_type_die);
+
   return mod_type_die;
 }
 
@@ -22986,6 +23062,10 @@ gen_formal_parameter_die (tree node, tree origin, bool 
emit_name_p,
   gcc_unreachable ();
 }
 
+  /* BTF Support */
+  if (btf_debuginfo_p ())
+gen_btf_decl_tag_dies (node, parm_die);
+
   return parm_die;
 }
 
@@ -26060,6 +26140,10 @@ gen_typedef_die (tree decl, dw_die_ref context_die)
 
   if (get_AT (type_die, DW_AT_name))
 add_pubtype (decl, type_die);
+
+  /* BTF: handle attribute btf_decl_tag which may appear on the typedef.  */
+  if (btf_debuginfo_p ())
+gen_btf_decl_tag_dies (decl, type_die);
 }
 
 /* Generate a DIE for a struct, class, enum or union type.  */
@@ -26373,6 +26457,20 @@ gen_type_die (tree type, dw_die_ref context_die)
  if (die)
check_die (die);
}
+
+  /* BTF support. Handle any "btf_type_tag" or "btf_decl_tag" attributes
+on the type, constructing annotation DIEs as appropriate.  */
+  if (btf_debuginfo_p

[PATCH 5/8] ctfc: Add support to pass through BTF annotations

2022-04-01 Thread David Faust via Gcc-patches
BTF generation currently relies on the internal CTF representation to
convert debug info from DWARF dies. This patch adds a new internal
header, "ctf-int.h", which defines CTF kinds to be used internally to
represent BTF tags which must pass through the CTF container. It also
adds a new type for representing information specific to those tags, and
a member for that type in ctf_dtdef.

This patch also updates ctf_add_reftype to accept a const char * name,
and add it for the newly added type.

gcc/

* ctf-int.h: New file.
* ctfc.cc (ctf_add_reftype): Add NAME parameter. Pass it to
ctf_add_generic call.
(ctf_add_pointer): Update ctf_add_reftype call accordingly.
* ctfc.h (ctf_add_reftype): Analogous change.
(ctf_btf_annotation): New.
(ctf_dtdef): Add member for it.
(enum ctf_dtu_d_union_enum): Likewise.
* dwarf2ctf.cc (gen_ctf_modifier_type): Update call to
ctf_add_reftype accordingly.
---
 gcc/ctf-int.h| 29 +
 gcc/ctfc.cc  | 11 +++
 gcc/ctfc.h   | 17 ++---
 gcc/dwarf2ctf.cc |  2 +-
 4 files changed, 51 insertions(+), 8 deletions(-)
 create mode 100644 gcc/ctf-int.h

diff --git a/gcc/ctf-int.h b/gcc/ctf-int.h
new file mode 100644
index 000..fb5f4aacad6
--- /dev/null
+++ b/gcc/ctf-int.h
@@ -0,0 +1,29 @@
+/* ctf-int.h - GCC internal definitions used for CTF debug info.
+   Copyright (C) 2022 Free Software Foundation, Inc.
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify it under
+the terms of the GNU General Public License as published by the Free
+Software Foundation; either version 3, or (at your option) any later
+version.
+
+GCC is distributed in the hope that it will be useful, but WITHOUT ANY
+WARRANTY; without even the implied warranty of MERCHANTABILITY or
+FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
+for more details.
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING3.  If not see
+<http://www.gnu.org/licenses/>.  */
+
+#ifndef GCC_CTF_INT_H
+#define GCC_CTF_INT_H 1
+
+/* These CTF kinds only exist as a bridge to generating BTF types for
+   BTF_KIND_DECL_TAG and BTF_KIND_TYPE_TAG. They do not correspond to any
+   representable type kind in CTF.  */
+#define CTF_K_DECL_TAG  62
+#define CTF_K_TYPE_TAG  63
+
+#endif /* GCC_CTF_INT_H */
diff --git a/gcc/ctfc.cc b/gcc/ctfc.cc
index 6fe44d2e8d4..031a6fff65d 100644
--- a/gcc/ctfc.cc
+++ b/gcc/ctfc.cc
@@ -107,6 +107,9 @@ ctf_dtu_d_union_selector (ctf_dtdef_ref ctftype)
   return CTF_DTU_D_ARGUMENTS;
 case CTF_K_SLICE:
   return CTF_DTU_D_SLICE;
+case CTF_K_DECL_TAG:
+case CTF_K_TYPE_TAG:
+  return CTF_DTU_D_BTFNOTE;
 default:
   /* The largest member as default.  */
   return CTF_DTU_D_ARRAY;
@@ -394,15 +397,15 @@ ctf_add_encoded (ctf_container_ref ctfc, uint32_t flag, const char * name,
 }
 
 ctf_id_t
-ctf_add_reftype (ctf_container_ref ctfc, uint32_t flag, ctf_id_t ref,
-uint32_t kind, dw_die_ref die)
+ctf_add_reftype (ctf_container_ref ctfc, uint32_t flag, const char * name,
+ctf_id_t ref, uint32_t kind, dw_die_ref die)
 {
   ctf_dtdef_ref dtd;
   ctf_id_t type;
 
   gcc_assert (ref <= CTF_MAX_TYPE);
 
-  type = ctf_add_generic (ctfc, flag, NULL, &dtd, die);
+  type = ctf_add_generic (ctfc, flag, name, &dtd, die);
   dtd->dtd_data.ctti_info = CTF_TYPE_INFO (kind, flag, 0);
   /* Caller of this API must guarantee that a CTF type with id = ref already
  exists.  This will also be validated for us at link-time.  */
@@ -514,7 +517,7 @@ ctf_id_t
 ctf_add_pointer (ctf_container_ref ctfc, uint32_t flag, ctf_id_t ref,
 dw_die_ref die)
 {
-  return (ctf_add_reftype (ctfc, flag, ref, CTF_K_POINTER, die));
+  return (ctf_add_reftype (ctfc, flag, NULL, ref, CTF_K_POINTER, die));
 }
 
 ctf_id_t
diff --git a/gcc/ctfc.h b/gcc/ctfc.h
index 18c93c802a0..51f43cd01cb 100644
--- a/gcc/ctfc.h
+++ b/gcc/ctfc.h
@@ -35,6 +35,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "dwarf2ctf.h"
 #include "ctf.h"
 #include "btf.h"
+#include "ctf-int.h"
 
 /* Invalid CTF type ID definition.  */
 
@@ -151,6 +152,13 @@ typedef struct GTY (()) ctf_func_arg
 
 #define ctf_farg_list_next(elem) ((ctf_func_arg_t *)((elem)->farg_next))
 
+/* BTF support: a BTF type tag or decl tag.  */
+
+typedef struct GTY (()) ctf_btf_annotation
+{
+  uint32_t component_idx;
+} ctf_btf_annotation_t;
+
 /* Type definition for CTF generation.  */
 
 struct GTY ((for_user)) ctf_dtdef
@@ -173,6 +181,8 @@ struct GTY ((for_user)) ctf_dtdef
 ctf_func_arg_t * GTY ((tag ("CTF_DTU_D_ARGUMENTS"))) dtu_argv;
 /* slice.  */
 ctf_sliceinfo_t GTY ((tag ("CTF_DTU_D_SLICE"))) dtu_slice;
+/* btf annotation.  */
+ctf_btf_annotation_t GTY ((tag ("CTF_DTU_D_BTFNOTE"))) dtu_btfnote;
   } dtd_u;
 };
 
@@ -212,7 +222,8 @@ enum ctf_dtu_d_union

[PATCH 6/8] dwarf2ctf: convert tag DIEs to CTF types

2022-04-01 Thread David Faust via Gcc-patches
This patch makes the DWARF-to-CTF conversion process aware of the new
DW_TAG_GNU_annotation DIEs. The DIEs are converted to CTF_K_DECL_TAG or
CTF_K_TYPE_TAG types as appropriate and added to the compilation unit CTF
container.
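
As a rough illustration of the index argument that the hunks below pass
to handle_btf_tags (a member or argument position, or -1 for the
declaration itself), here is a minimal sketch. It assumes the index is
stored unchanged in the component_idx field added to ctfc.h earlier in
the series; make_annotation is a hypothetical helper, not part of the
patch.

```c
#include <stdint.h>

/* Mirrors the field added to ctfc.h earlier in this series.  */
typedef struct ctf_btf_annotation
{
  uint32_t component_idx;
} ctf_btf_annotation_t;

/* Hypothetical helper (not in the patch): record IDX, where -1 means
   "the declaration itself" and 0..N-1 selects a member or argument.  */
ctf_btf_annotation_t
make_annotation (int idx)
{
  ctf_btf_annotation_t note;
  /* The field is unsigned, so -1 wraps to 0xffffffff.  */
  note.component_idx = (uint32_t) idx;
  return note;
}
```

Under that assumption, a tag on a whole struct or variable would show up
as an all-ones component index in the output, while a tag on the third
member would show up as 2.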

gcc/

* dwarf2ctf.cc (handle_btf_tags): New function.
(gen_ctf_sou_type): Call it here, if appropriate. Don't try to
create member types for children that are not DW_TAG_member.
(gen_ctf_function_type): Call handle_btf_tags if appropriate.
(gen_ctf_variable): Likewise.
(gen_ctf_function): Likewise.
(gen_ctf_type): Likewise.
---
 gcc/dwarf2ctf.cc | 113 ++-
 1 file changed, 112 insertions(+), 1 deletion(-)

diff --git a/gcc/dwarf2ctf.cc b/gcc/dwarf2ctf.cc
index 32495cf4307..8811ec3e878 100644
--- a/gcc/dwarf2ctf.cc
+++ b/gcc/dwarf2ctf.cc
@@ -32,6 +32,12 @@ along with GCC; see the file COPYING3.  If not see
 static ctf_id_t
 gen_ctf_type (ctf_container_ref, dw_die_ref);
 
+static void
+gen_ctf_variable (ctf_container_ref, dw_die_ref);
+
+static void
+handle_btf_tags (ctf_container_ref, dw_die_ref, ctf_id_t, int);
+
 /* All the DIE structures we handle come from the DWARF information
generated by GCC.  However, there are three situations where we need
to create our own created DIE structures because GCC doesn't
@@ -547,6 +553,7 @@ gen_ctf_sou_type (ctf_container_ref ctfc, dw_die_ref sou, uint32_t kind)
   /* Now process the struct members.  */
   {
 dw_die_ref c;
+int idx = 0;
 
 c = dw_get_die_child (sou);
 if (c)
@@ -559,6 +566,12 @@ gen_ctf_sou_type (ctf_container_ref ctfc, dw_die_ref sou, uint32_t kind)
 
  c = dw_get_die_sib (c);
 
+ if (dw_get_die_tag (c) != DW_TAG_member)
+   continue;
+
+ if (c == dw_get_die_child (sou))
+   idx = 0;
+
  field_name = get_AT_string (c, DW_AT_name);
  field_type = ctf_get_AT_type (c);
  field_location = ctf_get_AT_data_member_location (c);
@@ -626,6 +639,12 @@ gen_ctf_sou_type (ctf_container_ref ctfc, dw_die_ref sou, uint32_t kind)
 field_name,
 field_type_id,
 field_location);
+
+ /* Handle BTF tags on the member.  */
+ if (btf_debuginfo_p ())
+   handle_btf_tags (ctfc, c, sou_type_id, idx);
+
+ idx++;
}
   while (c != dw_get_die_child (sou));
   }
@@ -716,6 +735,9 @@ gen_ctf_function_type (ctf_container_ref ctfc, dw_die_ref function,
  arg_type = gen_ctf_type (ctfc, ctf_get_AT_type (c));
  /* Add the argument to the existing CTF function type.  */
  ctf_add_function_arg (ctfc, function, arg_name, arg_type);
+
+ if (btf_debuginfo_p ())
+   handle_btf_tags (ctfc, c, function_type_id, i - 1);
}
  else
/* This is a local variable.  Ignore.  */
@@ -814,6 +836,11 @@ gen_ctf_variable (ctf_container_ref ctfc, dw_die_ref die)
   /* Generate the new CTF variable and update global counter.  */
   (void) ctf_add_variable (ctfc, var_name, var_type_id, die, external_vis);
   ctfc->ctfc_num_global_objts += 1;
+
+  /* Handle any BTF tags on the variable.  */
+  if (btf_debuginfo_p ())
+handle_btf_tags (ctfc, die, CTF_NULL_TYPEID, -1);
+
 }
 
 /* Add a CTF function record for the given input DWARF DIE.  */
@@ -831,8 +858,12 @@ gen_ctf_function (ctf_container_ref ctfc, dw_die_ref die)
  counter.  Note that DWARF encodes function types in both
  DW_TAG_subroutine_type and DW_TAG_subprogram in exactly the same
  way.  */
-  (void) gen_ctf_function_type (ctfc, die, true /* from_global_func */);
+  function_type_id = gen_ctf_function_type (ctfc, die, true /* from_global_func */);
   ctfc->ctfc_num_global_funcs += 1;
+
+  /* Handle any BTF tags on the function itself.  */
+  if (btf_debuginfo_p ())
+handle_btf_tags (ctfc, die, function_type_id, -1);
 }
 
 /* Add CTF type record(s) for the given input DWARF DIE and return its type id.
@@ -909,6 +940,10 @@ gen_ctf_type (ctf_container_ref ctfc, dw_die_ref die)
   break;
 }
 
+  /* Handle any BTF tags on the type.  */
+  if (btf_debuginfo_p () && !unrecog_die)
+handle_btf_tags (ctfc, die, type_id, -1);
+
   /* For all types unrepresented in CTF, use an explicit CTF type of kind
  CTF_K_UNKNOWN.  */
   if ((type_id == CTF_NULL_TYPEID) && (!unrecog_die))
@@ -917,6 +952,82 @@ gen_ctf_type (ctf_container_ref ctfc, dw_die_ref die)
   return type_id;
 }
 
+/* BTF support. Handle any BTF tags attached to a given DIE, and generate
+   intermediate CTF types for them. Type tags are inserted into the type chain
+   at this point. The return value is the CTF type ID of the last type tag
+   created (for type chaining), or the same as the argument TYPE_ID if there are
+   no type tags.
+   Note that despite the name, the BTF spec seems to allow decl tags on types
+   as well as dec

[PATCH 8/8] testsuite: Add tests for BTF tags

2022-04-01 Thread David Faust via Gcc-patches
This commit adds tests for the tags, in BTF and in DWARF.

gcc/testsuite/

* gcc.dg/debug/btf/btf-decltag-func.c: New test.
* gcc.dg/debug/btf/btf-decltag-sou.c: Likewise.
* gcc.dg/debug/btf/btf-decltag-typedef.c: Likewise.
* gcc.dg/debug/btf/btf-typetag-1.c: Likewise.
* gcc.dg/debug/dwarf2/annotation-1.c: Likewise.
---
 .../gcc.dg/debug/btf/btf-decltag-func.c   | 18 ++
 .../gcc.dg/debug/btf/btf-decltag-sou.c| 34 +++
 .../gcc.dg/debug/btf/btf-decltag-typedef.c| 15 
 .../gcc.dg/debug/btf/btf-typetag-1.c  | 20 +++
 .../gcc.dg/debug/dwarf2/annotation-1.c| 29 
 5 files changed, 116 insertions(+)
 create mode 100644 gcc/testsuite/gcc.dg/debug/btf/btf-decltag-func.c
 create mode 100644 gcc/testsuite/gcc.dg/debug/btf/btf-decltag-sou.c
 create mode 100644 gcc/testsuite/gcc.dg/debug/btf/btf-decltag-typedef.c
 create mode 100644 gcc/testsuite/gcc.dg/debug/btf/btf-typetag-1.c
 create mode 100644 gcc/testsuite/gcc.dg/debug/dwarf2/annotation-1.c

diff --git a/gcc/testsuite/gcc.dg/debug/btf/btf-decltag-func.c b/gcc/testsuite/gcc.dg/debug/btf/btf-decltag-func.c
new file mode 100644
index 000..aa2c31aaa32
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/debug/btf/btf-decltag-func.c
@@ -0,0 +1,18 @@
+
+/* { dg-do compile } */
+/* { dg-options "-O0 -gbtf -dA" } */
+
+/* { dg-final { scan-assembler-times "\[\t \]0x1100\[\t \]+\[^\n\]*btt_info" 4 } } */
+/* { dg-final { scan-assembler-times "\[\t \]0x\[\t \]+\[^\n\]*decltag_compidx" 3 } } */
+/* { dg-final { scan-assembler-times "\[\t \]0x1\[\t \]+\[^\n\]*decltag_compidx" 1 } } */
+
+#define __tag1 __attribute__((btf_decl_tag("decl-tag-1")))
+#define __tag2 __attribute__((btf_decl_tag("decl-tag-2")))
+#define __tag3 __attribute__((btf_decl_tag("decl-tag-3")))
+
+extern int bar (int __tag1, int __tag2) __tag3;
+
+int __tag1 __tag2 foo (int arg1, int *arg2 __tag2)
+  {
+return bar (arg1 + 1, *arg2 + 2);
+  }
diff --git a/gcc/testsuite/gcc.dg/debug/btf/btf-decltag-sou.c b/gcc/testsuite/gcc.dg/debug/btf/btf-decltag-sou.c
new file mode 100644
index 000..be89d0d32de
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/debug/btf/btf-decltag-sou.c
@@ -0,0 +1,34 @@
+
+/* { dg-do compile } */
+/* { dg-options "-O0 -gbtf -dA" } */
+
+/* { dg-final { scan-assembler-times "\[\t \]0x1100\[\t \]+\[^\n\]*btt_info" 16 } } */
+/* { dg-final { scan-assembler-times "\[\t \]0\[\t \]+\[^\n\]*decltag_compidx" 2 } } */
+/* { dg-final { scan-assembler-times "\[\t \]0x1\[\t \]+\[^\n\]*decltag_compidx" 1 } } */
+/* { dg-final { scan-assembler-times "\[\t \]0x2\[\t \]+\[^\n\]*decltag_compidx" 3 } } */
+/* { dg-final { scan-assembler-times "\[\t \]0x3\[\t \]+\[^\n\]*decltag_compidx" 3 } } */
+/* { dg-final { scan-assembler-times "\[\t \]0x4\[\t \]+\[^\n\]*decltag_compidx" 1 } } */
+/* { dg-final { scan-assembler-times "\[\t \]0x\[\t \]+\[^\n\]*decltag_compidx" 6 } } */
+
+#define __tag1 __attribute__((btf_decl_tag("decl-tag-1")))
+#define __tag2 __attribute__((btf_decl_tag("decl-tag-2")))
+#define __tag3 __attribute__((btf_decl_tag("decl-tag-3")))
+
+struct t {
+  int a;
+  long b __tag3;
+  char c __tag2 __tag3;
+} __tag1 __tag2;
+
+struct t my_t __tag1 __tag3;
+
+
+union u {
+  char one __tag1 __tag2;
+  short two;
+  int three __tag1;
+  long four __tag1 __tag2 __tag3;
+  long long five __tag2;
+} __tag3;
+
+union u my_u __tag2;
diff --git a/gcc/testsuite/gcc.dg/debug/btf/btf-decltag-typedef.c b/gcc/testsuite/gcc.dg/debug/btf/btf-decltag-typedef.c
new file mode 100644
index 000..75be876f949
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/debug/btf/btf-decltag-typedef.c
@@ -0,0 +1,15 @@
+/* { dg-do compile } */
+/* { dg-options "-O0 -gbtf -dA" } */
+
+/* { dg-final { scan-assembler-times "\[\t \]0x1100\[\t \]+\[^\n\]*btt_info" 3 } } */
+/* { dg-final { scan-assembler-times "\[\t \]0x\[\t \]+\[^\n\]*decltag_compidx" 3 } } */
+
+#define __tag1 __attribute__((btf_decl_tag("decl-tag-1")))
+#define __tag2 __attribute__((btf_decl_tag("decl-tag-2")))
+#define __tag3 __attribute__((btf_decl_tag("decl-tag-3")))
+
+struct s { int a; } __tag1;
+
+typedef struct s * sptr __tag2;
+
+sptr my_sptr __tag3;
diff --git a/gcc/testsuite/gcc.dg/debug/btf/btf-typetag-1.c b/gcc/testsuite/gcc.dg/debug/btf/btf-typetag-1.c
new file mode 100644
index 000..4b05663385f
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/debug/btf/btf-typetag-1.c
@@ -0,0 +1,20 @@
+/* { dg-do compile } */
+/* { dg-options "-O0 -gbtf -dA" } */
+
+/* { dg-final { scan-assembler-times "\[\t \]0x1200\[\t \]+\[^\n\]*btt_info" 4 } } */
+
+#define __tag1 __attribute__((btf_type_tag("tag1")))
+#define __tag2 __attribute__((btf_type_tag("tag2")))
+#define __tag3 __attribute__((btf_type_tag("tag3")))
+
+int __tag1 * x;
+const int __tag2 * y;
+
+struct a;
+
+struct b
+{
+  struct a __tag2 __tag3 * inner_a;
+};
+
+struct b my_b;
diff --git a/gcc/testsuite/gcc.dg/debug/dwarf2/a

[PATCH 7/8] Output BTF DECL_TAG and TYPE_TAG types

2022-04-01 Thread David Faust via Gcc-patches
This patch updates btfout.cc to be aware of the DECL_TAG and TYPE_TAG
kinds and output them appropriately.
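
For readers checking the btt_info values these kinds produce, the
following is a minimal sketch of how the BTF info word is packed. It
assumes the layout documented for the Linux BTF format (vlen in bits
0-15, kind in bits 24-28, kind_flag in bit 31) and the kind numbers 17
and 18 for DECL_TAG and TYPE_TAG; btf_info_word is a hypothetical helper
and is not a function in btfout.cc.

```c
#include <stdint.h>

/* Kind numbers as defined by the Linux BTF format (assumed here; the
   definitions used by GCC live in btf.h, which is not shown).  */
#define BTF_KIND_DECL_TAG 17
#define BTF_KIND_TYPE_TAG 18

/* Hypothetical helper: pack the btt_info word.  Bits 0-15 hold vlen,
   bits 24-28 hold the kind, and bit 31 holds kind_flag.  */
uint32_t
btf_info_word (uint32_t kind, uint32_t kind_flag, uint32_t vlen)
{
  return ((kind_flag & 1) << 31) | ((kind & 0x1f) << 24) | (vlen & 0xffff);
}
```

With vlen and kind_flag both zero, a DECL_TAG record packs to 0x11000000
and a TYPE_TAG record to 0x12000000.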

gcc/

* btfout.cc (get_btf_kind): Handle TYPE_TAG and DECL_TAG kinds.
(btf_calc_num_vbytes): Likewise.
(btf_asm_type): Likewise.
(output_asm_btf_vlen_bytes): Likewise.
---
 gcc/btfout.cc | 28 
 1 file changed, 28 insertions(+)

diff --git a/gcc/btfout.cc b/gcc/btfout.cc
index 31af50521da..f291cd925be 100644
--- a/gcc/btfout.cc
+++ b/gcc/btfout.cc
@@ -136,6 +136,8 @@ get_btf_kind (uint32_t ctf_kind)
 case CTF_K_VOLATILE: return BTF_KIND_VOLATILE;
 case CTF_K_CONST:return BTF_KIND_CONST;
 case CTF_K_RESTRICT: return BTF_KIND_RESTRICT;
+case CTF_K_TYPE_TAG: return BTF_KIND_TYPE_TAG;
+case CTF_K_DECL_TAG: return BTF_KIND_DECL_TAG;
 default:;
 }
   return BTF_KIND_UNKN;
@@ -201,6 +203,7 @@ btf_calc_num_vbytes (ctf_dtdef_ref dtd)
 case BTF_KIND_CONST:
 case BTF_KIND_RESTRICT:
 case BTF_KIND_FUNC:
+case BTF_KIND_TYPE_TAG:
 /* These kinds have no vlen data.  */
   break;
 
@@ -238,6 +241,10 @@ btf_calc_num_vbytes (ctf_dtdef_ref dtd)
   vlen_bytes += vlen * sizeof (struct btf_var_secinfo);
   break;
 
+case BTF_KIND_DECL_TAG:
+  vlen_bytes += sizeof (struct btf_decl_tag);
+  break;
+
 default:
   break;
 }
@@ -636,6 +643,22 @@ btf_asm_type (ctf_container_ref ctfc, ctf_dtdef_ref dtd)
   dw2_asm_output_data (4, dtd->dtd_data.ctti_size, "btt_size: %uB",
   dtd->dtd_data.ctti_size);
   return;
+case BTF_KIND_DECL_TAG:
+  {
+   /* A decl tag might refer to (be the child DIE of) a variable. Try to
+  lookup the parent DIE's CTF variable, and if it exists point to the
+  corresponding BTF variable. This is an odd construction - we have a
+  'type' which refers to a variable, rather than the reverse.  */
+   dw_die_ref parent = dw_get_die_parent (dtd->dtd_key);
+   ctf_dvdef_ref dvd = ctf_dvd_lookup (ctfc, parent);
+   if (dvd)
+ {
+   unsigned int var_id =
+ *(btf_var_ids->get (dvd)) + num_types_added + 1;
+   dw2_asm_output_data (4, var_id, "btt_type");
+   return;
+ }
+  }
 default:
   break;
 }
@@ -949,6 +972,11 @@ output_asm_btf_vlen_bytes (ctf_container_ref ctfc, ctf_dtdef_ref dtd)
 at this point.  */
   gcc_unreachable ();
 
+case BTF_KIND_DECL_TAG:
+  dw2_asm_output_data (4, dtd->dtd_u.dtu_btfnote.component_idx,
+  "decltag_compidx");
+  break;
+
 default:
   /* All other BTF type kinds have no variable length data.  */
   break;
-- 
2.35.1



Re: -Wformat-overflow handling for %b and %B directives in C2X standard

2022-04-01 Thread Marek Polacek via Gcc-patches
On Sat, Apr 02, 2022 at 12:19:47AM +0500, Frolov Daniil via Gcc-patches wrote:
> Hello, I've noticed that -Wformat-overflow doesn't handle %b and %B
> directives in the sprintf function. I've added a relevant issue in bugzilla
> (bug #105129).
> I attach a patch with a possible solution to the letter.

Thanks for the patch.  Support for C2X %b, %B formats is relatively new
(Oct 2021) so it looks like gimple-ssa-sprintf.cc hasn't caught up.
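
To make the size concern concrete, here is a small sketch of the digit
bound a binary directive needs compared with octal and hex. The helper
names are hypothetical and this is not code from gimple-ssa-sprintf.cc;
the base 2 and base 8 cases follow the arithmetic in the quoted patch,
and the base 16 rounding is my own.

```c
/* Hypothetical helper, not a GCC function: upper bound on the number
   of digits needed to print an unsigned value of PREC bits in BASE.  */
unsigned
max_digits (unsigned prec, int base)
{
  switch (base)
    {
    case 2:
      return prec;            /* One digit per bit.  */
    case 8:
      return (prec + 2) / 3;  /* Three bits per digit, rounded up.  */
    case 16:
      return (prec + 3) / 4;  /* Four bits per digit, rounded up.  */
    default:
      return 0;               /* Bases not covered by this sketch.  */
    }
}

/* Bytes needed for a "%#b" result of a PREC-bit value: the digits,
   the two-byte "0b" prefix, and the terminating NUL.  */
unsigned
binary_buffer_size (unsigned prec)
{
  return max_digits (prec, 2) + 2 + 1;
}
```

For a 32-bit unsigned int that is up to 32 digits, so a destination
sized for the 8 hex digits of the same value is far too small for %b.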

This is not a regression, so should probably wait till GCC 13.  Anyway...

> From 2051344e9500651f6e94c44cbc7820715382b957 Mon Sep 17 00:00:00 2001
> From: Frolov Daniil 
> Date: Fri, 1 Apr 2022 00:47:03 +0500
> Subject: [PATCH] Support %b, %B for -Wformat-overflow (sprintf, snprintf)
> 
> testsuite: add tests to check -Wformat-overflow on %b.
> Wformat-overflow1.c is compiled using -std=c2x so warning has to
> be throwed
> 
> Wformat-overflow2.c doesn't throw warnings cause c2x std isn't
> used
> 
> gcc/ChangeLog:
> 
>   * gimple-ssa-sprintf.cc
> (check_std_c2x): New function
>   (fmtresult::type_max_digits): add base == 2 handling
>   (tree_digits): add handle for base == 2
>   (format_integer): now handle %b and %B using base = 2
>   (parse_directive): add cases to handle %b and %B directives
>   (compute_format_length): add handling for base = 2

The descriptions should start with a capital letter and end with a period,
like "Handle base == 2."
 
> gcc/testsuite/ChangeLog:
> 
>   * gcc.dg/Wformat-overflow1.c: New test. (using -std=c2x)
>   * gcc.dg/Wformat-overflow2.c: New test. (-std=c11 no warning)

You can just say "New test."

> ---
>  gcc/gimple-ssa-sprintf.cc| 42 
>  gcc/testsuite/gcc.dg/Wformat-overflow1.c | 28 
>  gcc/testsuite/gcc.dg/Wformat-overflow2.c | 16 +
>  3 files changed, 79 insertions(+), 7 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.dg/Wformat-overflow1.c
>  create mode 100644 gcc/testsuite/gcc.dg/Wformat-overflow2.c
> 
> diff --git a/gcc/gimple-ssa-sprintf.cc b/gcc/gimple-ssa-sprintf.cc
> index c93f12f90b5..7f68c2b6e51 100644
> --- a/gcc/gimple-ssa-sprintf.cc
> +++ b/gcc/gimple-ssa-sprintf.cc
> @@ -107,6 +107,15 @@ namespace {
>  
>  static int warn_level;
>  
> +/* b_overflow_flag depends on the current standart when using gcc */

"standard"

/* Comments should be formatted like this.  */

> +static bool b_overflow_flag;
> +
> +/* check is current standart version equals C2X*/
> +static bool check_std_c2x () 
> +{
> +  return !strcmp (lang_hooks.name, "GNU C2X");
> +}

Is this really needed?  ISTM that this new checking shouldn't depend on
-std=c2x.  If not using C2X, you only get a warning if -Wpedantic.  So
I think you should remove b_overflow_flag.

>  /* The minimum, maximum, likely, and unlikely maximum number of bytes
> of output either a formatting function or an individual directive
> can result in.  */
> @@ -535,6 +544,8 @@ fmtresult::type_max_digits (tree type, int base)
>unsigned prec = TYPE_PRECISION (type);
>switch (base)
>  {
> +case 2:
> +  return prec;
>  case 8:
>return (prec + 2) / 3;
>  case 10:
> @@ -857,11 +868,11 @@ tree_digits (tree x, int base, HOST_WIDE_INT prec, bool plus, bool prefix)
>  
>/* Adjust a non-zero value for the base prefix, either hexadecimal,
>   or, unless precision has resulted in a leading zero, also octal.  */
> -  if (prefix && absval && (base == 16 || prec <= ndigs))
> +  if (prefix && absval && (base == 2 || base == 16 || prec <= ndigs))
>  {
>if (base == 8)
>   res += 1;
> -  else if (base == 16)
> +  else if (base == 16 || base == 2) /*0x...(0X...) and 0b...(0B...)*/
>   res += 2;
>  }
>  
> @@ -1229,6 +1240,10 @@ format_integer (const directive &dir, tree arg, pointer_query &ptr_qry)
>  case 'u':
>base = 10;
>break;
> +case 'b':
> +case 'B':
> +  base = 2;
> +  break;
>  case 'o':
>base = 8;
>break;
> @@ -1351,10 +1366,10 @@ format_integer (const directive &dir, tree arg, pointer_query &ptr_qry)
>  
>/* Bump up the counters if WIDTH is greater than LEN.  */
>res.adjust_for_width_or_precision (dir.width, dirtype, base,
> -  (sign | maybebase) + (base == 16));
> +  (sign | maybebase) + (base == 2 || base == 16));
>/* Bump up the counters again if PRECision is greater still.  */
>res.adjust_for_width_or_precision (dir.prec, dirtype, base,
> -  (sign | maybebase) + (base == 16));
> +  (sign | maybebase) + (base == 2 || base == 16));
>  
>return res;
>  }
> @@ -1503,7 +1518,7 @@ format_integer (const directive &dir, tree arg, pointer_query &ptr_qry)
> if (res.range.min == 1)
>   res.range.likely += base == 8 ? 1 : 2;
> else if (re

Re: [PATCH] rs6000: Adjust mov optabs for opaque modes [PR103353]

2022-04-01 Thread will schmidt via Gcc-patches
On Thu, 2022-03-03 at 16:38 +0800, Kewen.Lin via Gcc-patches wrote:
> Hi,
> 

Hi

> As PR103353 shows, we may want to continue to expand a MMA built-in
> function like a normal function, even if we have already emitted
> error messages about some missing required conditions.  As shown in
> that PR, without one explicit mov optab on OOmode provided, it would
> call emit_move_insn recursively.
> 
> So this patch is to allow the mov pattern to be generated when we are
> expanding to RTL and have seen errors even without MMA supported, it's
> expected that the generated pattern would not cause further ICEs as the
> compilation would stop soon after expanding.

Is there a testcase, new or existing, that illustrates this error path?
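
For reference while reading the hunks below, the three-way decision the
new preparation statements make can be modeled roughly as follows. This
is only a sketch: the enum and function names are hypothetical, and the
booleans stand in for TARGET_MMA, currently_expanding_to_rtl and
seen_error ().

```c
#include <stdbool.h>

/* Hypothetical names for illustration only; none of these are GCC APIs.  */
enum mov_action
{
  MOV_EMIT_MOVE,     /* MMA available: call rs6000_emit_move and finish.  */
  MOV_FALL_THROUGH,  /* Error recovery: let the default pattern be emitted.  */
  MOV_UNREACHABLE    /* Anything else indicates a compiler bug.  */
};

enum mov_action
classify_opaque_move (bool target_mma, bool expanding_to_rtl, bool seen_error)
{
  if (target_mma)
    return MOV_EMIT_MOVE;
  if (expanding_to_rtl && seen_error)
    return MOV_FALL_THROUGH;
  return MOV_UNREACHABLE;
}
```

The middle case is the one a new test would need to exercise: no MMA
support, an error already reported, and expansion still in progress.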

> 
> Bootstrapped and regtested on powerpc64-linux-gnu P8 and
> powerpc64le-linux-gnu P9 and P10.
> 
> Is it ok for trunk?
> 
> BR,
> Kewen
> --
> 
>   PR target/103353
> 
> gcc/ChangeLog:
> 
>   * config/rs6000/mma.md (define_expand movoo): Move TARGET_MMA condition
>   check to preparation statements and add handlings for !TARGET_MMA.
>   (define_expand movxo): Likewise.

> > ---
> >  gcc/config/rs6000/mma.md | 42 ++--
> >  1 file changed, 36 insertions(+), 6 deletions(-)
> > 
> > diff --git a/gcc/config/rs6000/mma.md b/gcc/config/rs6000/mma.md
> > index 907c9d6d516..f76a87b4a21 100644
> > --- a/gcc/config/rs6000/mma.md
> > +++ b/gcc/config/rs6000/mma.md
> > @@ -268,10 +268,25 @@ (define_int_attr avvi4i4i4
> > [(UNSPEC_MMA_PMXVI8GER4PP   "pmxvi8ger4pp")
> >  (define_expand "movoo"
> >[(set (match_operand:OO 0 "nonimmediate_operand")
> > (match_operand:OO 1 "input_operand"))]
> > -  "TARGET_MMA"
> > +  ""
> >  {
> > -  rs6000_emit_move (operands[0], operands[1], OOmode);
> > -  DONE;
> > +  if (TARGET_MMA) {
> > +rs6000_emit_move (operands[0], operands[1], OOmode);
> > +DONE;
> > +  }
> > +  /* Opaque modes are only expected to be available when MMA is supported,
> > + but PR103353 shows we may want to continue to expand a MMA built-in
> > + function like a normal function, even if we have already emitted
> > + error messages about some missing required conditions.

perhaps drop "like a normal function".  


> > + As shown in that PR, without one explicit mov optab on OOmode provided,
> > + it would call emit_move_insn recursively.  So we allow this pattern to
> > + be generated when we are expanding to RTL and have seen errors, even
> > + though there is no MMA support.  It would not cause further ICEs as
> > + the compilation would stop soon after expanding.  */

A testcase would be particularly helpful to illustrate this, I think.

Thanks,
-Will

> > +  else if (currently_expanding_to_rtl && seen_error ())
> > +;
> > +  else
> > +gcc_unreachable ();
> >  })
> >  
> >  (define_insn_and_split "*movoo"
> > @@ -300,10 +315,25 @@ (define_insn_and_split "*movoo"
> >  (define_expand "movxo"
> >[(set (match_operand:XO 0 "nonimmediate_operand")
> > (match_operand:XO 1 "input_operand"))]
> > -  "TARGET_MMA"
> > +  ""
> >  {
> > -  rs6000_emit_move (operands[0], operands[1], XOmode);
> > -  DONE;
> > +  if (TARGET_MMA) {
> > +rs6000_emit_move (operands[0], operands[1], XOmode);
> > +DONE;
> > +  }
> > +  /* Opaque modes are only expected to be available when MMA is supported,
> > + but PR103353 shows we may want to continue to expand a MMA built-in
> > + function like a normal function, even if we have already emitted
> > + error messages about some missing required conditions.
> > + As shown in that PR, without one explicit mov optab on OOmode provided,
> > + it would call emit_move_insn recursively.  So we allow this pattern to
> > + be generated when we are expanding to RTL and have seen errors, even
> > + though there is no MMA support.  It would not cause further ICEs as
> > + the compilation would stop soon after expanding.  */
> > +  else if (currently_expanding_to_rtl && seen_error ())
> > +;
> > +  else
> > +gcc_unreachable ();
> >  })
> >  
> >  (define_insn_and_split "*movxo"
> > -- 
> > 2.25.1
> > 



Re: [PATCH] rs6000: Adjust mov optabs for opaque modes [PR103353]

2022-04-01 Thread Peter Bergner via Gcc-patches
On 4/1/22 3:50 PM, will schmidt wrote:
> Is there a testcase, new or existing, that illustrates this error path?

Well, the already existing test case pr101849.c is where the issue was seen,
but only when compiled by hand outside of the test harness and using only the
-maltivec option and not the -mcpu=power10 that the test case uses.

That said, I agree, pr101849.c should probably be copied into another test case
file (pr103353.c) that uses options that trigger the issue so we can be sure the
fix won't regress.

Peter




Re: [committed] jit: further doc fixes

2022-04-01 Thread David Malcolm via Gcc-patches
On Fri, 2022-04-01 at 12:26 -0400, Eric Gallager wrote:
> On Fri, Apr 1, 2022 at 9:28 AM David Malcolm via Gcc-patches
>  wrote:
> > 
> > Further jit doc fixes, which fix links to
> > gcc_jit_function_type_get_param_type and gcc_jit_struct_get_field.
> > 
> > I also regenerated libgccjit.texi (not included in the diff below).
> > 
> > Tested with "make html" and with a bootstrap.
> 
>  Could you test with `make pdf` and `make dvi` too, to see if this
> fixes 102824?
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102824

FWIW it doesn't fix that, but I've added some notes to that bug with a
possible fix.

Dave