date:20230323

[PATCH] In the ready lists of pipeline, put unrecog insns (such as CLOBBER, USE) at the latest to issue.

2023-03-23 Thread Jin Ma via Gcc-patches

  Unrecog insns (such as CLOBBER, USE) do not represent real instructions, but
during pipeline scheduling they wait in the ready list like other insns,
without resource conflicts or cycles being considered for them. On a
multi-issue CPU they can therefore be issued at any time when other regular
insns have resource conflicts or cannot be issued for other reasons. As a
result, their position is advanced in the generated insn sequence, which
affects register allocation and often leads to more redundant mov
instructions.

gcc/ChangeLog:

* haifa-sched.cc (prune_ready_list): Consider unrecog insns (CLOBBER and USE)
in pruning ready lists.
---
 gcc/haifa-sched.cc | 8 
 1 file changed, 8 insertions(+)

diff --git a/gcc/haifa-sched.cc b/gcc/haifa-sched.cc
index 48b53776fa9..72c4c44da76 100644
--- a/gcc/haifa-sched.cc
+++ b/gcc/haifa-sched.cc
@@ -6318,6 +6318,14 @@ prune_ready_list (state_t temp_state, bool first_cycle_insn_p,
  cost = 1;
  reason = "not a shadow";
}
+ else if (recog_memoized (insn) < 0
+ && (GET_CODE (PATTERN (insn)) == CLOBBER
+ || GET_CODE (PATTERN (insn)) == USE))
+   {
+ if (!first_cycle_insn_p)
+   cost = 1;
+ reason = "unrecog insn";
+   }
  else if (recog_memoized (insn) < 0)
{
  if (!first_cycle_insn_p
-- 
2.17.1
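
As a rough illustration of the branch this patch adds (not the scheduler's actual code), the cost decision can be modelled in a few lines of Python. The function name `added_branch`, the use of plain strings for the pattern codes, and the assumption that `cost` is 0 before the branch runs are all hypothetical and only mirror the condition shown in the diff.

```python
# Toy model of the branch added to prune_ready_list. All names are
# hypothetical; only the condition itself is taken from the diff.

def added_branch(recog_code, pattern_code, first_cycle_insn_p):
    """Return (cost, reason) if the new branch applies, else None.

    recog_code mimics recog_memoized (insn): negative means unrecognized.
    pattern_code mimics GET_CODE (PATTERN (insn)) as a plain string.
    Assumption: cost is 0 before the branch is evaluated.
    """
    if recog_code < 0 and pattern_code in ("CLOBBER", "USE"):
        cost = 0 if first_cycle_insn_p else 1
        return cost, "unrecog insn"
    return None  # some other branch would handle this insn


# A USE that is not the first insn of the cycle gets cost 1.
print(added_branch(-1, "USE", first_cycle_insn_p=False))
# As the first insn of a cycle, a CLOBBER keeps cost 0.
print(added_branch(-1, "CLOBBER", first_cycle_insn_p=True))
# A recognized insn does not take this branch at all.
print(added_branch(42, "SET", first_cycle_insn_p=False))
```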

Re: [PATCH] PR target/105325, Make load/cmp fusion know about prefixed loads

2023-03-23 Thread Kewen.Lin via Gcc-patches

Hi Mike,

Thanks for fixing this, some minor comments are inlined below.

on 2023/3/22 07:53, Michael Meissner wrote:
> The issue with the bug is the power10 load GPR + cmpi -1/0/1 fusion
> optimization generates illegal assembler code.
> 
> Ultimately the code was dying because the fusion load + compare -1/0/1
> patterns did not handle the possibility that the load might be prefixed.
> 
> The main cause is the constraints for the individual loads in the fusion did
> not match the machine.  In particular, LWA is a ds format instruction when it
> is unprefixed.  The code also did not set the prefixed attribute correctly.
> 
> This patch rewrites the genfusion.pl script so that it will have more accurate
> constraints for the LWA and LD instructions (which are DS instructions).  The
> updated genfusion.pl was then run to update fusion.md.  Finally, the code for
> the "prefixed" attribute is modified so that it considers load + compare
> immediate patterns to be like the normal load insns in checking whether
> operand[1] is a prefixed instruction.
> 
> I have tested this patch on a little endian power10 system, on a little endian
> power9 system, and a big endian power8 system (both -m32 and -m64 tested on
> BE).  There were no regressions, can I check this into the trunk?
> 
> The same patch applies to the gcc-12 and gcc-11 branches.  Can I check this
> patch into those branches also after a burn-in period?
> 
> 2023-03-21   Michael Meissner  
>  Aaron Sawdey  
> 
> gcc/
> 
>   PR target/105325
>   * gcc/config/rs6000/genfusion.pl (gen_ld_cmpi_p10): Improve generation
>   of the ld and lwa instructions which use the DS encoding instead of D.
>   Use the YZ constraint for these loads.  Handle prefixed loads better.
>   Set the sign_extend attribute as appropriate.
>   * gcc/config/rs6000/fusion.md: Regenerate.
>   * gcc/config/rs6000/rs6000.md (prefixed attribute): Add fused_load_cmpi
>   instructions to the list of instructions that might have a prefixed load
>   instruction.
> 
> gcc/testsuite/
> 
>   PR target/105325
>   * g++.target/powerpc/pr105325.C: New test.
>   * gcc.target/powerpc/fusion-p10-ldcmpi.c: Adjust insn counts.
> ---
>  gcc/config/rs6000/genfusion.pl| 26 ---
>  gcc/config/rs6000/fusion.md   | 17 +++-
>  gcc/config/rs6000/rs6000.md   |  2 +-
>  gcc/testsuite/g++.target/powerpc/pr105325.C   | 24 +
>  .../gcc.target/powerpc/fusion-p10-ldcmpi.c|  4 +--
>  5 files changed, 59 insertions(+), 14 deletions(-)
>  create mode 100644 gcc/testsuite/g++.target/powerpc/pr105325.C
> 
> diff --git a/gcc/config/rs6000/genfusion.pl b/gcc/config/rs6000/genfusion.pl
> index e4db352e0ce..4f367cadc52 100755
> --- a/gcc/config/rs6000/genfusion.pl
> +++ b/gcc/config/rs6000/genfusion.pl
> @@ -56,7 +56,7 @@ sub mode_to_ldst_char
>  sub gen_ld_cmpi_p10
>  {
>  my ($lmode, $ldst, $clobbermode, $result, $cmpl, $echr, $constpred,
> - $mempred, $ccmode, $np, $extend, $resultmode);
> + $mempred, $ccmode, $np, $extend, $resultmode, $constraint);
>LMODE: foreach $lmode ('DI','SI','HI','QI') {
>$ldst = mode_to_ldst_char($lmode);
>$clobbermode = $lmode;
> @@ -71,21 +71,34 @@ sub gen_ld_cmpi_p10
>CCMODE: foreach $ccmode ('CC','CCUNS') {
> $np = "NON_PREFIXED_D";
> $mempred = "non_update_memory_operand";
> +   $constraint = "m";

The three default assignments to $np, $mempred and $constraint can be moved
to place (a) (see below), with one explicit assignment to $constraint added
at place (b).  When $ccmode eq 'CC', HI/SI/DI each have their own settings
(btw QI is skipped), so the default values are only needed in the else arm
(for CCUNS).

> if ( $ccmode eq 'CC' ) {
> next CCMODE if $lmode eq 'QI';
> -   if ( $lmode eq 'DI' || $lmode eq 'SI' ) {
> +   if ( $lmode eq 'HI' ) {
> +   $np = "NON_PREFIXED_D";
> +   $mempred = "non_update_memory_operand";
> +   $echr = "a";
  // place b
  $constraint = "m";
   
> +   } elsif ( $lmode eq 'SI' ) {
> +   # ld and lwa are both DS-FORM.
  
We have split this into separate arms for SI and DI, so can this
comment be removed?

> +   $np = "NON_PREFIXED_DS";
> +   $mempred = "lwa_operand";
> +   $echr = "a";
> +   $constraint = "YZ";
> +   } elsif ( $lmode eq 'DI' ) {
> # ld and lwa are both DS-FORM.

... and this comment.

> $np = "NON_PREFIXED_DS";
> $mempred = "ds_form_mem_operand";
> +   $echr = "";
> +   $constraint = "YZ";
> }
> $cmpl = "";
> -   $echr = "a";
> $constpred = "const_m1_to_1_operand";
> } else {

  // place a

> if (

[PATCH] tree-vect-generic: Fix up expand_vector_condition [PR109176]

2023-03-23 Thread Jakub Jelinek via Gcc-patches

Hi!

The following testcase ICEs on aarch64-linux, because
expand_vector_condition attempts to piecewise lower SVE
  d_3 = a_1(D) < b_2(D);
  _5 = VEC_COND_EXPR ;
which isn't possible - nunits_for_known_piecewise_op ICEs but
the rest of the code assumes constant number of elements too.

expand_vector_condition attempts to find if a (rhs1) is a SSA_NAME
for comparison and calls expand_vec_cond_expr_p (type, TREE_TYPE (a1), code)
where a1 is one of the operands of the comparison and code is the comparison
code.  That one indeed isn't supported here, but what aarch64 SVE supports
are the individual statements, comparison (expand_vec_cmp_expr_p) and
expand_vec_cond_expr_p (type, TREE_TYPE (a), SSA_NAME), the latter because
that function starts with
  if (VECTOR_BOOLEAN_TYPE_P (cmp_op_type)
  && get_vcond_mask_icode (TYPE_MODE (value_type),
   TYPE_MODE (cmp_op_type)) != CODE_FOR_nothing)
return true;

In an earlier version of the patch (in the PR), we did this
  if (VECTOR_BOOLEAN_TYPE_P (TREE_TYPE (a))
  && expand_vec_cond_expr_p (type, TREE_TYPE (a), ERROR_MARK))
return true;
before the code == SSA_NAME handling plus some further tweaks later.
While that fixed the ICE, it broke quite a few tests on x86 and some on
aarch64 too.  The problem is that expand_vector_comparison doesn't lower
comparisons which aren't supported and only feed VEC_COND_EXPR first operand
and expand_vector_condition succeeds for those, so with the above mentioned
change we'd verify the VEC_COND_EXPR is implementable using optab alone,
but nothing would verify the tcc_comparison which relied on
expand_vector_condition to verify.

So, the following patch instead queries whether optabs can handle the
comparison and VEC_COND_EXPR together (if a (rhs1) is a comparison;
otherwise as before it checks only the VEC_COND_EXPR) and if that fails,
also checks whether the two operations could be supported individually
and only if even that fails does the piecewise lowering.

Bootstrapped/regtested on x86_64-linux, i686-linux and aarch64-linux, ok for
trunk?

2023-03-23  Jakub Jelinek  

PR tree-optimization/109176
* tree-vect-generic.cc (expand_vector_condition): If a has
vector boolean type and is a comparison, also check if both
the comparison and VEC_COND_EXPR could be successfully expanded
individually.

* gcc.target/aarch64/sve/pr109176.c: New test.

--- gcc/tree-vect-generic.cc.jj 2023-03-21 13:28:21.354671095 +0100
+++ gcc/tree-vect-generic.cc    2023-03-22 12:53:27.853986127 +0100
@@ -1063,6 +1063,15 @@ expand_vector_condition (gimple_stmt_ite
   return true;
 }
 
+  /* If a has vector boolean type and is a comparison, above
+ expand_vec_cond_expr_p might fail, even if both the comparison and
+ VEC_COND_EXPR could be supported individually.  See PR109176.  */
+  if (a_is_comparison
+  && VECTOR_BOOLEAN_TYPE_P (TREE_TYPE (a))
+  && expand_vec_cond_expr_p (type, TREE_TYPE (a), SSA_NAME)
+  && expand_vec_cmp_expr_p (TREE_TYPE (a1), TREE_TYPE (a), code))
+return true;
+
   /* Handle vector boolean types with bitmasks.  If there is a comparison
  and we can expand the comparison into the vector boolean bitmask,
  or otherwise if it is compatible with type, we can transform
--- gcc/testsuite/gcc.target/aarch64/sve/pr109176.c.jj  2023-03-22 12:19:21.672218631 +0100
+++ gcc/testsuite/gcc.target/aarch64/sve/pr109176.c 2023-03-22 12:19:21.672218631 +0100
@@ -0,0 +1,12 @@
+/* PR tree-optimization/109176 */
+/* { dg-do compile } */
+/* { dg-additional-options "-O2" } */
+
+#include 
+
+svbool_t
+foo (svint8_t a, svint8_t b, svbool_t c)
+{
+  svbool_t d = svcmplt_s8 (svptrue_pat_b8 (SV_ALL), a, b);
+  return svsel_b (d, c, d);
+}

Jakub
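
The order of checks described in this message can be sketched as a small decision function. This is a simplified model under stated assumptions, not the code in tree-vect-generic.cc: the function name and the boolean parameters are hypothetical stand-ins for the optab queries, and the bitmask handling that follows in the real function is left out.

```python
# Simplified model of the check order described in the message. The
# parameters are hypothetical booleans standing in for the real queries.

def lowering_decision(a_is_comparison, a_is_vector_bool,
                      combined_ok, vcond_mask_ok, cmp_ok):
    """Return which path the described check order would take.

    combined_ok   : comparison and VEC_COND_EXPR handled together
                    (or VEC_COND_EXPR alone if a is not a comparison)
    vcond_mask_ok : VEC_COND_EXPR alone is expandable (the SSA_NAME query)
    cmp_ok        : the comparison alone is expandable
    """
    if combined_ok:
        return "keep (combined)"
    # New in the patch: accept the pair if each statement works on its own.
    if a_is_comparison and a_is_vector_bool and vcond_mask_ok and cmp_ok:
        return "keep (individually)"
    return "lower piecewise"


# The SVE case from the PR: the combined query fails, both single ones pass.
print(lowering_decision(True, True, False, True, True))
# If the comparison itself is unsupported, lowering still happens.
print(lowering_decision(True, True, False, True, False))
```

The second call shows why the earlier version of the patch was not enough: checking only the VEC_COND_EXPR would have accepted a statement whose comparison nothing else verifies.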

Re: [PATCH] RISC-V: Fix loss of function to script 'multilib-generator'

2023-03-23 Thread Kito Cheng via Gcc-patches

Nice catch, committed to the trunk!

On Tue, Mar 21, 2023 at 3:39 PM Songhe Zhu  wrote:
>
> The arch 'rv32imac' will not be created when executing
> './multilib-generator rv32imc-ilp32--a'
>
> The output is:
> MULTILIB_OPTIONS = march=rv32imc mabi=ilp32
> MULTILIB_DIRNAMES = rv32imc ilp32
> MULTILIB_REQUIRED = march=rv32imc/mabi=ilp32
> MULTILIB_REUSE =
>
> Analysis: unique(alts) changes alts from ['rv32imc', 'rv32imac']
> to ['rv32imac', 'rv32imc'].  The loop does not expect this reordering.
> This patch fixes it.
>
> gcc/ChangeLog:
> * config/riscv/multilib-generator: Adjust the loop of 'alt' in 'alts'.
>
> Signed-off-by: Songhe Zhu 
> ---
>  gcc/config/riscv/multilib-generator | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/gcc/config/riscv/multilib-generator b/gcc/config/riscv/multilib-generator
> index 9a6ce0223c9..0a3d4c07757 100755
> --- a/gcc/config/riscv/multilib-generator
> +++ b/gcc/config/riscv/multilib-generator
> @@ -175,7 +175,7 @@ for cmodel in cmodels:
>  # Drop duplicated entry.
>  alts = unique(alts)
>
> -for alt in alts[1:]:
> +for alt in alts:
>if alt == arch:
>  continue
>arches[alt] = 1
> --
> 2.17.1
>
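
The effect of the one-line change can be reproduced in isolation. The sketch below is self-contained and makes one assumption that is not shown in the quoted diff: that `unique` deduplicates and sorts its input, which matches the reordering reported in the analysis. The helper `collect` is hypothetical and only wraps the loop body from the hunk.

```python
# Stand-alone reproduction of the loop from the hunk above.

def unique(x):
    # Assumption: deduplicate and sort, consistent with the reordering
    # ['rv32imc', 'rv32imac'] -> ['rv32imac', 'rv32imc'] in the report.
    return sorted(set(x))


arch = "rv32imc"
alts = unique(["rv32imc", "rv32imac"])


def collect(candidates):
    """Run the loop body from the diff over the given candidates."""
    arches = {}
    for alt in candidates:
        if alt == arch:
            continue
        arches[alt] = 1
    return arches


old = collect(alts[1:])  # loop before the patch: drops alts[0] blindly
new = collect(alts)      # loop after the patch: relies on the arch check
print(alts, old, new)
```

With the sorted order, `alts[1:]` only contains the base arch, which the `continue` then skips, so nothing is recorded. Iterating over all of `alts` leaves the skipping to the explicit `alt == arch` test.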

Re: [PATCH] RISC-V: Bugfix for rvv bool mode size adjustment

2023-03-23 Thread Kito Cheng via Gcc-patches

committed, thanks for the reminder :)

On Mon, Mar 13, 2023 at 9:40 AM Li, Pan2 via Gcc-patches
 wrote:
>
> Kind reminder for this PR. Thank you all in advance.
>
> Pan
>
> -Original Message-
> From: Li, Pan2
> Sent: Wednesday, March 8, 2023 7:31 PM
> To: gcc-patches@gcc.gnu.org
> Cc: juzhe.zh...@rivai.ai; kito.ch...@sifive.com
> Subject: RE: [PATCH] RISC-V: Bugfix for rvv bool mode size adjustment
>
> Completed the regression test and the RISC-V backend test without any surprise.
>
> Pan
>
> -Original Message-
> From: Li, Pan2 
> Sent: Wednesday, March 8, 2023 3:34 PM
> To: gcc-patches@gcc.gnu.org
> Cc: juzhe.zh...@rivai.ai; kito.ch...@sifive.com; rguent...@suse.de; Li, Pan2 
> 
> Subject: [PATCH] RISC-V: Bugfix for rvv bool mode size adjustment
>
> From: yes 
>
> Fix the rvv bool mode size bug by adjusting the size.
> Besides the mode precision (aka bit size [1, 2, 4, 8, 16, 32, 64]) of the
> vbool*_t, the mode size (aka byte size) will be adjusted to [1, 1, 1, 1, 2,
> 4, 8] according to the rvv spec 1.0 isa. The adjustment will provide correct
> information for the underlying redundant instruction elimination.
>
> Given the below sample code:
> {
>   vbool1_t v1 = *(vbool1_t*)in;
>   vbool64_t v2 = *(vbool64_t*)in;
>
>   *(vbool1_t*)(out + 100) = v1;
>   *(vbool64_t*)(out + 200) = v2;
> }
>
> Before the size adjustment:
> csrr    t0,vlenb
> slli    t1,t0,1
> csrr    a3,vlenb
> sub     sp,sp,t1
> slli    a4,a3,1
> add     a4,a4,sp
> addi    a2,a1,100
> vsetvli a5,zero,e8,m8,ta,ma
> sub     a3,a4,a3
> vlm.v   v24,0(a0)
> vsm.v   v24,0(a2)
> vsm.v   v24,0(a3)
> addi    a1,a1,200
> csrr    t0,vlenb
> vsetvli a4,zero,e8,mf8,ta,ma
> slli    t1,t0,1
> vlm.v   v24,0(a3)
> vsm.v   v24,0(a1)
> add     sp,sp,t1
> jr      ra
>
> After the size adjustment:
> addi    a3,a1,100
> vsetvli a4,zero,e8,m8,ta,ma
> addi    a1,a1,200
> vlm.v   v24,0(a0)
> vsm.v   v24,0(a3)
> vsetvli a5,zero,e8,mf8,ta,ma
> vlm.v   v24,0(a0)
> vsm.v   v24,0(a1)
> ret
>
> Additionally, the size adjustment cannot cover all possible combinations of
> vbool*_t code patterns like the one above. We will look into that in another
> patch.
>
> PR 108185
> PR 108654
>
> gcc/ChangeLog:
>
> * config/riscv/riscv-modes.def (ADJUST_BYTESIZE):
> * config/riscv/riscv.cc (riscv_v_adjust_bytesize):
> * config/riscv/riscv.h (riscv_v_adjust_bytesize):
>
> gcc/testsuite/ChangeLog:
>
> * gcc.target/riscv/rvv/base/pr108185-1.c:
> * gcc.target/riscv/rvv/base/pr108185-2.c:
> * gcc.target/riscv/rvv/base/pr108185-3.c:
>
> Signed-off-by: Pan Li 
> Co-authored-by: Ju-Zhe Zhong 
> ---
>  gcc/config/riscv/riscv-modes.def  | 14 ++--
>  gcc/config/riscv/riscv.cc | 22 +++
>  gcc/config/riscv/riscv.h  |  1 +
>  .../gcc.target/riscv/rvv/base/pr108185-1.c|  2 +-
>  .../gcc.target/riscv/rvv/base/pr108185-2.c|  2 +-
>  .../gcc.target/riscv/rvv/base/pr108185-3.c|  2 +-
>  6 files changed, 33 insertions(+), 10 deletions(-)
>
> diff --git a/gcc/config/riscv/riscv-modes.def b/gcc/config/riscv/riscv-modes.def
> index 110bddce851..4cf7cf8b1c6 100644
> --- a/gcc/config/riscv/riscv-modes.def
> +++ b/gcc/config/riscv/riscv-modes.def
> @@ -64,13 +64,13 @@ ADJUST_ALIGNMENT (VNx16BI, 1);
>  ADJUST_ALIGNMENT (VNx32BI, 1);
>  ADJUST_ALIGNMENT (VNx64BI, 1);
>
> -ADJUST_BYTESIZE (VNx1BI, riscv_vector_chunks * riscv_bytes_per_vector_chunk);
> -ADJUST_BYTESIZE (VNx2BI, riscv_vector_chunks * riscv_bytes_per_vector_chunk);
> -ADJUST_BYTESIZE (VNx4BI, riscv_vector_chunks * riscv_bytes_per_vector_chunk);
> -ADJUST_BYTESIZE (VNx8BI, riscv_vector_chunks * riscv_bytes_per_vector_chunk);
> -ADJUST_BYTESIZE (VNx16BI, riscv_vector_chunks * riscv_bytes_per_vector_chunk);
> -ADJUST_BYTESIZE (VNx32BI, riscv_vector_chunks * riscv_bytes_per_vector_chunk);
> -ADJUST_BYTESIZE (VNx64BI, riscv_v_adjust_nunits (VNx64BImode, 8));
> +ADJUST_BYTESIZE (VNx1BI, riscv_v_adjust_bytesize (VNx1BImode, 1));
> +ADJUST_BYTESIZE (VNx2BI, riscv_v_adjust_bytesize (VNx2BImode, 1));
> +ADJUST_BYTESIZE (VNx4BI, riscv_v_adjust_bytesize (VNx4BImode, 1));
> +ADJUST_BYTESIZE (VNx8BI, riscv_v_adjust_bytesize (VNx8BImode, 1));
> +ADJUST_BYTESIZE (VNx16BI, riscv_v_adjust_bytesize (VNx16BImode, 2));
> +ADJUST_BYTESIZE (VNx32BI, riscv_v_adjust_bytesize (VNx32BImode, 4));
> +ADJUST_BYTESIZE (VNx64BI, riscv_v_adjust_bytesize (VNx64BImode, 8));
>
>  ADJUST_PRECISION (VNx1BI, riscv_v_adjust_precision (VNx1BImode, 1));
>  ADJUST_PRECISION (VNx2BI, riscv_v_adjust_precision (VNx2BImode, 2));
> diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
> index e7b7d87cebc..428fbb28fae 100644
> --- a/gcc/config/riscv/riscv.cc
> +++ b/gcc/config/riscv/riscv.cc
> @@ -1003,6 +1003,28 @@ riscv_v_adjust_nunits (machine_mode mode, int scale)
>return scale;
>  }
>
> +/* Call from ADJUST_BYTESIZE in riscv-modes.def.  Return the correct
> +   BYTE size for corresponding machi
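
The two lists quoted in the patch description are related by rounding each bit size up to whole bytes. The sketch below only checks that observation about the numbers in the mail; it is not taken from `riscv_v_adjust_bytesize`, whose body is cut off above, and the variable names are hypothetical.

```python
# Bit sizes (mode precision) of the vbool*_t modes as listed in the mail.
precision_bits = [1, 2, 4, 8, 16, 32, 64]

# Round each precision up to a whole number of bytes.
byte_sizes = [(bits + 7) // 8 for bits in precision_bits]

print(byte_sizes)
```

The result matches the second arguments passed to `riscv_v_adjust_bytesize` in the riscv-modes.def hunk (1, 1, 1, 1, 2, 4, 8).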

Re: [PATCH] RISC-V: costs: miscomputed shiftadd_cost triggering synth_mult [PR/108987]

2023-03-23 Thread Kito Cheng via Gcc-patches

Committed 2 weeks ago but apparently I didn't send mail to say that,
thanks Vineet.

On Thu, Mar 2, 2023 at 3:56 AM Philipp Tomsich  wrote:
>
> On Wed, 1 Mar 2023 at 20:53, Vineet Gupta  wrote:
> >
> > This showed up as dynamic icount regression in SPEC 531.deepsjeng with
> > upstream gcc (vs. gcc 12.2). gcc was resorting to synthetic multiply using
> > shift+add(s) even when multiply had clear cost benefit.
> >
> > |000133b8  > .constprop.0]+0x382>:
> > |   133b8:  srl     a3,a1,s6
> > |   133bc:  and     a3,a3,s5
> > |   133c0:  slli    a4,a3,0x9
> > |   133c4:  add     a4,a4,a3
> > |   133c6:  slli    a4,a4,0x9
> > |   133c8:  add     a4,a4,a3
> > |   133ca:  slli    a3,a4,0x1b
> > |   133ce:  add     a4,a4,a3
> >
> > vs. gcc 12 doing something like below.
> >
> > |000131c4  > .constprop.0]+0x35c>:
> > |   131c4:  ld  s1,8(sp)
> > |   131c6:  srl a3,a1,s4
> > |   131ca:  and a3,a3,s11
> > |   131ce:  mul a3,a3,s1
> >
> > Bisected this to f90cb39235c4 ("RISC-V: costs: support shift-and-add in
> > strength-reduction"). The intent was to optimize cost for
> > shift-add-pow2-{1,2,3} corresponding to bitmanip insns SH*ADD, but ended
> > up doing that for all shift values which seems to favor synthesizing
> > multiply among others.
> >
> > The bug itself is trivial: IN_RANGE() was calling pow2p_hwi(), which
> > returns a bool, instead of exact_log2(), which returns the log2 exponent.
> >
> > This fix also requires update to the test introduced by the same commit
> > which now generates MUL vs. synthesizing it.
> >
> > gcc/ChangeLog:
> >
> > * config/riscv/riscv.cc (riscv_rtx_costs): Fixed IN_RANGE() to
> >   use exact_log2().
> >
> > gcc/testsuite/ChangeLog:
> >
> > * gcc.target/riscv/zba-shNadd-07.c: f2(i*783) now generates MUL vs.
> >   5 insn sh1add+slli+add+slli+sub.
> > * gcc.target/riscv/pr108987.c: New test.
> >
> > Signed-off-by: Vineet Gupta 
>
> Reviewed-by: Philipp Tomsich
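
The bool-versus-exponent mix-up described in the quoted message is easy to reproduce outside GCC. The sketch below re-implements the three helpers named in the mail in Python; the wrappers `buggy_is_shnadd` and `fixed_is_shnadd` and their `constant` parameter are hypothetical and only illustrate the range check, not the surrounding code in riscv_rtx_costs.

```python
# Python stand-ins for the helpers named in the mail.

def pow2p_hwi(x):
    """Return True if x is a power of two (a bool, not an exponent)."""
    return x > 0 and (x & (x - 1)) == 0


def exact_log2(x):
    """Return log2(x) if x is a power of two, otherwise -1."""
    return x.bit_length() - 1 if pow2p_hwi(x) else -1


def in_range(value, lower, upper):
    return lower <= value <= upper


def buggy_is_shnadd(constant):
    # True compares equal to 1, so every power of two lands in [1, 3].
    return in_range(pow2p_hwi(constant), 1, 3)


def fixed_is_shnadd(constant):
    # Only shifts by 1, 2 or 3 (constants 2, 4, 8) are accepted.
    return in_range(exact_log2(constant), 1, 3)


for constant in (2, 4, 8, 512, 6):
    print(constant, buggy_is_shnadd(constant), fixed_is_shnadd(constant))
```

For 512 (a shift by 9, as in the `slli a4,a3,0x9` sequence above) the buggy check still passes while the fixed one rejects it, which is consistent with the shift+add synthesis being costed too cheaply.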

Re: [PATCH] tree-vect-generic: Fix up expand_vector_condition [PR109176]

2023-03-23 Thread Richard Biener via Gcc-patches

On Thu, 23 Mar 2023, Jakub Jelinek wrote:

> Hi!
> 
> The following testcase ICEs on aarch64-linux, because
> expand_vector_condition attempts to piecewise lower SVE
>   d_3 = a_1(D) < b_2(D);
>   _5 = VEC_COND_EXPR ;
> which isn't possible - nunits_for_known_piecewise_op ICEs but
> the rest of the code assumes constant number of elements too.
> 
> expand_vector_condition attempts to find if a (rhs1) is a SSA_NAME
> for comparison and calls expand_vec_cond_expr_p (type, TREE_TYPE (a1), code)
> where a1 is one of the operands of the comparison and code is the comparison
> code.  That one indeed isn't supported here, but what aarch64 SVE supports
> are the individual statements, comparison (expand_vec_cmp_expr_p) and
> expand_vec_cond_expr_p (type, TREE_TYPE (a), SSA_NAME), the latter because
> that function starts with
>   if (VECTOR_BOOLEAN_TYPE_P (cmp_op_type)
>   && get_vcond_mask_icode (TYPE_MODE (value_type),
>TYPE_MODE (cmp_op_type)) != CODE_FOR_nothing)
> return true;
> 
> In an earlier version of the patch (in the PR), we did this
>   if (VECTOR_BOOLEAN_TYPE_P (TREE_TYPE (a))
>   && expand_vec_cond_expr_p (type, TREE_TYPE (a), ERROR_MARK))
> return true;
> before the code == SSA_NAME handling plus some further tweaks later.
> While that fixed the ICE, it broke quite a few tests on x86 and some on
> aarch64 too.  The problem is that expand_vector_comparison doesn't lower
> comparisons which aren't supported and only feed VEC_COND_EXPR first operand
> and expand_vector_condition succeeds for those, so with the above mentioned
> change we'd verify the VEC_COND_EXPR is implementable using optab alone,
> but nothing would verify the tcc_comparison which relied on
> expand_vector_condition to verify.

Ah, indeed - all a bit twisty.

> So, the following patch instead queries whether optabs can handle the
> comparison and VEC_COND_EXPR together (if a (rhs1) is a comparison;
> otherwise as before it checks only the VEC_COND_EXPR) and if that fails,
> also checks whether the two operations could be supported individually
> and only if even that fails does the piecewise lowering.
> 
> Bootstrapped/regtested on x86_64-linux, i686-linux and aarch64-linux, ok for
> trunk?

OK.

Thanks for digging into it.

Richard.

> 2023-03-23  Jakub Jelinek  
> 
>   PR tree-optimization/109176
>   * tree-vect-generic.cc (expand_vector_condition): If a has
>   vector boolean type and is a comparison, also check if both
>   the comparison and VEC_COND_EXPR could be successfully expanded
>   individually.
> 
>   * gcc.target/aarch64/sve/pr109176.c: New test.
> 
> --- gcc/tree-vect-generic.cc.jj   2023-03-21 13:28:21.354671095 +0100
> +++ gcc/tree-vect-generic.cc  2023-03-22 12:53:27.853986127 +0100
> @@ -1063,6 +1063,15 @@ expand_vector_condition (gimple_stmt_ite
>return true;
>  }
>  
> +  /* If a has vector boolean type and is a comparison, above
> + expand_vec_cond_expr_p might fail, even if both the comparison and
> + VEC_COND_EXPR could be supported individually.  See PR109176.  */
> +  if (a_is_comparison
> +  && VECTOR_BOOLEAN_TYPE_P (TREE_TYPE (a))
> +  && expand_vec_cond_expr_p (type, TREE_TYPE (a), SSA_NAME)
> +  && expand_vec_cmp_expr_p (TREE_TYPE (a1), TREE_TYPE (a), code))
> +return true;
> +
>/* Handle vector boolean types with bitmasks.  If there is a comparison
>   and we can expand the comparison into the vector boolean bitmask,
>   or otherwise if it is compatible with type, we can transform
> --- gcc/testsuite/gcc.target/aarch64/sve/pr109176.c.jj  2023-03-22 12:19:21.672218631 +0100
> +++ gcc/testsuite/gcc.target/aarch64/sve/pr109176.c   2023-03-22 12:19:21.672218631 +0100
> @@ -0,0 +1,12 @@
> +/* PR tree-optimization/109176 */
> +/* { dg-do compile } */
> +/* { dg-additional-options "-O2" } */
> +
> +#include 
> +
> +svbool_t
> +foo (svint8_t a, svint8_t b, svbool_t c)
> +{
> +  svbool_t d = svcmplt_s8 (svptrue_pat_b8 (SV_ALL), a, b);
> +  return svsel_b (d, c, d);
> +}
> 
>   Jakub
> 
> 

-- 
Richard Biener 
SUSE Software Solutions Germany GmbH, Frankenstrasse 146, 90461 Nuernberg,
Germany; GF: Ivo Totev, Andrew Myers, Andrew McDonald, Boudien Moerman;
HRB 36809 (AG Nuernberg)

Re: [PATCH] libstdc++: use __bool_constant instead of integral_constant

2023-03-23 Thread Jonathan Wakely via Gcc-patches

On Thu, 23 Mar 2023 at 02:06, Ken Matsui via Libstdc++ <
libstd...@gcc.gnu.org> wrote:

> In the type_traits header, both integral_constant and __bool_constant
> are used.


Yes, this is just because we didn't have __bool_constant originally, and
nobody has needed to touch the traits that still use integral_constant, so
they never got updated.



> This patch unifies those usages into __bool_constant.
>

Thanks, doing this for consistency seems reasonable, and safe to do now
instead of waiting until after the GCC 13 release. I'll test and push the
patch.

Do you have a GCC copyright assignment on file with the FSF?
If not, either you need to complete that paperwork, or add a DCO sign-off
to all your patches:
https://gcc.gnu.org/dco.html


>
> libstdc++-v3/ChangeLog:
>
> * include/std/type_traits: Use __bool_constant instead of
> integral_constant.
> ---
>  libstdc++-v3/include/std/type_traits | 32 ++--
>  1 file changed, 16 insertions(+), 16 deletions(-)
>
> diff --git a/libstdc++-v3/include/std/type_traits
> b/libstdc++-v3/include/std/type_traits
> index 2bd607a8b8f..bc6982f9e64 100644
> --- a/libstdc++-v3/include/std/type_traits
> +++ b/libstdc++-v3/include/std/type_traits
> @@ -578,19 +578,19 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
>/// is_enum
>template
>  struct is_enum
> -: public integral_constant
> +: public __bool_constant<__is_enum(_Tp)>
>  { };
>
>/// is_union
>template
>  struct is_union
> -: public integral_constant
> +: public __bool_constant<__is_union(_Tp)>
>  { };
>
>/// is_class
>template
>  struct is_class
> -: public integral_constant
> +: public __bool_constant<__is_class(_Tp)>
>  { };
>
>/// is_function
> @@ -784,7 +784,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
>/// is_trivial
>template
>  struct is_trivial
> -: public integral_constant
> +: public __bool_constant<__is_trivial(_Tp)>
>  {
>
>  static_assert(std::__is_complete_or_unbounded(__type_identity<_Tp>{}),
> "template argument must be a complete class or an unbounded
> array");
> @@ -793,7 +793,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
>/// is_trivially_copyable
>template
>  struct is_trivially_copyable
> -: public integral_constant
> +: public __bool_constant<__is_trivially_copyable(_Tp)>
>  {
>
>  static_assert(std::__is_complete_or_unbounded(__type_identity<_Tp>{}),
> "template argument must be a complete class or an unbounded
> array");
> @@ -802,7 +802,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
>/// is_standard_layout
>template
>  struct is_standard_layout
> -: public integral_constant
> +: public __bool_constant<__is_standard_layout(_Tp)>
>  {
>
>  static_assert(std::__is_complete_or_unbounded(__type_identity<_Tp>{}),
> "template argument must be a complete class or an unbounded
> array");
> @@ -817,7 +817,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
>  struct
>  _GLIBCXX20_DEPRECATED_SUGGEST("is_standard_layout && is_trivial")
>  is_pod
> -: public integral_constant
> +: public __bool_constant<__is_pod(_Tp)>
>  {
>
>  static_assert(std::__is_complete_or_unbounded(__type_identity<_Tp>{}),
> "template argument must be a complete class or an unbounded
> array");
> @@ -831,7 +831,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
>  struct
>  _GLIBCXX17_DEPRECATED
>  is_literal_type
> -: public integral_constant
> +: public __bool_constant<__is_literal_type(_Tp)>
>  {
>
>  static_assert(std::__is_complete_or_unbounded(__type_identity<_Tp>{}),
> "template argument must be a complete class or an unbounded
> array");
> @@ -840,13 +840,13 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
>/// is_empty
>template
>  struct is_empty
> -: public integral_constant
> +: public __bool_constant<__is_empty(_Tp)>
>  { };
>
>/// is_polymorphic
>template
>  struct is_polymorphic
> -: public integral_constant
> +: public __bool_constant<__is_polymorphic(_Tp)>
>  { };
>
>  #if __cplusplus >= 201402L
> @@ -855,14 +855,14 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
>/// @since C++14
>template
>  struct is_final
> -: public integral_constant
> +: public __bool_constant<__is_final(_Tp)>
>  { };
>  #endif
>
>/// is_abstract
>template
>  struct is_abstract
> -: public integral_constant
> +: public __bool_constant<__is_abstract(_Tp)>
>  { };
>
>/// @cond undocumented
> @@ -873,7 +873,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
>
>template
>  struct __is_signed_helper<_Tp, true>
> -: public integral_constant
> +: public __bool_constant<_Tp(-1) < _Tp(0)>
>  { };
>/// @endcond
>
> @@ -1333,7 +1333,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
>/// has_virtual_destructor
>template
>  struct has_virtual_destructor
> -: public integral_constant
> +: public __bool_constant<__has_virtual_destructor(_Tp)>
>  {
>
>

[Patch,v4] Fortran/OpenMP: Fix mapping of array descriptors and deferred-length strings

2023-03-23 Thread Tobias Burnus


[GCC 13 vs GCC 14]
I am unsure whether this should still go to GCC 13 or not. It is somewhat larger
albeit well contained (Fortran, only OpenMP, ...) and fixes real-world bugs,
but it is not a regression - and we are meanwhile slowly approaching the
release.

An alternative would be to go to GCC 14 and then eventually be backported to
GCC 13 (albeit I am not sure whether early testing would be better). Or just to
GCC 14, hmm.

Thoughts?

* * *

Another update - fixing an independent issue which makes sense to be part of
this patch.

Allocatable/pointer scalars are currently mapped as:

 #pragma omp target enter data map(to:*var.1_1 [len: 4]) map(alloc:var [pointer assign, bias: 0])
 #pragma omp target exit data map(from:*var.2_2 [len: 4])

where 'GOMP_MAP_POINTER' is removed in gimplify.cc. In v3 (and v4) of this
patch, this kind of handling moved from gimplify.cc to fortran/trans-openmp.cc;
however, v3 has the same problem. For allocatable arrays, we have PSET + POINTER
and the PSET part is changed/set to RELEASE/DELETE for 'exit data'.

But for scalars, the map was still left on the stack. Besides having a stale
map, this could lead to failures when the stack was popped, especially when
attempting to later map another stack variable with the same stack address,
partially overlapping with the stale POINTER.

Side remark:
I found this with a testcase that is part of an upcoming deep-mapping follow-up
patch; that test failed with -O1 but worked with -O0/-Og due to changed stack
usage.
(Deep-mapping of allocatable components is on the OG12 branch; it is scheduled
for mainline integration after stage1 opened.)


The updated mainline patch is included; map-10.f90 is the new testcase.
If anyone wants to see it separately, the patch has been committed to OG12 as
https://gcc.gnu.org/g:8ea805840200f7dfd2c11b37abf5fbfe479c2fe2

Comments/thoughts/remarks to this patch?

Tobias

PS: For the rest of the patch, see the short description below, or the longer
remarks earlier in this thread.

On 27.02.23 13:15, Tobias Burnus wrote:

And another re-diff for GCC 13/mainline, updating gcc/testsuite/

(The last change is related to the "[OG12,committed] Update dg-dump-scan
for ..." discussion + OG12 https://gcc.gnu.org/g:e4de87a2309 /
https://gcc.gnu.org/pipermail/gcc-patches/2023-February/612871.html )

On 23.02.23 17:42, Tobias Burnus wrote:

On 21.02.23 12:57, Tobias Burnus wrote:

This patch moves some generic code for Fortran out of gimplify.cc
to trans-openmp.cc and fixes several issues related to mapping.

Tested with nvptx offloading.
OK for mainline?

Tobias

Caveats:

Besides the issues shown in the commented-out code, there also remains an
issue with implicit mapping - at least for deferred-length strings,
but I wouldn't be surprised if - at least depending on the used
'defaultmap' value (e.g. 'alloc') - there are also issues with array
descriptors.

Note:

Regarding the declare target check for mapping: Without declare
target, my assumption is that the hidden length variable will
get implicitly mapped if needed. Independent of deferred-length
or not, there is probably an issue with 'defaultmap(none)' and
the hidden variable. - In any case, I prefer to defer all those
issues to later (by having them captured in one/several PR).


Tobias

PS: This patch is a follow up to
  [Patch] Fortran/OpenMP: Fix DT struct-component with 'alloc' and array descr
https://gcc.gnu.org/pipermail/gcc-patches/2022-November/604887.html
which fixed part of the problems. But as discussed on IRC, it did treat 'alloc'
as special and missed some other map types. - In addition, this patch has a
much extended test coverage and fixes some more issues found that way.

Fortran/OpenMP: Fix mapping of array descriptors and deferred-length strings

Previously, array descriptors might have been mapped as 'alloc'
instead of 'to' for 'alloc', not updating the array bounds. The
'alloc' could also appear for 'data exit', failing with a libgomp
assert. In some cases, either array descriptors or deferred-length
string's length variable was not mapped. And, finally, some offset
calculations with array-sections mappings went wrong.

Additionally, the patch now unmaps for scalar allocatables/pointers
the GOMP_MAP_POINTER, avoiding stale mappings.

The testcases contain some commented-out tests which require follow-up
work and for which PRs exist. Those mostly relate to deferred-length
strings which have several issues beyond OpenMP support.

gcc/fortran/ChangeLog:

	* trans-decl.cc (gfc_get_symbol_decl): Add attributes
	such as 'declare target' also to hidden artificial
	variable for deferred-length character variables.
	* trans-openmp.cc (gfc_trans_omp_array_section,
	gfc_trans_omp_clauses, gfc_

RE: [PATCH] RISC-V: Bugfix for rvv bool mode size adjustment

2023-03-23 Thread Li, Pan2 via Gcc-patches

Great! Thank you ;)

-Original Message-
From: Kito Cheng  
Sent: Thursday, March 23, 2023 4:41 PM
To: Li, Pan2 
Cc: gcc-patches@gcc.gnu.org; juzhe.zh...@rivai.ai; kito.ch...@sifive.com
Subject: Re: [PATCH] RISC-V: Bugfix for rvv bool mode size adjustment

committed, thanks for the reminder :)

On Mon, Mar 13, 2023 at 9:40 AM Li, Pan2 via Gcc-patches 
 wrote:
>
> Kind reminder for this PR. Thank you all in advance.
>
> Pan
>
> -Original Message-
> From: Li, Pan2
> Sent: Wednesday, March 8, 2023 7:31 PM
> To: gcc-patches@gcc.gnu.org
> Cc: juzhe.zh...@rivai.ai; kito.ch...@sifive.com
> Subject: RE: [PATCH] RISC-V: Bugfix for rvv bool mode size adjustment
>
> Completed the regression test and the RISC-V backend test without any 
> surprise.
>
> Pan
>
> -Original Message-
> From: Li, Pan2 
> Sent: Wednesday, March 8, 2023 3:34 PM
> To: gcc-patches@gcc.gnu.org
> Cc: juzhe.zh...@rivai.ai; kito.ch...@sifive.com; rguent...@suse.de; 
> Li, Pan2 
> Subject: [PATCH] RISC-V: Bugfix for rvv bool mode size adjustment
>
> From: yes 
>
> Fix the rvv bool mode size bug by adjusting the mode size.
> Besides the mode precision (aka bit size [1, 2, 4, 8, 16, 32, 64]) of the 
> vbool*_t, the mode size (aka byte size) will be adjusted to [1, 1, 1, 1, 2, 
> 4, 8] according to the rvv spec 1.0 isa. The adjustment will provide correct 
> information for the underlying redundant instruction elimination.
>
> Given the below sample code:
> {
>   vbool1_t v1 = *(vbool1_t*)in;
>   vbool64_t v2 = *(vbool64_t*)in;
>
>   *(vbool1_t*)(out + 100) = v1;
>   *(vbool64_t*)(out + 200) = v2;
> }
>
> Before the size adjustment:
> csrr    t0,vlenb
> slli    t1,t0,1
> csrr    a3,vlenb
> sub     sp,sp,t1
> slli    a4,a3,1
> add     a4,a4,sp
> addi    a2,a1,100
> vsetvli a5,zero,e8,m8,ta,ma
> sub     a3,a4,a3
> vlm.v   v24,0(a0)
> vsm.v   v24,0(a2)
> vsm.v   v24,0(a3)
> addi    a1,a1,200
> csrr    t0,vlenb
> vsetvli a4,zero,e8,mf8,ta,ma
> slli    t1,t0,1
> vlm.v   v24,0(a3)
> vsm.v   v24,0(a1)
> add     sp,sp,t1
> jr      ra
>
> After the size adjustment:
> addi    a3,a1,100
> vsetvli a4,zero,e8,m8,ta,ma
> addi    a1,a1,200
> vlm.v   v24,0(a0)
> vsm.v   v24,0(a3)
> vsetvli a5,zero,e8,mf8,ta,ma
> vlm.v   v24,0(a0)
> vsm.v   v24,0(a1)
> ret
>
> Additionally, the size adjustment cannot cover all possible combinations of 
> the vbool*_t code pattern like above. We will look into it in other 
> patches.
>
> PR 108185
> PR 108654
>
> gcc/ChangeLog:
>
> * config/riscv/riscv-modes.def (ADJUST_BYTESIZE):
> * config/riscv/riscv.cc (riscv_v_adjust_bytesize):
> * config/riscv/riscv.h (riscv_v_adjust_bytesize):
>
> gcc/testsuite/ChangeLog:
>
> * gcc.target/riscv/rvv/base/pr108185-1.c:
> * gcc.target/riscv/rvv/base/pr108185-2.c:
> * gcc.target/riscv/rvv/base/pr108185-3.c:
>
> Signed-off-by: Pan Li 
> Co-authored-by: Ju-Zhe Zhong 
> ---
>  gcc/config/riscv/riscv-modes.def  | 14 ++--
>  gcc/config/riscv/riscv.cc | 22 +++
>  gcc/config/riscv/riscv.h  |  1 +
>  .../gcc.target/riscv/rvv/base/pr108185-1.c|  2 +-
>  .../gcc.target/riscv/rvv/base/pr108185-2.c|  2 +-
>  .../gcc.target/riscv/rvv/base/pr108185-3.c|  2 +-
>  6 files changed, 33 insertions(+), 10 deletions(-)
>
> diff --git a/gcc/config/riscv/riscv-modes.def 
> b/gcc/config/riscv/riscv-modes.def
> index 110bddce851..4cf7cf8b1c6 100644
> --- a/gcc/config/riscv/riscv-modes.def
> +++ b/gcc/config/riscv/riscv-modes.def
> @@ -64,13 +64,13 @@ ADJUST_ALIGNMENT (VNx16BI, 1);
>  ADJUST_ALIGNMENT (VNx32BI, 1);
>  ADJUST_ALIGNMENT (VNx64BI, 1);
>
> -ADJUST_BYTESIZE (VNx1BI, riscv_vector_chunks * riscv_bytes_per_vector_chunk);
> -ADJUST_BYTESIZE (VNx2BI, riscv_vector_chunks * riscv_bytes_per_vector_chunk);
> -ADJUST_BYTESIZE (VNx4BI, riscv_vector_chunks * riscv_bytes_per_vector_chunk);
> -ADJUST_BYTESIZE (VNx8BI, riscv_vector_chunks * riscv_bytes_per_vector_chunk);
> -ADJUST_BYTESIZE (VNx16BI, riscv_vector_chunks * riscv_bytes_per_vector_chunk);
> -ADJUST_BYTESIZE (VNx32BI, riscv_vector_chunks * riscv_bytes_per_vector_chunk);
> -ADJUST_BYTESIZE (VNx64BI, riscv_v_adjust_nunits (VNx64BImode, 8));
> +ADJUST_BYTESIZE (VNx1BI, riscv_v_adjust_bytesize (VNx1BImode, 1));
> +ADJUST_BYTESIZE (VNx2BI, riscv_v_adjust_bytesize (VNx2BImode, 1));
> +ADJUST_BYTESIZE (VNx4BI, riscv_v_adjust_bytesize (VNx4BImode, 1));
> +ADJUST_BYTESIZE (VNx8BI, riscv_v_adjust_bytesize (VNx8BImode, 1));
> +ADJUST_BYTESIZE (VNx16BI, riscv_v_adjust_bytesize (VNx16BImode, 2));
> +ADJUST_BYTESIZE (VNx32BI, riscv_v_adjust_bytesize (VNx32BImode, 4));
> +ADJUST_BYTESIZE (VNx64BI, riscv_v_adjust_bytesize (VNx64BImode, 8));
>
>  ADJUST_PRECISION (VNx1BI, riscv_v_adjust_precision (VNx1BImode, 1));
>  ADJUST_PRECISION (VNx2BI, riscv_v_adjust_precision (VNx2BImode, 2));
> diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc 
> index e7b7d87cebc..428fbb28fae 100644
> --- a/gcc/
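
The precision-to-size mapping quoted in the message above ([1, 2, 4, 8, 16, 32, 64] bits to [1, 1, 1, 1, 2, 4, 8] bytes) is just a round-up to whole bytes. The sketch below only illustrates that arithmetic; mask_bytes is a hypothetical helper, not a GCC function, and it does not model how riscv_v_adjust_bytesize scales with the actual vector length.

```cpp
#include <cstddef>

// Hypothetical helper (not part of GCC): number of whole bytes needed
// to hold a mask whose precision is given in bits.
constexpr std::size_t mask_bytes(std::size_t precision_bits) {
    return (precision_bits + 7) / 8;  // round up to a whole byte
}

// The seven precisions listed in the message and the byte sizes it quotes.
static_assert(mask_bytes(1) == 1, "1 bit fits in 1 byte");
static_assert(mask_bytes(2) == 1, "2 bits fit in 1 byte");
static_assert(mask_bytes(4) == 1, "4 bits fit in 1 byte");
static_assert(mask_bytes(8) == 1, "8 bits fit in 1 byte");
static_assert(mask_bytes(16) == 2, "16 bits need 2 bytes");
static_assert(mask_bytes(32) == 4, "32 bits need 4 bytes");
static_assert(mask_bytes(64) == 8, "64 bits need 8 bytes");
```

With the sizes rounded this way, a vbool64_t access covers far fewer bytes than a vbool1_t access, which appears to be what lets the two loads in the sample stop aliasing through a stack slot.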

[PING] Re: [PATCH 1/2] ivopts: Revert computation of address cost complexity.

2023-03-23 Thread Jovan Dmitrovic

Ping for patch from December 2022: 
https://gcc.gnu.org/pipermail/gcc-patches/2022-December/608895.html

[PATCH] ipa: Avoid another ICE when dealing with type-incompatibilities (PR 108959)

2023-03-23 Thread Martin Jambor

Hi,

PR 108959 shows one more example where undefined code with type
incompatible accesses to stuff passed in parameters can cause an ICE
because we try to create a VIEW_CONVERT_EXPR of mismatching sizes:

1. IPA-CP tries to push one type from above,

2. IPA-SRA (correctly) decides the parameter is useless because it is
   only used to construct an argument to another function which does not
   use it and so the formal parameter should be removed,

3. but the code reconciling IPA-CP and IPA-SRA transformations still
   wants to perform the IPA-CP and overrides the built-in DCE of
   useless statements and tries to stuff constants into them
   instead, constants of a type with mismatching type and size.

This patch avoids the situation in IPA-SRA by purging the IPA-CP
results from any "aggregate" constants that are passed in parameters
which are detected to be useless.  It also removes IPA value range and
bits information associated with removed parameters stored in the same
structure so that the useless information is not streamed later on.

Bootstrapped and LTO-bootstrapped and tested on x86_64-linux.  OK for
trunk?

Thanks,

Martin



gcc/ChangeLog:

2023-03-22  Martin Jambor  

PR ipa/108959
* ipa-sra.cc (zap_useless_ipcp_results): New function.
(process_isra_node_results): Call it.

gcc/testsuite/ChangeLog:

2023-03-17  Martin Jambor  

PR ipa/108959
* gcc.dg/ipa/pr108959.c: New test.
---
 gcc/ipa-sra.cc  | 66 +
 gcc/testsuite/gcc.dg/ipa/pr108959.c | 22 ++
 2 files changed, 88 insertions(+)
 create mode 100644 gcc/testsuite/gcc.dg/ipa/pr108959.c

diff --git a/gcc/ipa-sra.cc b/gcc/ipa-sra.cc
index 3de7d426b7e..7b8260bc9e1 100644
--- a/gcc/ipa-sra.cc
+++ b/gcc/ipa-sra.cc
@@ -4028,6 +4028,70 @@ mark_callers_calls_comdat_local (struct cgraph_node *node, void *)
   return false;
 }
 
+/* Remove any IPA-CP results stored in TS that are associated with removed
+   parameters as marked in IFS. */
+
+static void
+zap_useless_ipcp_results (const isra_func_summary *ifs, ipcp_transformation *ts)
+{
+  unsigned ts_len = vec_safe_length (ts->m_agg_values);
+
+  if (ts_len == 0)
+return;
+
+  bool removed_item = false;
+  unsigned dst_index = 0;
+
+  for (unsigned i = 0; i < ts_len; i++)
+{
+  ipa_argagg_value *v = &(*ts->m_agg_values)[i];
+  const isra_param_desc *desc = &(*ifs->m_parameters)[v->index];
+
+  if (!desc->locally_unused)
+   {
+ if (removed_item)
+   (*ts->m_agg_values)[dst_index] = *v;
+ dst_index++;
+   }
+  else
+   removed_item = true;
+}
+  if (dst_index == 0)
+{
+  ggc_free (ts->m_agg_values);
+  ts->m_agg_values = NULL;
+}
+  else if (removed_item)
+ts->m_agg_values->truncate (dst_index);
+
+  bool useful_bits = false;
+  unsigned count = vec_safe_length (ts->bits);
+  for (unsigned i = 0; i < count; i++)
+if ((*ts->bits)[i])
+{
+  const isra_param_desc *desc = &(*ifs->m_parameters)[i];
+  if (desc->locally_unused)
+   (*ts->bits)[i] = NULL;
+  else
+   useful_bits = true;
+}
+  if (!useful_bits)
+ts->bits = NULL;
+
+  bool useful_vr = false;
+  count = vec_safe_length (ts->m_vr);
+  for (unsigned i = 0; i < count; i++)
+if ((*ts->m_vr)[i].known)
+  {
+   const isra_param_desc *desc = &(*ifs->m_parameters)[i];
+   if (desc->locally_unused)
+ (*ts->m_vr)[i].known = false;
+   else
+ useful_vr = true;
+  }
+  if (!useful_vr)
+ts->m_vr = NULL;
+}
 
 /* Do final processing of results of IPA propagation regarding NODE, clone it
if appropriate.  */
@@ -4080,6 +4144,8 @@ process_isra_node_results (cgraph_node *node,
 }
 
   ipcp_transformation *ipcp_ts = ipcp_get_transformation_summary (node);
+  if (ipcp_ts)
+zap_useless_ipcp_results (ifs, ipcp_ts);
   vec *new_params = NULL;
   if (ipa_param_adjustments *old_adjustments
 = cinfo ? cinfo->param_adjustments : NULL)
diff --git a/gcc/testsuite/gcc.dg/ipa/pr108959.c b/gcc/testsuite/gcc.dg/ipa/pr108959.c
new file mode 100644
index 000..cd1f88658ef
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/ipa/pr108959.c
@@ -0,0 +1,22 @@
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+
+union U2 {
+  long f0;
+  int f1;
+};
+int g_16;
+int g_70[20];
+static int func_61(int) {
+  for (;;)
+g_70[g_16] = 4;
+}
+static int func_43(int *p_44)
+{
+  func_61(*p_44);
+}
+int main() {
+  union U2 l_38 = {9};
+  int *l_49 = (int *) &l_38;
+  func_43(l_49);
+}
-- 
2.40.0
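
The first loop of zap_useless_ipcp_results in the patch above is an in-place, order-preserving compaction: surviving entries are copied down to a write index only once something has been dropped. Below is a minimal stand-alone sketch of that pattern. AggValue and drop_unused are hypothetical stand-ins for ipa_argagg_value and the GCC routine, and std::vector replaces GCC's own vec type.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for ipa_argagg_value: only the parameter index
// matters for the filtering decision.
struct AggValue {
    unsigned index;  // which formal parameter the constant belongs to
    int value;       // payload, irrelevant to the filtering itself
};

// Remove every entry whose parameter is marked unused, keeping the
// relative order of the rest. Returns true if anything was removed.
bool drop_unused(std::vector<AggValue>& vals,
                 const std::vector<bool>& locally_unused) {
    bool removed = false;
    std::size_t dst = 0;  // next slot to write a surviving entry into
    for (std::size_t i = 0; i < vals.size(); ++i) {
        assert(vals[i].index < locally_unused.size());
        if (!locally_unused[vals[i].index]) {
            if (removed)          // copy only once a gap has opened
                vals[dst] = vals[i];
            ++dst;
        } else {
            removed = true;
        }
    }
    vals.erase(vals.begin() + static_cast<std::ptrdiff_t>(dst), vals.end());
    return removed;
}
```

The patch additionally frees the vector when nothing survives; the sketch simply leaves an empty vector in that case.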

[PING] Re: [PATCH 2/2] ivopts: Revert register pressure cost when there are enough registers.

2023-03-23 Thread Jovan Dmitrovic

Ping for patch from December 2022: 
https://gcc.gnu.org/pipermail/gcc-patches/2022-December/608896.html

Re: [PATCH] libstdc++: use __bool_constant instead of integral_constant

2023-03-23 Thread Ken Matsui via Gcc-patches

Thank you so much for your review!

This is my first time contributing to GCC, so I do not have a GCC
copyright assignment. I googled those two ways, but I am still
confused... Is it correct that the DCO sign-off has been getting more
common recently? If so, I will put the sign-off into all my patches. I
would prefer to choose the more common and lightweight way.

Sincerely,
Ken Matsui

On Thu, Mar 23, 2023 at 2:28 AM Jonathan Wakely  wrote:
>
>
>
> On Thu, 23 Mar 2023 at 02:06, Ken Matsui via Libstdc++ 
>  wrote:
>>
>> In the type_traits header, both integral_constant and __bool_constant
>> are used.
>
>
> Yes, this is just because we didn't have __bool_constant originally, and 
> nobody has needed to touch the traits that still use integral_constant, so 
> they never got updated.
>
>
>>
>> This patch unifies those usages into __bool_constant.
>
>
> Thanks, doing this for consistency seems reasonable, and safe to do now 
> instead of waiting until after the GCC 13 release. I'll test and push the 
> patch.
>
> Do you have a GCC copyright assignment on file with the FSF?
> If not, either you need to complete that paperwork, or add a DCO sign-off to 
> all your patches:
> https://gcc.gnu.org/dco.html
>
>>
>>
>> libstdc++-v3/ChangeLog:
>>
>> * include/std/type_traits: Use __bool_constant instead of
>> integral_constant.
>> ---
>>  libstdc++-v3/include/std/type_traits | 32 ++--
>>  1 file changed, 16 insertions(+), 16 deletions(-)
>>
>> diff --git a/libstdc++-v3/include/std/type_traits 
>> b/libstdc++-v3/include/std/type_traits
>> index 2bd607a8b8f..bc6982f9e64 100644
>> --- a/libstdc++-v3/include/std/type_traits
>> +++ b/libstdc++-v3/include/std/type_traits
>> @@ -578,19 +578,19 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
>>/// is_enum
>>template
>>  struct is_enum
>> -: public integral_constant
>> +: public __bool_constant<__is_enum(_Tp)>
>>  { };
>>
>>/// is_union
>>template
>>  struct is_union
>> -: public integral_constant
>> +: public __bool_constant<__is_union(_Tp)>
>>  { };
>>
>>/// is_class
>>template
>>  struct is_class
>> -: public integral_constant
>> +: public __bool_constant<__is_class(_Tp)>
>>  { };
>>
>>/// is_function
>> @@ -784,7 +784,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
>>/// is_trivial
>>template
>>  struct is_trivial
>> -: public integral_constant
>> +: public __bool_constant<__is_trivial(_Tp)>
>>  {
>>static_assert(std::__is_complete_or_unbounded(__type_identity<_Tp>{}),
>> "template argument must be a complete class or an unbounded array");
>> @@ -793,7 +793,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
>>/// is_trivially_copyable
>>template
>>  struct is_trivially_copyable
>> -: public integral_constant
>> +: public __bool_constant<__is_trivially_copyable(_Tp)>
>>  {
>>static_assert(std::__is_complete_or_unbounded(__type_identity<_Tp>{}),
>> "template argument must be a complete class or an unbounded array");
>> @@ -802,7 +802,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
>>/// is_standard_layout
>>template
>>  struct is_standard_layout
>> -: public integral_constant
>> +: public __bool_constant<__is_standard_layout(_Tp)>
>>  {
>>static_assert(std::__is_complete_or_unbounded(__type_identity<_Tp>{}),
>> "template argument must be a complete class or an unbounded array");
>> @@ -817,7 +817,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
>>  struct
>>  _GLIBCXX20_DEPRECATED_SUGGEST("is_standard_layout && is_trivial")
>>  is_pod
>> -: public integral_constant
>> +: public __bool_constant<__is_pod(_Tp)>
>>  {
>>static_assert(std::__is_complete_or_unbounded(__type_identity<_Tp>{}),
>> "template argument must be a complete class or an unbounded array");
>> @@ -831,7 +831,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
>>  struct
>>  _GLIBCXX17_DEPRECATED
>>  is_literal_type
>> -: public integral_constant
>> +: public __bool_constant<__is_literal_type(_Tp)>
>>  {
>>static_assert(std::__is_complete_or_unbounded(__type_identity<_Tp>{}),
>> "template argument must be a complete class or an unbounded array");
>> @@ -840,13 +840,13 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
>>/// is_empty
>>template
>>  struct is_empty
>> -: public integral_constant
>> +: public __bool_constant<__is_empty(_Tp)>
>>  { };
>>
>>/// is_polymorphic
>>template
>>  struct is_polymorphic
>> -: public integral_constant
>> +: public __bool_constant<__is_polymorphic(_Tp)>
>>  { };
>>
>>  #if __cplusplus >= 201402L
>> @@ -855,14 +855,14 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
>>/// @since C++14
>>template
>>  struct is_final
>> -: public integral_constant
>> +: public __bool_constant<__is_final(_Tp)>
>>  { };
>>  #endif
>>
>>/// is_abstract
>>template
>>  struct is_abstract
>> -: publ

[PATCH] rtl-optimization: ppc backend generates unnecessary signed extension.

2023-03-23 Thread Ajit Agarwal via Gcc-patches



Hello All:

This patch eliminates unnecessary sign extensions in the ree pass.
Bootstrapped and regtested on powerpc64-linux-gnu.


Thanks & Regards
Ajit

rtl-optimization: ppc backend generates unnecessary signed extension.

Eliminate redundant sign extensions.

2023-03-23  Ajit Kumar Agarwal  

gcc/ChangeLog:

* ree.cc: Add AND opcode support to eliminate
unnecessary sign extensions.
* testsuite/g++.target/powerpc/sext-elim.C: New tests.
---
 gcc/ree.cc   | 24 +---
 gcc/testsuite/g++.target/powerpc/sext-elim.C | 19 
 2 files changed, 40 insertions(+), 3 deletions(-)
 create mode 100644 gcc/testsuite/g++.target/powerpc/sext-elim.C

diff --git a/gcc/ree.cc b/gcc/ree.cc
index d09f55149b1..63d8cf9f237 100644
--- a/gcc/ree.cc
+++ b/gcc/ree.cc
@@ -364,6 +364,7 @@ combine_set_extension (ext_cand *cand, rtx_insn *curr_insn, rtx *orig_set)
   rtx simplified_temp_extension = simplify_rtx (temp_extension);
   if (simplified_temp_extension)
 temp_extension = simplified_temp_extension;
+
   new_set = gen_rtx_SET (new_reg, temp_extension);
 }
   else if (GET_CODE (orig_src) == IF_THEN_ELSE)
@@ -375,11 +376,21 @@ combine_set_extension (ext_cand *cand, rtx_insn *curr_insn, rtx *orig_set)
   else
 {
   /* This is the normal case.  */
-  rtx temp_extension
-   = gen_rtx_fmt_e (cand->code, cand->mode, orig_src);
+  rtx temp_extension = NULL_RTX;
+
+  if (GET_CODE (SET_SRC (cand_pat)) == AND)
+   temp_extension
+   = gen_rtx_fmt_ee (cand->code, cand->mode,orig_src,
+ XEXP (SET_SRC (cand_pat), 1));
+  else
+   temp_extension
+   = gen_rtx_fmt_e (cand->code, cand->mode,orig_src);
+
   rtx simplified_temp_extension = simplify_rtx (temp_extension);
+
   if (simplified_temp_extension)
 temp_extension = simplified_temp_extension;
+
   new_set = gen_rtx_SET (new_reg, temp_extension);
 }
 
@@ -1047,7 +1058,14 @@ combine_reaching_defs (ext_cand *cand, const_rtx set_pat, ext_state *state)
 cannot be merged, we entirely give up.  In the future, we should allow
 extensions to be partially eliminated along those paths where the
 definitions could be merged.  */
-  if (apply_change_group ())
+   int num_clobbers = 0;
+   int icode = recog (cand->insn, cand->insn,
+ (GET_CODE (cand->expr) == SET
+  && ! reload_completed
+  && ! reload_in_progress)
+  ? &num_clobbers : 0);
+
+  if (apply_change_group () || (icode < 0))
 {
   if (dump_file)
 fprintf (dump_file, "All merges were successful.\n");
diff --git a/gcc/testsuite/g++.target/powerpc/sext-elim.C b/gcc/testsuite/g++.target/powerpc/sext-elim.C
new file mode 100644
index 000..1180b9ce268
--- /dev/null
+++ b/gcc/testsuite/g++.target/powerpc/sext-elim.C
@@ -0,0 +1,19 @@
+/* { dg-do compile { target { powerpc*-*-* } } } */
+/* { dg-require-effective-target lp64 } */
+/* { dg-require-effective-target powerpc_p9vector_ok } */
+/* { dg-options "-mcpu=power9 -O2 -free" } */
+
+unsigned long c2l(unsigned char* p)
+{
+  unsigned long res = *p + *(p+1);
+  return res;
+}
+
+long c2sl(signed char* p)
+{
+  long res = *p + *(p+1);
+  return res;
+}
+
+/* { dg-final { scan-assembler-not "rldicl" } } */
+/* { dg-final { scan-assembler-not "extsw" } } */
-- 
2.31.1

Re: [PATCH] libstdc++: use __bool_constant instead of integral_constant

2023-03-23 Thread Ville Voutilainen via Gcc-patches

On Thu, 23 Mar 2023 at 12:18, Ken Matsui via Libstdc++
 wrote:
>
> Thank you so much for your review!
>
> This is my first time contributing to GCC, so I do not have a GCC
> copyright assignment. I googled those two ways, but I am still
> confused... Is it correct that the DCO sign-off has been getting more
> common recently? If so, I will put the sign-off into all my patches. I
> would prefer to choose the more common and lightweight way.

DCO sign-off is indeed more light-weight, and sure, it's becoming more common
since it's relatively new as an option.

Re: [PATCH] libstdc++: use __bool_constant instead of integral_constant

2023-03-23 Thread Ken Matsui via Gcc-patches

On Thu, Mar 23, 2023 at 3:46 AM Ville Voutilainen
 wrote:
>
> On Thu, 23 Mar 2023 at 12:18, Ken Matsui via Libstdc++
>  wrote:
> >
> > Thank you so much for your review!
> >
> > This is my first time contributing to GCC, so I do not have a GCC
> > copyright assignment. I googled those two ways, but I am still
> > confused... Is it correct that the DCO sign-off has been getting more
> > common recently? If so, I will put the sign-off into all my patches. I
> > would prefer to choose the more common and lightweight way.
>
> DCO sign-off is indeed more light-weight, and sure, it's becoming more common
> since it's relatively new as an option.

Thank you!

To add a DCO sign-off, do I need to bump up the subject line to [PATCH v2]?

Re: [PATCH] libstdc++: use __bool_constant instead of integral_constant

2023-03-23 Thread Ville Voutilainen via Gcc-patches

On Thu, 23 Mar 2023 at 12:53, Ken Matsui  wrote:

> > DCO sign-off is indeed more light-weight, and sure, it's becoming more 
> > common
> > since it's relatively new as an option.
>
> Thank you!
>
> To add a DCO sign-off, do I need to bump up the subject line to [PATCH v2]?

No. The format of the subject for patch submission emails is not that strict. :)

Re: [PATCH] libstdc++: use __bool_constant instead of integral_constant

2023-03-23 Thread Ken Matsui via Gcc-patches

On Thu, Mar 23, 2023 at 3:56 AM Ville Voutilainen
 wrote:
>
> On Thu, 23 Mar 2023 at 12:53, Ken Matsui  wrote:
>
> > > DCO sign-off is indeed more light-weight, and sure, it's becoming more 
> > > common
> > > since it's relatively new as an option.
> >
> > Thank you!
> >
> > To add a DCO sign-off, do I need to bump up the subject line to [PATCH v2]?
>
> No. The format of the subject for patch submission emails is not that strict. 
> :)

I see. I will update this patch. Thank you so much for your support!

Sincerely,
Ken Matsui

[PATCH] libstdc++: use __bool_constant instead of integral_constant

2023-03-23 Thread Ken Matsui via Gcc-patches

In the type_traits header, both integral_constant and __bool_constant
are used. This patch unifies those usages into __bool_constant.

libstdc++-v3/ChangeLog:

* include/std/type_traits: Use __bool_constant instead of
integral_constant.

Signed-off-by: Ken Matsui 
---
 libstdc++-v3/include/std/type_traits | 32 ++--
 1 file changed, 16 insertions(+), 16 deletions(-)

diff --git a/libstdc++-v3/include/std/type_traits b/libstdc++-v3/include/std/type_traits
index 2bd607a8b8f..bc6982f9e64 100644
--- a/libstdc++-v3/include/std/type_traits
+++ b/libstdc++-v3/include/std/type_traits
@@ -578,19 +578,19 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   /// is_enum
   template<typename _Tp>
     struct is_enum
-    : public integral_constant<bool, __is_enum(_Tp)>
+    : public __bool_constant<__is_enum(_Tp)>
     { };
 
   /// is_union
   template<typename _Tp>
     struct is_union
-    : public integral_constant<bool, __is_union(_Tp)>
+    : public __bool_constant<__is_union(_Tp)>
     { };
 
   /// is_class
   template<typename _Tp>
     struct is_class
-    : public integral_constant<bool, __is_class(_Tp)>
+    : public __bool_constant<__is_class(_Tp)>
     { };
 
   /// is_function
@@ -784,7 +784,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   /// is_trivial
   template<typename _Tp>
     struct is_trivial
-    : public integral_constant<bool, __is_trivial(_Tp)>
+    : public __bool_constant<__is_trivial(_Tp)>
     {
       static_assert(std::__is_complete_or_unbounded(__type_identity<_Tp>{}),
         "template argument must be a complete class or an unbounded array");
@@ -793,7 +793,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   /// is_trivially_copyable
   template<typename _Tp>
     struct is_trivially_copyable
-    : public integral_constant<bool, __is_trivially_copyable(_Tp)>
+    : public __bool_constant<__is_trivially_copyable(_Tp)>
     {
       static_assert(std::__is_complete_or_unbounded(__type_identity<_Tp>{}),
         "template argument must be a complete class or an unbounded array");
@@ -802,7 +802,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   /// is_standard_layout
   template<typename _Tp>
     struct is_standard_layout
-    : public integral_constant<bool, __is_standard_layout(_Tp)>
+    : public __bool_constant<__is_standard_layout(_Tp)>
     {
       static_assert(std::__is_complete_or_unbounded(__type_identity<_Tp>{}),
         "template argument must be a complete class or an unbounded array");
@@ -817,7 +817,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
     struct
     _GLIBCXX20_DEPRECATED_SUGGEST("is_standard_layout && is_trivial")
     is_pod
-    : public integral_constant<bool, __is_pod(_Tp)>
+    : public __bool_constant<__is_pod(_Tp)>
     {
       static_assert(std::__is_complete_or_unbounded(__type_identity<_Tp>{}),
         "template argument must be a complete class or an unbounded array");
@@ -831,7 +831,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
     struct
     _GLIBCXX17_DEPRECATED
     is_literal_type
-    : public integral_constant<bool, __is_literal_type(_Tp)>
+    : public __bool_constant<__is_literal_type(_Tp)>
     {
       static_assert(std::__is_complete_or_unbounded(__type_identity<_Tp>{}),
         "template argument must be a complete class or an unbounded array");
@@ -840,13 +840,13 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   /// is_empty
   template<typename _Tp>
     struct is_empty
-    : public integral_constant<bool, __is_empty(_Tp)>
+    : public __bool_constant<__is_empty(_Tp)>
     { };
 
   /// is_polymorphic
   template<typename _Tp>
     struct is_polymorphic
-    : public integral_constant<bool, __is_polymorphic(_Tp)>
+    : public __bool_constant<__is_polymorphic(_Tp)>
     { };
 
 #if __cplusplus >= 201402L
@@ -855,14 +855,14 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   /// @since C++14
   template<typename _Tp>
     struct is_final
-    : public integral_constant<bool, __is_final(_Tp)>
+    : public __bool_constant<__is_final(_Tp)>
     { };
 #endif
 
   /// is_abstract
   template<typename _Tp>
     struct is_abstract
-    : public integral_constant<bool, __is_abstract(_Tp)>
+    : public __bool_constant<__is_abstract(_Tp)>
     { };
 
   /// @cond undocumented
@@ -873,7 +873,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 
   template<typename _Tp>
     struct __is_signed_helper<_Tp, true>
-    : public integral_constant<bool, _Tp(-1) < _Tp(0)>
+    : public __bool_constant<_Tp(-1) < _Tp(0)>
     { };
   /// @endcond
 
@@ -1333,7 +1333,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   /// has_virtual_destructor
   template<typename _Tp>
     struct has_virtual_destructor
-    : public integral_constant<bool, __has_virtual_destructor(_Tp)>
+    : public __bool_constant<__has_virtual_destructor(_Tp)>
     {
       static_assert(std::__is_complete_or_unbounded(__type_identity<_Tp>{}),
         "template argument must be a complete class or an unbounded array");
@@ -1392,7 +1392,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   template<typename _Tp, typename _Up>
     struct is_same
 #ifdef _GLIBCXX_HAVE_BUILTIN_IS_SAME
-    : public integral_constant<bool, __is_same(_Tp, _Up)>
+    : public __bool_constant<__is_same(_Tp, _Up)>
 #else
     : public false_type
 #endif
@@ -1408,7 +1408,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   /// is_base_of
   template<typename _Base, typename _Derived>
     struct is_base_of
-    : public integral_constant<bool, __is_base_of(_Base, _Derived)>
+    : public __bool_constant<__is_base_of(_Base, _Derived)>
     { };
 
 #if __has_builtin(__is_convertible)
-- 
2.40.0
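
The patch above is behavior-preserving because a bool-constant alias names exactly the same type as the spelled-out integral_constant. The following is a minimal sketch of that equivalence, assuming a C++11 compiler; my_bool_constant and my_is_signed_like are hypothetical names used for illustration and are not libstdc++ identifiers.

```cpp
#include <type_traits>

// Hypothetical alias mirroring what an internal bool-constant alias
// provides: a shorter spelling of integral_constant<bool, B>.
template<bool B>
using my_bool_constant = std::integral_constant<bool, B>;

// The alias introduces no new type, so switching a base class from one
// spelling to the other changes nothing observable.
static_assert(std::is_same<my_bool_constant<true>, std::true_type>::value,
              "alias and true_type are the same type");
static_assert(std::is_same<my_bool_constant<false>, std::false_type>::value,
              "alias and false_type are the same type");

// A trait written in the style of the signedness helper in the patch;
// the parentheses keep the '<' comparison from being read as a bracket.
template<typename T>
struct my_is_signed_like : my_bool_constant<(T(-1) < T(0))> { };

static_assert(my_is_signed_like<int>::value, "int can be negative");
static_assert(!my_is_signed_like<unsigned>::value, "unsigned cannot");
static_assert(std::is_base_of<std::true_type, my_is_signed_like<int>>::value,
              "the trait still derives from true_type");
```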

[committed] amdgcn: vec_extract no-op insns

2023-03-23 Thread Andrew Stubbs


This patch adds new pseudo-insns for no-op vector extractions.

These were previously modelled as simple move instructions, but the 
register allocator has unhelpful special handling for these that 
triggered spills to memory. Modelling them as a vec_select does the 
right thing in the register allocator, but now the instruction has to be 
explicitly empty.


This patch has already been committed to the OG12 branch.

Andrew

amdgcn: vec_extract no-op insns

Just using move insn for no-op conversions triggers special move handling in
IRA which declares that subreg of vectors aren't valid and routes everything
through memory.  These patterns make the vec_select explicit and all is well.

gcc/ChangeLog:

* config/gcn/gcn-protos.h (gcn_stepped_zero_int_parallel_p): New.
* config/gcn/gcn-valu.md (V_1REG_ALT): New.
(V_2REG_ALT): New.
(vec_extract_nop): New.
(vec_extract_nop): New.
(vec_extract): Use new patterns.
* config/gcn/gcn.cc (gcn_stepped_zero_int_parallel_p): New.
* config/gcn/predicates.md (ascending_zero_int_parallel): New.

diff --git a/gcc/config/gcn/gcn-protos.h b/gcc/config/gcn/gcn-protos.h
index d7862b21a2a..287ce17d422 100644
--- a/gcc/config/gcn/gcn-protos.h
+++ b/gcc/config/gcn/gcn-protos.h
@@ -75,6 +75,7 @@ extern reg_class gcn_regno_reg_class (int regno);
 extern bool gcn_scalar_flat_address_p (rtx);
 extern bool gcn_scalar_flat_mem_p (rtx);
 extern bool gcn_sgpr_move_p (rtx, rtx);
+extern bool gcn_stepped_zero_int_parallel_p (rtx op, int step);
 extern bool gcn_valid_move_p (machine_mode, rtx, rtx);
 extern rtx gcn_vec_constant (machine_mode, int);
 extern rtx gcn_vec_constant (machine_mode, rtx);
diff --git a/gcc/config/gcn/gcn-valu.md b/gcc/config/gcn/gcn-valu.md
index 787d7709d0d..334b6b0b51c 100644
--- a/gcc/config/gcn/gcn-valu.md
+++ b/gcc/config/gcn/gcn-valu.md
@@ -49,6 +49,13 @@ (define_mode_iterator V_1REG
   V16QI V16HI V16SI V16HF V16SF
   V32QI V32HI V32SI V32HF V32SF
   V64QI V64HI V64SI V64HF V64SF])
+(define_mode_iterator V_1REG_ALT
+ [V2QI V2HI V2SI V2HF V2SF
+  V4QI V4HI V4SI V4HF V4SF
+  V8QI V8HI V8SI V8HF V8SF
+  V16QI V16HI V16SI V16HF V16SF
+  V32QI V32HI V32SI V32HF V32SF
+  V64QI V64HI V64SI V64HF V64SF])
 
 (define_mode_iterator V_INT_1REG
  [V2QI V2HI V2SI
@@ -80,6 +87,13 @@ (define_mode_iterator V_2REG
   V16DI V16DF
   V32DI V32DF
   V64DI V64DF])
+(define_mode_iterator V_2REG_ALT
+ [V2DI V2DF
+  V4DI V4DF
+  V8DI V8DF
+  V16DI V16DF
+  V32DI V32DF
+  V64DI V64DF])
 
 ; Vector modes with native support
 (define_mode_iterator V_noQI
@@ -788,11 +802,36 @@ (define_insn "vec_extract"
(set_attr "exec" "none")
(set_attr "laneselect" "yes")])
 
+(define_insn "vec_extract_nop"
+  [(set (match_operand:V_1REG_ALT 0 "register_operand" "=v,v")
+   (vec_select:V_1REG_ALT
+ (match_operand:V_1REG 1 "register_operand"   " 0,v")
+ (match_operand 2 "ascending_zero_int_parallel" "")))]
+  "MODE_VF (mode) < MODE_VF (mode)
+   && mode == mode"
+  "@
+  ; in-place extract %0
+  v_mov_b32\t%L0, %L1"
+  [(set_attr "type" "vmult")
+   (set_attr "length" "0,8")])
+  
+(define_insn "vec_extract_nop"
+  [(set (match_operand:V_2REG_ALT 0 "register_operand" "=v,v")
+   (vec_select:V_2REG_ALT
+ (match_operand:V_2REG 1 "register_operand"   " 0,v")
+ (match_operand 2 "ascending_zero_int_parallel" "")))]
+  "MODE_VF (mode) < MODE_VF (mode)
+   && mode == mode"
+  "@
+  ; in-place extract %0
+  v_mov_b32\t%L0, %L1\;v_mov_b32\t%H0, %H1"
+  [(set_attr "type" "vmult")
+   (set_attr "length" "0,8")])
+  
 (define_expand "vec_extract"
-  [(set (match_operand:V_ALL_ALT 0 "register_operand")
-   (vec_select:V_ALL_ALT
- (match_operand:V_ALL 1 "register_operand")
- (parallel [(match_operand 2 "immediate_operand")])))]
+  [(match_operand:V_ALL_ALT 0 "register_operand")
+   (match_operand:V_ALL 1 "register_operand")
+   (match_operand 2 "immediate_operand")]
   "MODE_VF (mode) < MODE_VF (mode)
&& mode == mode"
   {
@@ -802,8 +841,12 @@ (define_expand "vec_extract"
 
 if (firstlane == 0)
   {
-   /* A plain move will do.  */
-   tmp = operands[1];
+   rtx parallel = gen_rtx_PARALLEL (mode,
+ rtvec_alloc (numlanes));
+   for (int i = 0; i < numlanes; i++)
+ XVECEXP (parallel, 0, i) = GEN_INT (i);
+   emit_insn (gen_vec_extract_nop
+  (operands[0], operands[1], parallel));
   } else {
 /* FIXME: optimize this by using DPP where available.  */
 
@@ -815,10 +858,10 @@ (define_expand "vec_extract"
tmp

[committed] amdgcn: Fix register size bug

2023-03-23 Thread Andrew Stubbs

This patch fixes a bug in which the function prologue would save more 
registers to the stack than there was space allocated. This would cause 
data corruption when the epilogue restored the registers if a child 
function had overwritten that memory.


The problem was caused by insn constraints that allow vectors to be 
placed in scalar registers. This isn't usually allowed without an 
explicit vec_duplicate in the pattern, but some post-reload splitters do 
it sometimes (hence the size calculation mismatch).


A full fix would add vec_duplicate variants of all the instructions that 
support this, but that's a huge explosion of patterns, and this fix is 
enough for correctness, for now.


This has already been committed to the OG12 branch.

Andrew

amdgcn: Fix register size bug

Fix an issue in which "vectors" of duplicate entries placed in scalar
registers caused the following 63 registers to be marked live, for the
purpose of prologue generation, which resulted in stack corruption.

gcc/ChangeLog:

* config/gcn/gcn.cc (gcn_class_max_nregs): Handle vectors in SGPRs.
(move_callee_saved_registers): Detect the bug condition early.

diff --git a/gcc/config/gcn/gcn.cc b/gcc/config/gcn/gcn.cc
index 5bf88e98083..a7d278cd2f8 100644
--- a/gcc/config/gcn/gcn.cc
+++ b/gcc/config/gcn/gcn.cc
@@ -492,6 +492,15 @@ gcn_class_max_nregs (reg_class_t rclass, machine_mode mode)
 }
   else if (rclass == VCC_CONDITIONAL_REG && mode == BImode)
 return 2;
+
+  /* Vector modes in SGPRs are not supposed to happen (disallowed by
+ gcn_hard_regno_mode_ok), but there are some patterns that have an "Sv"
+ constraint and are used by splitters, post-reload.
+ This ensures that we don't accidentally mark the following 63 scalar
+ registers as "live".  */
+  if (rclass == SGPR_REGS && VECTOR_MODE_P (mode))
+return CEIL (GET_MODE_SIZE (GET_MODE_INNER (mode)), 4);
+
   return CEIL (GET_MODE_SIZE (mode), 4);
 }
 
@@ -3239,6 +3248,10 @@ move_callee_saved_registers (rtx sp, machine_function *offsets,
   emit_insn (move_vectors);
   emit_insn (move_scalars);
 }
+
+  /* This happens when a new register becomes "live" after reload.
+ Check your splitters!  */
+  gcc_assert (offset <= offsets->callee_saves);
 }
 
 /* Generate prologue.  Called from gen_prologue during pro_and_epilogue pass.
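
The "following 63 registers" figure in the message above can be reproduced with the same CEIL arithmetic the patch uses. This is only a worked example under assumed numbers: a 64-lane vector of 4-byte elements and 4-byte scalar registers. ceil_div is a hypothetical helper standing in for GCC's CEIL macro.

```cpp
#include <cstddef>

// Hypothetical equivalent of GCC's CEIL macro: divide, rounding up.
constexpr std::size_t ceil_div(std::size_t a, std::size_t b) {
    return (a + b - 1) / b;
}

// Assumed figures for illustration: 64 lanes, 4-byte elements,
// 4-byte scalar registers.
constexpr std::size_t lanes = 64;
constexpr std::size_t elem_bytes = 4;
constexpr std::size_t reg_bytes = 4;

// Sizing by the whole vector mode claims one register per lane.
constexpr std::size_t whole_vector_regs =
    ceil_div(lanes * elem_bytes, reg_bytes);

// Sizing by the inner (element) mode claims a single register.
constexpr std::size_t inner_mode_regs = ceil_div(elem_bytes, reg_bytes);

static_assert(whole_vector_regs == 64, "whole-mode sizing claims 64 regs");
static_assert(inner_mode_regs == 1, "inner-mode sizing claims 1 reg");
static_assert(whole_vector_regs - inner_mode_regs == 63,
              "the difference is the 63 spuriously live registers");
```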

Re: [PATCH] rtl-optimization: ppc backend generates unnecessary signed extension.

2023-03-23 Thread Peter Bergner via Gcc-patches

On 3/23/23 5:38 AM, Ajit Agarwal wrote:
> This patch removed unnecessary signed extension elimination in ree pass.
> Bootstrapped and regtested on powerpc64-linux-gnu.
> 
> 
> Thanks & Regards
> Ajit
> 
>   rtl-optimization: ppc backend generates unnecessary signed extension.
> 
>   Eliminate unnecessary redundant signed extension.
> 
>   2023-03-23  Ajit Kumar Agarwal  
> 
> gcc/ChangeLog:
> 
>   * ree.cc: Modification for  AND opcode support to eliminate
>   unnecessary signed extension.
>   * testsuite/g++.target/powerpc/sext-elim.C: New tests.

Not a review of the patch, but we talked offline about other bugzillas
regarding unnecessary sign and zero extensions.  Doing a quick scan, I
see the following bugs.  Please have a look at 1) whether these are
still a problem with unpatched trunk, and if they are, 2) whether your
patch fixes them or could fix them.  Thanks.

https://gcc.gnu.org/PR41742
https://gcc.gnu.org/PR65010
https://gcc.gnu.org/PR82940
https://gcc.gnu.org/PR107949

Peter

Fwd: [V5][PATCH 0/2] Handle component_ref to a structure/union field including FAM for builtin_object_size

2023-03-23 Thread Qing Zhao via Gcc-patches

Ping…

Please let me know if you have any further comments on the patch.

thanks.

Qing

Begin forwarded message:

From: Qing Zhao <qing.z...@oracle.com>
Subject: [V5][PATCH 0/2] Handle component_ref to a structure/union field including FAM for builtin_object_size
Date: March 16, 2023 at 5:47:13 PM EDT
To: jos...@codesourcery.com, ja...@redhat.com, san...@codesourcery.com
Cc: rguent...@suse.de, siddh...@gotplt.org, keesc...@chromium.org, gcc-patches@gcc.gnu.org, Qing Zhao <qing.z...@oracle.com>

Hi, Joseph, Jakub and Sandra,

Could you please review this patch and let me know whether it’s ready
for committing into GCC13?

The fix for PR101832 is an important patch for kernel security, so it
would be better to get it into GCC13.

===

This is the 5th version of the patches for PR101832, which fix
builtin_object_size to correctly handle a component_ref to a
structure/union field that includes a flexible array member.

It also includes a documentation update for the GCC extension of embedding
a structure/union with a flexible array member into another structure,
which includes a fix for PR77650.

Compared to the 4th version of the patch, the major changes are:

1. Update the documentation per Sandra's comments and suggestions.
2. Per Richard's suggestion, let the new bit TYPE_INCLUDE_FLEXARRAY
share the same bit as no_named_args_stdarg_p to save space in the IR,
with corresponding changes to support such sharing.
3. Change the code inside tree-object-size.cc to make it cleaner
and easier to understand.

bootstrapped and regression tested on aarch64 and x86.

Okay for commit?

thanks.

Qing

Qing Zhao (2):
 Handle component_ref to a structre/union field including  flexible
   array member [PR101832]
 Update documentation to clarify a GCC extension

gcc/c-family/c.opt                                 |   5 +
gcc/c/c-decl.cc                                    |  19 +++
gcc/doc/extend.texi                                |  45 +-
gcc/lto/lto-common.cc                              |   5 +-
gcc/print-tree.cc                                  |   5 +
.../gcc.dg/builtin-object-size-pr101832.c          | 134 ++
.../gcc.dg/variable-sized-type-flex-array.c        |  31
gcc/tree-core.h                                    |   2 +
gcc/tree-object-size.cc                            |  23 ++-
gcc/tree-streamer-in.cc                            |   5 +-
gcc/tree-streamer-out.cc                           |   5 +-
gcc/tree.h                                         |   7 +-
12 files changed, 280 insertions(+), 6 deletions(-)
create mode 100644 gcc/testsuite/gcc.dg/builtin-object-size-pr101832.c
create mode 100644 gcc/testsuite/gcc.dg/variable-sized-type-flex-array.c

--
2.31.1

Fwd: [V5][PATCH 1/2] Handle component_ref to a structre/union field including flexible array member [PR101832]

2023-03-23 Thread Qing Zhao via Gcc-patches

Ping…

Please let me know if you have any further comments on the patch.

thanks.

Qing


Begin forwarded message:

From: Qing Zhao <qing.z...@oracle.com>
Subject: [V5][PATCH 1/2] Handle component_ref to a structre/union field including flexible array member [PR101832]
Date: March 16, 2023 at 5:47:14 PM EDT
To: jos...@codesourcery.com, ja...@redhat.com, san...@codesourcery.com
Cc: rguent...@suse.de, siddh...@gotplt.org, keesc...@chromium.org, gcc-patches@gcc.gnu.org, Qing Zhao <qing.z...@oracle.com>

As an extension, GCC accepts a struct with a flexible array member
embedded in another struct or union (possibly recursively).
__builtin_object_size should treat such a struct as having flexible size,
per -fstrict-flex-arrays.

gcc/c/ChangeLog:

PR tree-optimization/101832
* c-decl.cc (finish_struct): Set TYPE_INCLUDE_FLEXARRAY for
struct/union type.

gcc/lto/ChangeLog:

PR tree-optimization/101832
* lto-common.cc (compare_tree_sccs_1): Compare bit
TYPE_NO_NAMED_ARGS_STDARG_P or TYPE_INCLUDE_FLEXARRAY properly
for its corresponding type.

gcc/ChangeLog:

PR tree-optimization/101832
* print-tree.cc (print_node): Print new bit type_include_flexarray.
* tree-core.h (struct tree_type_common): Use bit no_named_args_stdarg_p
as type_include_flexarray for RECORD_TYPE or UNION_TYPE.
* tree-object-size.cc (addr_object_size): Handle structure/union type
when it has flexible size.
* tree-streamer-in.cc (unpack_ts_type_common_value_fields): Stream
in bit no_named_args_stdarg_p properly for its corresponding type.
* tree-streamer-out.cc (pack_ts_type_common_value_fields): Stream
out bit no_named_args_stdarg_p properly for its corresponding type.
* tree.h (TYPE_INCLUDE_FLEXARRAY): New macro TYPE_INCLUDE_FLEXARRAY.

gcc/testsuite/ChangeLog:

PR tree-optimization/101832
* gcc.dg/builtin-object-size-pr101832.c: New test.
---
gcc/c/c-decl.cc                                    |  11 ++
gcc/lto/lto-common.cc                              |   5 +-
gcc/print-tree.cc                                  |   5 +
.../gcc.dg/builtin-object-size-pr101832.c          | 134 ++
gcc/tree-core.h                                    |   2 +
gcc/tree-object-size.cc                            |  23 ++-
gcc/tree-streamer-in.cc                            |   5 +-
gcc/tree-streamer-out.cc                           |   5 +-
gcc/tree.h                                         |   7 +-
9 files changed, 192 insertions(+), 5 deletions(-)
create mode 100644 gcc/testsuite/gcc.dg/builtin-object-size-pr101832.c

diff --git a/gcc/c/c-decl.cc b/gcc/c/c-decl.cc
index e537d33f398..14c54809b9d 100644
--- a/gcc/c/c-decl.cc
+++ b/gcc/c/c-decl.cc
@@ -9258,6 +9258,17 @@ finish_struct (location_t loc, tree t, tree fieldlist, tree attributes,
  /* Set DECL_NOT_FLEXARRAY flag for FIELD_DECL x.  */
  DECL_NOT_FLEXARRAY (x) = !is_flexible_array_member_p (is_last_field, x);

+  /* Set TYPE_INCLUDE_FLEXARRAY for the context of x, t.
+ when x is an array and is the last field.  */
+  if (TREE_CODE (TREE_TYPE (x)) == ARRAY_TYPE)
+ TYPE_INCLUDE_FLEXARRAY (t)
+  = is_last_field && flexible_array_member_type_p (TREE_TYPE (x));
+  /* Recursively set TYPE_INCLUDE_FLEXARRAY for the context of x, t
+ when x is an union or record and is the last field.  */
+  else if (RECORD_OR_UNION_TYPE_P (TREE_TYPE (x)))
+ TYPE_INCLUDE_FLEXARRAY (t)
+  = is_last_field && TYPE_INCLUDE_FLEXARRAY (TREE_TYPE (x));
+
  if (DECL_NAME (x)
 || RECORD_OR_UNION_TYPE_P (TREE_TYPE (x)))
saw_named_field = true;
diff --git a/gcc/lto/lto-common.cc b/gcc/lto/lto-common.cc
index 882dd8971a4..9dde7118266 100644
--- a/gcc/lto/lto-common.cc
+++ b/gcc/lto/lto-common.cc
@@ -1275,7 +1275,10 @@ compare_tree_sccs_1 (tree t1, tree t2, tree **map)
  if (AGGREGATE_TYPE_P (t1))
compare_values (TYPE_TYPELESS_STORAGE);
  compare_values (TYPE_EMPTY_P);
-  compare_values (TYPE_NO_NAMED_ARGS_STDARG_P);
+  if (FUNC_OR_METHOD_TYPE_P (t1))
+ compare_values (TYPE_NO_NAMED_ARGS_STDARG_P);
+  if (RECORD_OR_UNION_TYPE_P (t1))
+ compare_values (TYPE_INCLUDE_FLEXARRAY);
  compare_values (TYPE_PACKED);
  compare_values (TYPE_RESTRICT);
  compare_values (TYPE_USER_ALIGN);
diff --git a/gcc/print-tree.cc b/gcc/print-tree.cc
index 1f3afcbbc86..efacdb7686f 100644
--- a/gcc/print-tree.cc

Fwd: [V5][PATCH 2/2] Update documentation to clarify a GCC extension

2023-03-23 Thread Qing Zhao via Gcc-patches

Ping…

Please let me know if you have any further comments on the patch.

thanks.

Qing

Begin forwarded message:

From: Qing Zhao <qing.z...@oracle.com>
Subject: [V5][PATCH 2/2] Update documentation to clarify a GCC extension
Date: March 16, 2023 at 5:47:15 PM EDT
To: jos...@codesourcery.com, ja...@redhat.com, san...@codesourcery.com
Cc: rguent...@suse.de, siddh...@gotplt.org, keesc...@chromium.org, gcc-patches@gcc.gnu.org, Qing Zhao <qing.z...@oracle.com>

on a structure with a C99 flexible array member being nested in
another structure. (PR77650)

"GCC extension accepts a structure containing an ISO C99 "flexible array
member", or a union containing such a structure (possibly recursively)
to be a member of a structure.

There are two situations:

  * A structure or a union with a C99 flexible array member is the last
field of another structure, for example:

 struct flex  { int length; char data[]; };
 union union_flex { int others; struct flex f; };

 struct out_flex_struct { int m; struct flex flex_data; };
 struct out_flex_union { int n; union union_flex flex_data; };

In the above, both 'out_flex_struct.flex_data.data[]' and
'out_flex_union.flex_data.f.data[]' are considered as flexible
arrays too.

  * A structure or a union with a C99 flexible array member is the
middle field of another structure, for example:

 struct flex  { int length; char data[]; };

 struct mid_flex { int m; struct flex flex_data; int n; };

In the above, 'mid_flex.flex_data.data[]' has undefined behavior.
Compilers do not handle such a case consistently.  Any code relying on
such a case should be modified to ensure that flexible array members
only end up at the ends of structures.

Please use warning option '-Wgnu-variable-sized-type-not-at-end' to
identify all such cases in the source code and modify them.  This
extension will be deprecated from gcc in the next release.
"

gcc/c-family/ChangeLog:

* c.opt: New option -Wgnu-variable-sized-type-not-at-end.

gcc/c/ChangeLog:

* c-decl.cc (finish_struct): Issue warnings for new option.

gcc/ChangeLog:

* doc/extend.texi: Document GCC extension on a structure containing
a flexible array member to be a member of another structure.

gcc/testsuite/ChangeLog:

* gcc.dg/variable-sized-type-flex-array.c: New test.
---
gcc/c-family/c.opt                                 |  5 +++
gcc/c/c-decl.cc                                    |  8
gcc/doc/extend.texi                                | 45 ++-
.../gcc.dg/variable-sized-type-flex-array.c        | 31 +
4 files changed, 88 insertions(+), 1 deletion(-)
create mode 100644 gcc/testsuite/gcc.dg/variable-sized-type-flex-array.c

diff --git a/gcc/c-family/c.opt b/gcc/c-family/c.opt
index cddeece..660ac07f3d4 100644
--- a/gcc/c-family/c.opt
+++ b/gcc/c-family/c.opt
@@ -737,6 +737,11 @@ Wformat-truncation=
C ObjC C++ LTO ObjC++ Joined RejectNegative UInteger Var(warn_format_trunc) Warning LangEnabledBy(C ObjC C++ LTO ObjC++,Wformat=, warn_format >= 1, 0) IntegerRange(0, 2)
Warn about calls to snprintf and similar functions that truncate output.

+Wgnu-variable-sized-type-not-at-end
+C C++ Var(warn_variable_sized_type_not_at_end) Warning
+Warn when structures or unions with C99 flexible array members are not
+at the end of a structure.
+
Wif-not-aligned
C ObjC C++ ObjC++ Var(warn_if_not_aligned) Init(1) Warning
Warn when the field in a struct is not aligned.
diff --git a/gcc/c/c-decl.cc b/gcc/c/c-decl.cc
index 14c54809b9d..1632043a8ff 100644
--- a/gcc/c/c-decl.cc
+++ b/gcc/c/c-decl.cc
@@ -9269,6 +9269,14 @@ finish_struct (location_t loc, tree t, tree fieldlist, tree attributes,
TYPE_INCLUDE_FLEXARRAY (t)
 = is_last_field && TYPE_INCLUDE_FLEXARRAY (TREE_TYPE (x));

+  if (warn_variable_sized_type_not_at_end
+  && !is_last_field
+  && RECORD_OR_UNION_TYPE_P (TREE_TYPE (x))
+  && TYPE_INCLUDE_FLEXARRAY (TREE_TYPE (x)))
+ warning_at (DECL_SOURCE_LOCATION (x),
+OPT_Wgnu_variable_sized_type_not_at_end,
+"variable sized type not at the end of a struct");
+
  if (DECL_NAME (x)
 || RECORD_OR_UNION_TYPE_P (TREE_TYPE (x)))
saw_named_field = true;
diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
index fd3745c5608..0928b962a60 100644
--- a/gcc/doc/extend.texi
+++ b/gcc/doc/extend.texi
@@ -1748,7 +1748,50 @@ Flexible array members may only appear as the last member of a
A structure containing a flexible array member, or a union containing
such a structure (possibly recursively), may not be a member of a
structure or an element of an array.  (However, these uses are
-permitted by GCC as extensions.)
+pe

[PATCH] ranger: Ranger meets aspell

2023-03-23 Thread Jakub Jelinek via Gcc-patches

Hi!

I've noticed a comment typo in tree-vrp.cc and decided to quickly
skim aspell -c on the ranger sources (with quick I on everything that
looked ok or roughly ok).
But not being a native English speaker, I could get stuff wrong.
The most questionable seems 3 occurrences of involutary/involutory,
I've used involuntary, but am not sure if that's what you meant.

Could you please quickly skim it if I'm not making things worse?
Ok for trunk?

2023-03-23  Jakub Jelinek  

* value-range.cc (irange::irange_union, irange::intersect): Fix
comment spelling bugs.
* gimple-range-trace.cc (range_tracer::do_header): Likewise.
* gimple-range-trace.h: Likewise.
* gimple-range-edge.cc: Likewise.
(gimple_outgoing_range_stmt_p,
gimple_outgoing_range::switch_edge_range,
gimple_outgoing_range::edge_range_p): Likewise.
* gimple-range.cc (gimple_ranger::prefill_stmt_dependencies,
gimple_ranger::fold_stmt, gimple_ranger::register_transitive_infer,
assume_query::assume_query, assume_query::calculate_phi): Likewise.
* gimple-range-edge.h: Likewise.
* value-range.h (Value_Range::set, Value_Range::lower_bound,
Value_Range::upper_bound, frange::set_undefined): Likewise.
* gimple-range-gori.h (range_def_chain::depend, gori_map::m_outgoing,
gori_compute): Likewise.
* gimple-range-fold.h (fold_using_range): Likewise.
* gimple-range-path.cc (path_range_query::compute_ranges_in_phis):
Likewise.
* gimple-range-gori.cc (range_def_chain::in_chain_p,
range_def_chain::dump, gori_map::calculate_gori,
gori_compute::compute_operand_range_switch,
gori_compute::logical_combine, gori_compute::refine_using_relation,
gori_compute::compute_operand1_range, gori_compute::may_recompute_p):
Likewise.
* gimple-range.h: Likewise.
(enable_ranger): Likewise.
* range-op.h (empty_range_varying): Likewise.
* value-query.h (value_query): Likewise.
* gimple-range-cache.cc (block_range_cache::set_bb_range,
block_range_cache::dump, ssa_global_cache::clear_global_range,
temporal_cache::temporal_value, temporal_cache::current_p,
ranger_cache::range_of_def, ranger_cache::propagate_updated_value,
ranger_cache::range_from_dom, ranger_cache::register_inferred_value):
Likewise.
* gimple-range-fold.cc (fur_edge::get_phi_operand,
fur_stmt::get_operand, gimple_range_adjustment,
fold_using_range::range_of_phi,
fold_using_range::relation_fold_and_or): Likewise.
* value-range-storage.h (irange_storage_slot::MAX_INTS): Likewise.
* value-query.cc (range_query::value_of_expr,
range_query::value_on_edge, range_query::query_relation): Likewise.
* tree-vrp.cc (remove_unreachable::remove_and_update_globals,
intersect_range_with_nonzero_bits): Likewise.
* gimple-range-infer.cc (gimple_infer_range::check_assume_func,
exit_range): Likewise.
* value-relation.h: Likewise.
(equiv_oracle, relation_trio::relation_trio, value_relation,
value_relation::value_relation, pe_min): Likewise.
* range-op-float.cc (range_operator_float::rv_fold,
frange_arithmetic, foperator_unordered_equal::op1_range,
foperator_div::rv_fold): Likewise.
* gimple-range-op.cc (cfn_clz::fold_range): Likewise.
* value-relation.cc (equiv_oracle::query_relation,
equiv_oracle::register_equiv, equiv_oracle::add_equiv_to_block,
value_relation::apply_transitive, relation_chain_head::find_relation,
dom_oracle::query_relation, dom_oracle::find_relation_block,
dom_oracle::find_relation_dom, path_oracle::register_equiv): Likewise.
* range-op.cc (range_operator::wi_fold_in_parts_equiv,
create_possibly_reversed_range, adjust_op1_for_overflow,
operator_mult::wi_fold, operator_exact_divide::op1_range,
operator_cast::lhs_op1_relation, operator_cast::fold_pair,
operator_cast::fold_range, operator_logical_not::op1_range,
operator_bitwise_not::op1_range, operator_abs::wi_fold,
operator_negate::op1_range, range_op_cast_tests,
range_op_lshift_tests): Likewise.

--- gcc/value-range.cc.jj   2023-03-22 10:42:17.863403734 +0100
+++ gcc/value-range.cc  2023-03-23 13:39:17.242244618 +0100
@@ -2441,7 +2441,7 @@ irange::irange_union (const irange &r)
   // of each range is <= the beginning of the next range.  There may
   // be overlapping ranges at this point.  I.e. this would be valid
   // [-20, 10], [-10, 0], [0, 20], [40, 90] as it satisfies this
-  // contraint : -20 < -10 < 0 < 40.  When the range is rebuilt into r,
+  // constraint : -20 < -10 < 0 < 40.  When the range is rebuilt into r,
   // the merge is performed.
   //
   // [Xi,Yi]..[Xn,Yn]  U  [Xj,Yj]..[Xm,Ym]   -->  [Xk,Yk]..[Xp,Yp]
@@ -2710,7 +2710,7 @@ irange::in

[OG12][committed] Fortran: Add attr.class_ok check for generate_callback_wrapper

2023-03-23 Thread Tobias Burnus


On OG12, the OpenMP deep-mapping support added a callback procedure to the
vtable. That one did not handle error recovery well (ICE when a CLASS
component was not (class_)ok).

The attached patch has been committed as
https://gcc.gnu.org/g:9c18db65914a751e4a1d9330ccc1659fe5ef270d
and applies only to OG12 (= git branch devel/omp/gcc-12) as mainline does not
have this code (yet).

* * *

The plan is to upstream the deep-mapping support, i.e. mapping of allocatable
components. The current OG12 implementation handles mapping both the declared
type and the dynamic type; the latter requires the wrapper, generated by
generate_callback_wrapper.

I plan to upstream the static part first - and only then think about the
wrapper. I think the wrapper could be useful for coarrays as well - namely,
for the user-defined reduction, but I have not fully thought about it. It
would break the ABI as the vtable gets another entry before the type-bound
procedures, which is why I am a bit hesitant; if it gets merged, it would be
the opportunity to change some other things as well - like generating the
CLASS functions/vtable only when used (→ weak symbols to permit it in multiple
translation units; storing the fact that it has been generated in the module).
But that's off-topic.

Tobias
-
Siemens Electronic Design Automation GmbH; Address: Arnulfstraße 201, 80634 Munich; limited liability company (GmbH); Managing directors: Thomas Heurung, Frank Thürauf; Registered office: Munich; Registration court: Munich, HRB 106955

commit 9c18db65914a751e4a1d9330ccc1659fe5ef270d
Author: Tobias Burnus 
Date:   Thu Mar 23 14:29:00 2023 +0100

Fortran: Add attr.class_ok check for generate_callback_wrapper

Proper variables/components of type BT_CLASS have 'class_ok' set; check
for that to avoid an ICE on invalid code for gfortran.dg/pr108434.f90.

gcc/fortran/
* class.cc (generate_callback_wrapper): Add attr.class_ok check.
* resolve.cc (resolve_fl_derived): Likewise.

diff --git a/gcc/fortran/ChangeLog.omp b/gcc/fortran/ChangeLog.omp
index 663102d9329..f7d1f91f178 100644
--- a/gcc/fortran/ChangeLog.omp
+++ b/gcc/fortran/ChangeLog.omp
@@ -1,3 +1,8 @@
+2023-03-23  Tobias Burnus  
+
+	* class.cc (generate_callback_wrapper): Add attr.class_ok check.
+	* resolve.cc (resolve_fl_derived): Likewise.
+
 2023-03-23  Tobias Burnus  
 
 	* trans-openmp.cc (gfc_trans_omp_clauses): Fix unmapping of
diff --git a/gcc/fortran/class.cc b/gcc/fortran/class.cc
index 35dc35d2ee6..7ab6923523f 100644
--- a/gcc/fortran/class.cc
+++ b/gcc/fortran/class.cc
@@ -2550,6 +2550,9 @@ generate_callback_wrapper (gfc_symbol *vtab, gfc_symbol *derived,
 	 cb (token, comp->var(.data), size, 0, var's cb fn);  */
   for (gfc_component *comp = derived->components; comp; comp = comp->next)
 {
+  if (__builtin_expect (comp->ts.type == BT_CLASS
+			&& !comp->attr.class_ok, 0))
+	continue;
   bool pointer = (comp->ts.type == BT_CLASS
 		  ? CLASS_DATA (comp)->attr.pointer : comp->attr.pointer);
   bool proc_ptr = comp->attr.proc_pointer;
@@ -2590,7 +2593,7 @@ generate_callback_wrapper (gfc_symbol *vtab, gfc_symbol *derived,
 	  size->where = gfc_current_locus;
 	}
 
-  if (!proc_ptr && comp->ts.type == BT_CLASS)
+  if (!proc_ptr && comp->ts.type == BT_CLASS && comp->attr.class_ok)
 	{
 	  gfc_add_data_component (expr);
 	  if (comp->attr.dimension)
diff --git a/gcc/fortran/resolve.cc b/gcc/fortran/resolve.cc
index aaeaf396b91..15db1252366 100644
--- a/gcc/fortran/resolve.cc
+++ b/gcc/fortran/resolve.cc
@@ -15173,7 +15173,8 @@ resolve_fl_derived (gfc_symbol *sym)
   gfc_component *c = (sym->attr.is_class
 		  ? CLASS_DATA (sym->components) : sym->components);
   for ( ; c; c = c->next)
-if ((c->ts.type == BT_DERIVED || c->ts.type == BT_CLASS)
+if ((c->ts.type == BT_DERIVED
+	|| (c->ts.type == BT_CLASS && c->attr.class_ok))
 	&& !c->ts.u.derived->resolve_symbol_called)
   {
 	if (c->ts.u.derived->components == NULL

Re: [PATCH] rtl-optimization: ppc backend generates unnecessary signed extension.

2023-03-23 Thread Jeff Law via Gcc-patches

On 3/23/23 04:38, Ajit Agarwal wrote:
> Hello All:
>
> This patch removed unnecessary signed extension elimination in ree pass.
> Bootstrapped and regtested on powerpc64-linux-gnu.
>
> Thanks & Regards
> Ajit
>
>   rtl-optimization: ppc backend generates unnecessary signed extension.
>
>   Eliminate unnecessary redundant signed extension.
>
>   2023-03-23  Ajit Kumar Agarwal
>
> gcc/ChangeLog:
>
>   * ree.cc: Modification for  AND opcode support to eliminate
>   unnecessary signed extension.
>   * testsuite/g++.target/powerpc/sext-elim.C: New tests.

Just a note.  I'll look at this once the trunk is open for gcc-14
development.  It's really not appropriate for gcc-13.


jeff

Re: [PATCH] ranger: Ranger meets aspell

2023-03-23 Thread Aldy Hernandez via Gcc-patches

On 3/23/23 14:32, Jakub Jelinek wrote:
> Hi!
>
> I've noticed a comment typo in tree-vrp.cc and decided to quickly
> skim aspell -c on the ranger sources (with quick I on everything that
> looked ok or roughly ok).
> But not being a native English speaker, I could get stuff wrong.
> The most questionable seems 3 occurrences of involutary/involutory,
> I've used involuntary, but am not sure if that's what you meant.

An involutary function is a function who is its own inverse, so AFAIK 
"involutary" is correct.  That being said, it's not a common word, and I 
must admit I had to look it up when I wrote the comment...mostly to 
sound smart ;-), so feel free to use another word to avoid confusion.




> Could you please quickly skim it if I'm not making things worse?
> Ok for trunk?


Otherwise, looks good.  Thanks for doing this.

Aldy

[testsuite] Skip gnat.dg/div_zero.adb on Aarch64

2023-03-23 Thread Eric Botcazou via Gcc-patches

For the same reason as on PowerPC.

Tested on Aarch64/Linux, applied on the mainline and 12 branch.


2023-03-23  Eric Botcazou  

* gnat.dg/div_zero.adb: Skip for aarch64*-*-* targets.

-- 
Eric Botcazou

diff --git a/gcc/testsuite/gnat.dg/div_zero.adb b/gcc/testsuite/gnat.dg/div_zero.adb
index de88951389b..dedf3928db7 100644
--- a/gcc/testsuite/gnat.dg/div_zero.adb
+++ b/gcc/testsuite/gnat.dg/div_zero.adb
@@ -1,5 +1,5 @@
 -- { dg-do run }
--- { dg-skip-if "divide does not trap" { powerpc*-*-* } }
+-- { dg-skip-if "divide does not trap" { aarch64*-*-* powerpc*-*-* } }
 
 -- This test requires architecture- and OS-specific support code for unwinding
 -- through signal frames (typically located in *-unwind.h) to pass.  Feel free

Re: [PATCH] ranger: Ranger meets aspell

2023-03-23 Thread Jakub Jelinek via Gcc-patches

On Thu, Mar 23, 2023 at 03:05:45PM +0100, Aldy Hernandez wrote:
> 
> 
> On 3/23/23 14:32, Jakub Jelinek wrote:
> > Hi!
> > 
> > I've noticed a comment typo in tree-vrp.cc and decided to quickly
> > skim aspell -c on the ranger sources (with quick I on everything that
> > looked ok or roughly ok).
> > But not being a native English speaker, I could get stuff wrong.
> > The most questionable seems 3 occurrences of involutary/involutory,
> > I've used involuntary, but am not sure if that's what you meant.
> 
> An involutary function is a function who is its own inverse, so AFAIK
> "involutary" is correct.  That being said, it's not a common word, and I

Isn't that involutory?
https://en.wiktionary.org/wiki/involutory
Though, seems searching for it reveals both spellings and it isn't clear
what is the difference.  I'll keep them as is then.

Jakub

Re: [PATCH] ranger: Ranger meets aspell

2023-03-23 Thread Aldy Hernandez via Gcc-patches

On 3/23/23 15:20, Jakub Jelinek wrote:
> On Thu, Mar 23, 2023 at 03:05:45PM +0100, Aldy Hernandez wrote:
> > On 3/23/23 14:32, Jakub Jelinek wrote:
> > > Hi!
> > >
> > > I've noticed a comment typo in tree-vrp.cc and decided to quickly
> > > skim aspell -c on the ranger sources (with quick I on everything that
> > > looked ok or roughly ok).
> > > But not being a native English speaker, I could get stuff wrong.
> > > The most questionable seems 3 occurrences of involutary/involutory,
> > > I've used involuntary, but am not sure if that's what you meant.
> >
> > An involutary function is a function who is its own inverse, so AFAIK
> > "involutary" is correct.  That being said, it's not a common word, and I
>
> Isn't that involutory?
> https://en.wiktionary.org/wiki/involutory

Ooops, yeah.

> Though, seems searching for it reveals both spellings and it isn't clear
> what is the difference.  I'll keep them as is then.


Whatever works.  Obviously I don't know how to spell it either :).

Aldy

[PATCH] c++: outer 'this' leaking into local class [PR106969]

2023-03-23 Thread Patrick Palka via Gcc-patches

Here when resolving the implicit object for '&wrapped' within the
local class Foo, we expect to obtain a dummy object of type Foo& since
there's no 'this' available in this context.  And yet at this point
current_class_ref still corresponds to the outer class Context (and is
const), which confuses maybe_dummy_object into propagating the cv-quals
of current_class_ref and returning an object of type const Foo&.  Thus
decltype(&wrapped) wrongly yields const int* instead of int*.

The problem ultimately seems to be that the 'this' from the enclosing
class appears available for use when parsing the local class, but 'this'
shouldn't leak across classes like that.  This patch fixes this by
clearing current_class_ptr/ref when parsing a class definition.

After this change, for name-clash11.C in C++98 mode we would now
complain about an invalid use of 'this' for e.g.

  ASSERT (sizeof (this->A) == 16);

due to the way the ASSERT macro is defined using a local class.  This
patch redefines it using a local typedef instead.

Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK for
trunk/12?

PR c++/106969

gcc/cp/ChangeLog:

* parser.cc (cp_parser_class_specifier): Clear current_class_ptr
and current_class_ref when parsing a class definition.

gcc/testsuite/ChangeLog:

* g++.dg/lookup/name-clash11.C: New test.
* g++.dg/lookup/this2.C: New test.
---
 gcc/cp/parser.cc   | 13 +
 gcc/testsuite/g++.dg/lookup/name-clash11.C |  2 +-
 gcc/testsuite/g++.dg/lookup/this2.C| 22 ++
 3 files changed, 32 insertions(+), 5 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/lookup/this2.C

diff --git a/gcc/cp/parser.cc b/gcc/cp/parser.cc
index a277003ea58..be9c77b415e 100644
--- a/gcc/cp/parser.cc
+++ b/gcc/cp/parser.cc
@@ -26151,6 +26151,11 @@ cp_parser_class_specifier (cp_parser* parser)
   saved_in_unbraced_linkage_specification_p
 = parser->in_unbraced_linkage_specification_p;
   parser->in_unbraced_linkage_specification_p = false;
+  /* 'this' from an enclosing non-static member function is unavailable.  */
+  tree saved_ccp = current_class_ptr;
+  tree saved_ccr = current_class_ref;
+  current_class_ptr = NULL_TREE;
+  current_class_ref = NULL_TREE;
 
   /* Start the class.  */
   if (nested_name_specifier_p)
@@ -26369,8 +26374,6 @@ cp_parser_class_specifier (cp_parser* parser)
   /* If there are noexcept-specifiers that have not yet been processed,
 take care of them now.  Do this before processing NSDMIs as they
 may depend on noexcept-specifiers already having been processed.  */
-  tree save_ccp = current_class_ptr;
-  tree save_ccr = current_class_ref;
   FOR_EACH_VEC_SAFE_ELT (unparsed_noexcepts, ix, decl)
{
  tree ctx = DECL_CONTEXT (decl);
@@ -26496,8 +26499,8 @@ cp_parser_class_specifier (cp_parser* parser)
}
   vec_safe_truncate (unparsed_contracts, 0);
 
-  current_class_ptr = save_ccp;
-  current_class_ref = save_ccr;
+  current_class_ptr = NULL_TREE;
+  current_class_ref = NULL_TREE;
   if (pushed_scope)
pop_scope (pushed_scope);
 
@@ -26529,6 +26532,8 @@ cp_parser_class_specifier (cp_parser* parser)
 = saved_num_template_parameter_lists;
   parser->in_unbraced_linkage_specification_p
 = saved_in_unbraced_linkage_specification_p;
+  current_class_ptr = saved_ccp;
+  current_class_ref = saved_ccr;
 
   return type;
 }
diff --git a/gcc/testsuite/g++.dg/lookup/name-clash11.C b/gcc/testsuite/g++.dg/lookup/name-clash11.C
index bc63645e8d3..2ae9a65264d 100644
--- a/gcc/testsuite/g++.dg/lookup/name-clash11.C
+++ b/gcc/testsuite/g++.dg/lookup/name-clash11.C
@@ -7,7 +7,7 @@
 #  define ASSERT(e) static_assert (e, #e)
 #else
 #  define ASSERT(e) \
-  do { struct S { bool: !!(e); } asrt; (void)&asrt; } while (0)
+  do { typedef int asrt[bool(e) ? 1 : -1]; } while (0)
 #endif
 
 
diff --git a/gcc/testsuite/g++.dg/lookup/this2.C b/gcc/testsuite/g++.dg/lookup/this2.C
new file mode 100644
index 000..1450c563d92
--- /dev/null
+++ b/gcc/testsuite/g++.dg/lookup/this2.C
@@ -0,0 +1,22 @@
+// PR c++/106969
+// { dg-do compile { target c++11 } }
+
+struct Context
+{
+void
+action() const
+{
+struct Foo
+{
+int wrapped;
+decltype( &wrapped ) get() { return &wrapped; }
+} t;
+
+*t.get()= 42; // OK, get() returns int* not const int*
+
+struct Bar
+{
+using type = decltype(this); // { dg-error "invalid use of 'this'" }
+};
+}
+};
-- 
2.40.0.130.g27d43aaaf5

Re: [PATCH] rtl-optimization: ppc backend generates unnecessary signed extension.

2023-03-23 Thread Ajit Agarwal via Gcc-patches

Hello Peter:

On 23/03/23 6:08 pm, Peter Bergner wrote:
> On 3/23/23 5:38 AM, Ajit Agarwal wrote:
>> This patch removed unnecessary signed extension elimination in ree pass.
>> Bootstrapped and regtested on powerpc64-linux-gnu.
>>
>>
>> Thanks & Regards
>> Ajit
>>
>>  rtl-optimization: ppc backend generates unnecessary signed extension.
>>
>>  Eliminate unnecessary redundant signed extension.
>>
>>  2023-03-23  Ajit Kumar Agarwal  
>>
>> gcc/ChangeLog:
>>
>>  * ree.cc: Modification for  AND opcode support to eliminate
>>  unnecessary signed extension.
>>  * testsuite/g++.target/powerpc/sext-elim.C: New tests.
> 
> Not a review of the patch, but we talked offline about other bugzillas
> regarding unnecessary sign and zero extensions.  Doing a quick scan, I
> see the following bugs.  Please have a look at 1) whether these are
> still a problem with unpatched trunk, and if they are, 2) whether your
> patch fixes them or could fix them.  Thanks.
> 
> https://gcc.gnu.org/PR41742

These are not addressed in the trunk patch, because int c is not initialized 
with registers and for this reason we cannot eliminate them. If we initialize 
int c then zero extension goes away.

> https://gcc.gnu.org/PR65010
> https://gcc.gnu.org/PR82940
> https://gcc.gnu.org/PR107949
>

My patch fixes these PR's which were not fixed in trunk patch.

Thanks & Regards
Ajit
 
> Peter
>

[PATCH] tree-optimization/109262 - ICE with non-call EH and forwprop

2023-03-23 Thread Richard Biener via Gcc-patches

The recent combining of complex part loads to a complex load failed
to account for non-call EH.

Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed.

PR tree-optimization/109262
* tree-ssa-forwprop.cc (pass_forwprop::execute): When
combining a piecewise complex load avoid touching loads
that throw internally.  Use fun, not cfun throughout.

* g++.dg/torture/pr109262.C: New testcase.
---
 gcc/testsuite/g++.dg/torture/pr109262.C | 28 +
 gcc/tree-ssa-forwprop.cc| 18 +---
 2 files changed, 38 insertions(+), 8 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/torture/pr109262.C

diff --git a/gcc/testsuite/g++.dg/torture/pr109262.C b/gcc/testsuite/g++.dg/torture/pr109262.C
new file mode 100644
index 000..54323b91bf7
--- /dev/null
+++ b/gcc/testsuite/g++.dg/torture/pr109262.C
@@ -0,0 +1,28 @@
+// { dg-do compile }
+// { dg-additional-options "-fnon-call-exceptions" }
+
+template < typename > struct au;
+template < typename b > au< b > operator*(au< b >, au< b > &p2) {
+  au< b > ax;
+  ax *= p2;
+  return p2;
+}
+template <> struct au< double > {
+  double p() { return __real__ az; }
+  double q() { return __imag__ az; }
+  void operator*=(au &o) {
+_Complex bd = o.p();
+__imag__ bd = o.q();
+az *= bd;
+  }
+  _Complex az;
+};
+long bm, m;
+au< double > h;
+void bn() {
+  for (long k; ;) {
+au< double > br;
+for (long j = 0; 0 < bm; ++j)
+  au n = br * h;
+  }
+}
diff --git a/gcc/tree-ssa-forwprop.cc b/gcc/tree-ssa-forwprop.cc
index e34f0888954..6df0b8f2215 100644
--- a/gcc/tree-ssa-forwprop.cc
+++ b/gcc/tree-ssa-forwprop.cc
@@ -3460,7 +3460,7 @@ pass_forwprop::execute (function *fun)
   lattice.create (num_ssa_names);
   lattice.quick_grow_cleared (num_ssa_names);
   int *postorder = XNEWVEC (int, n_basic_blocks_for_fn (fun));
-  int postorder_num = pre_and_rev_post_order_compute_fn (cfun, NULL,
+  int postorder_num = pre_and_rev_post_order_compute_fn (fun, NULL,
 postorder, false);
   auto_vec to_fixup;
   auto_vec to_remove;
@@ -3594,7 +3594,7 @@ pass_forwprop::execute (function *fun)
   && !gimple_has_volatile_ops (stmt)
   && (TREE_CODE (gimple_assign_rhs1 (stmt))
   != TARGET_MEM_REF)
-  && !stmt_can_throw_internal (cfun, stmt))
+  && !stmt_can_throw_internal (fun, stmt))
{
  /* Rewrite loads used only in real/imagpart extractions to
 component-wise loads.  */
@@ -3660,7 +3660,7 @@ pass_forwprop::execute (function *fun)
   || (fun->curr_properties & PROP_gimple_lvec))
   && gimple_assign_load_p (stmt)
   && !gimple_has_volatile_ops (stmt)
-  && !stmt_can_throw_internal (cfun, stmt)
+  && !stmt_can_throw_internal (fun, stmt)
   && (!VAR_P (rhs) || !DECL_HARD_REGISTER (rhs)))
optimize_vector_load (&gsi);
 
@@ -3688,7 +3688,7 @@ pass_forwprop::execute (function *fun)
  location_t loc = gimple_location (use_stmt);
  gimple_set_location (new_stmt, loc);
  gimple_set_vuse (new_stmt, gimple_vuse (use_stmt));
- gimple_set_vdef (new_stmt, make_ssa_name (gimple_vop (cfun)));
+ gimple_set_vdef (new_stmt, make_ssa_name (gimple_vop (fun)));
  SSA_NAME_DEF_STMT (gimple_vdef (new_stmt)) = new_stmt;
  gimple_set_vuse (use_stmt, gimple_vdef (new_stmt));
  gimple_stmt_iterator gsi2 = gsi_for_stmt (use_stmt);
@@ -3718,6 +3718,8 @@ pass_forwprop::execute (function *fun)
   && (gimple_vuse (def1) == gimple_vuse (def2))
   && !gimple_has_volatile_ops (def1)
   && !gimple_has_volatile_ops (def2)
+  && !stmt_can_throw_internal (fun, def1)
+  && !stmt_can_throw_internal (fun, def2)
   && gimple_assign_rhs_code (def1) == REALPART_EXPR
   && gimple_assign_rhs_code (def2) == IMAGPART_EXPR
   && operand_equal_p (TREE_OPERAND (gimple_assign_rhs1
@@ -3752,7 +3754,7 @@ pass_forwprop::execute (function *fun)
  if (single_imm_use (lhs, &use_p, &use_stmt)
  && gimple_store_p (use_stmt)
  && !gimple_has_volatile_ops (use_stmt)
- && !stmt_can_throw_internal (cfun, use_stmt)
+ && !stmt_can_throw_internal (fun, use_stmt)
  && is_gimple_assign (use_stmt)
  && (TREE_CODE (gimple_assign_lhs (use_stmt))
  != TARGET_MEM_REF))
@@ -3783,7 +3785,7 @@ pass_forwprop::execute (function *fun)
  gimple_set_location (new_stmt, loc);
  gimple_set_vuse (new_stmt, gimple_vuse (use_stmt));

Re: [PATCH] rtl-optimization: ppc backend generates unnecessary signed extension.

2023-03-23 Thread Ajit Agarwal via Gcc-patches




On 23/03/23 7:17 pm, Jeff Law wrote:
> 
> 
> On 3/23/23 04:38, Ajit Agarwal wrote:
>>
>> Hello All:
>>
>> This patch removed unnecessary signed extension elimination in ree pass.
>> Bootstrapped and regtested on powerpc64-linux-gnu.
>>
>>
>> Thanks & Regards
>> Ajit
>>
>> rtl-optimization: ppc backend generates unnecessary signed extension.
>>
>> Eliminate unnecessary redundant signed extension.
>>
>> 2023-03-23  Ajit Kumar Agarwal  
>>
>> gcc/ChangeLog:
>>
>> * ree.cc: Modification for  AND opcode support to eliminate
>> unnecessary signed extension.
>> * testsuite/g++.target/powerpc/sext-elim.C: New tests.
> Just a note.  I'll look at this once the trunk is open for gcc-14 
> development.  It's really not appropriate for gcc-13.

Thanks Jeff.
> 
> jeff

[PATCH] lto/109263 - lto-wrapper and -g0 -ggdb

2023-03-23 Thread Richard Biener via Gcc-patches

The following makes lto-wrapper deal with non-combined debug
disabling / enabling option combinations properly.  Interestingly
-gno-dwarf also enables debug.

Bootstrap / regtest running on x86_64-unknown-linux-gnu.

OK?  Or do we want to try harder to zap earlier -g0 when later
-g* appear?

PR lto/109263
* lto-wrapper.cc (run_gcc): Parse alternate debug options
as well, they always enable debug.
---
 gcc/lto-wrapper.cc | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/gcc/lto-wrapper.cc b/gcc/lto-wrapper.cc
index fe8c5f6e80d..5186d040ce0 100644
--- a/gcc/lto-wrapper.cc
+++ b/gcc/lto-wrapper.cc
@@ -1564,6 +1564,16 @@ run_gcc (unsigned argc, char *argv[])
  skip_debug = option->arg && !strcmp (option->arg, "0");
  break;
 
+   case OPT_gbtf:
+   case OPT_gctf:
+   case OPT_gdwarf:
+   case OPT_gdwarf_:
+   case OPT_ggdb:
+   case OPT_gvms:
+ /* Negative forms, if allowed, enable debug info as well.  */
+ skip_debug = false;
+ break;
+
case OPT_dumpdir:
  incoming_dumppfx = dumppfx = option->arg;
  break;
-- 
2.35.3
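
The option handling described in the lto-wrapper patch above can be sketched in isolation. This is only a rough model, not the lto-wrapper code: skip_debug_p, starts_with and the string-based option list are hypothetical names made up for illustration, and real option parsing works on decoded options rather than raw strings. The assumption it illustrates is that options are scanned left to right, -g0 disables debug, and any of the alternate debug options (including a negative form such as -gno-dwarf) re-enables it.

```c
#include <ctype.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper: true if S begins with PREFIX.  */
static bool
starts_with (const char *s, const char *prefix)
{
  return strncmp (s, prefix, strlen (prefix)) == 0;
}

/* Hypothetical model of the scan: returns true when debug info should be
   skipped after looking at all N options in OPTS, left to right.  */
bool
skip_debug_p (const char *const *opts, int n)
{
  /* Simplified stand-ins for the alternate debug options in the patch.  */
  static const char *const enabling[]
    = { "-ggdb", "-gdwarf", "-gno-dwarf", "-gbtf", "-gctf", "-gvms" };
  bool skip_debug = false;

  for (int i = 0; i < n; i++)
    {
      const char *o = opts[i];

      /* Plain -g or -g<level>: only level 0 disables debug.  */
      if (strcmp (o, "-g") == 0
          || (starts_with (o, "-g") && isdigit ((unsigned char) o[2])))
        {
          skip_debug = strcmp (o, "-g0") == 0;
          continue;
        }

      /* Alternate debug options, negative forms included, enable debug.  */
      for (size_t j = 0; j < sizeof enabling / sizeof enabling[0]; j++)
        if (starts_with (o, enabling[j]))
          skip_debug = false;
    }
  return skip_debug;
}
```

With this model, "-g0 -ggdb" leaves debug enabled while "-ggdb -g0" disables it, which is the later-option-wins behaviour the patch is after.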

[PATCH] tree-optimization/107569 - avoid wrecking earlier folding in FRE/PRE

2023-03-23 Thread Richard Biener via Gcc-patches

The following avoids picking up dead code left over from folding
during FRE/PRE, effectively undoing propagations.

Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed.

PR tree-optimization/107569
* tree-ssa-sccvn.cc (eliminate_dom_walker::eliminate_stmt):
Do not push SSA names with zero uses as available leader.
(process_bb): Likewise.

* g++.dg/opt/pr107569.C: New testcase.
---
 gcc/testsuite/g++.dg/opt/pr107569.C | 29 +
 gcc/tree-ssa-sccvn.cc   | 17 +++--
 2 files changed, 40 insertions(+), 6 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/opt/pr107569.C

diff --git a/gcc/testsuite/g++.dg/opt/pr107569.C b/gcc/testsuite/g++.dg/opt/pr107569.C
new file mode 100644
index 000..e03941c7862
--- /dev/null
+++ b/gcc/testsuite/g++.dg/opt/pr107569.C
@@ -0,0 +1,29 @@
+// { dg-do compile }
+// { dg-require-effective-target c++20 }
+// { dg-options "-O2 -fdump-tree-evrp -fdump-tree-vrp1" }
+
+namespace std
+{
+  constexpr bool isfinite (float x) { return __builtin_isfinite (x); }
+  constexpr bool isfinite (double x) { return __builtin_isfinite (x); }
+  constexpr bool isfinite (long double x) { return __builtin_isfinite (x); }
+}
+
+bool
+foo (double x)
+{
+  if (!std::isfinite (x))
+__builtin_unreachable ();
+
+  return std::isfinite (x);
+}
+
+bool
+bar (double x)
+{
+  [[assume (std::isfinite (x))]];
+  return std::isfinite (x);
+}
+
+/* { dg-final { scan-tree-dump "return 1;" "evrp" } } */
+/* { dg-final { scan-tree-dump-times "return 1;" 2 "vrp1" } } */
diff --git a/gcc/tree-ssa-sccvn.cc b/gcc/tree-ssa-sccvn.cc
index d5b081a309f..6b8d38b270c 100644
--- a/gcc/tree-ssa-sccvn.cc
+++ b/gcc/tree-ssa-sccvn.cc
@@ -7197,10 +7197,14 @@ eliminate_dom_walker::eliminate_stmt (basic_block b, gimple_stmt_iterator *gsi)
 }
 
   /* Make new values available - for fully redundant LHS we
- continue with the next stmt above and skip this.  */
-  def_operand_p defp;
-  FOR_EACH_SSA_DEF_OPERAND (defp, stmt, iter, SSA_OP_DEF)
-eliminate_push_avail (b, DEF_FROM_PTR (defp));
+ continue with the next stmt above and skip this.
+ But avoid picking up dead defs.  */
+  tree def;
+  FOR_EACH_SSA_TREE_OPERAND (def, stmt, iter, SSA_OP_DEF)
+if (! has_zero_uses (def)
+   || (inserted_exprs
+   && bitmap_bit_p (inserted_exprs, SSA_NAME_VERSION (def))))
+  eliminate_push_avail (b, def);
 }
 
 /* Perform elimination for the basic-block B during the domwalk.  */
@@ -8046,9 +8050,10 @@ process_bb (rpo_elim &avail, basic_block bb,
avail.eliminate_stmt (bb, &gsi);
   else
/* If not eliminating, make all not already available defs
-  available.  */
+  available.  But avoid picking up dead defs.  */
FOR_EACH_SSA_TREE_OPERAND (op, gsi_stmt (gsi), i, SSA_OP_DEF)
- if (! avail.eliminate_avail (bb, op))
+ if (! has_zero_uses (op)
+ && ! avail.eliminate_avail (bb, op))
avail.eliminate_push_avail (bb, op);
 }
 
-- 
2.35.3

Re: [PATCH] rtl-optimization: ppc backend generates unnecessary signed extension.

2023-03-23 Thread Peter Bergner via Gcc-patches

On 3/23/23 8:47 AM, Jeff Law wrote:
> On 3/23/23 04:38, Ajit Agarwal wrote:
>> * ree.cc: Modification for  AND opcode support to eliminate
>> unnecessary signed extension.
>> * testsuite/g++.target/powerpc/sext-elim.C: New tests.
> Just a note.  I'll look at this once the trunk is open for gcc-14 development.
> It's really not appropriate for gcc-13.

Hi Jeff, yes, we agree 100% that this is stage1 material!  I'm sorry if
this wasn't clear. 



>> https://gcc.gnu.org/PR41742
> 
> These are not addressed in the trunk patch, because int c is not initialized
> with registers and for this reason we cannot eliminate them. If we initialize
> int c then zero extension goes away.

I'm sorry that I don't know how REE works.  Why can't it optimize this?
I see in the REE dump:

(insn 20 18 22 3 (set (reg:DI 4 4)
  (zero_extend:DI (reg:QI 4 4 [orig:120 cD.3556+3 ] 
[120]))) "pr41742.c":6:41 8 {zero_extendqidi2} (nil))
(call_insn 22 20 41 3 (parallel [
(set (reg:DI 3 3)
 (call (mem:SI (symbol_ref:DI ("memset") [flags 0x41]  
) [0 memsetD.1196 S4 A8])
(const_int 64 [0x40])))
(use (const_int 0 [0]))
(clobber (reg:DI 96 lr)) ...

Is there a reason why REE cannot see that our (reg:QI 4) is a param register
and thus due to our ABI, already correctly sign/zero extended?



>> https://gcc.gnu.org/PR65010
>> https://gcc.gnu.org/PR82940
>> https://gcc.gnu.org/PR107949
>>
> 
> My patch fixes these PR's which were not fixed in trunk patch.

Great!  Once this goes in, please include these PR #s in your commit log
and mark the PRs as RESOLVED/FIXED.

That said, I see we don't enable -free at -O2 and above like other
architectures do, so we'll get no benefit without explicitly adding -free:

./gcc/common/config/riscv/riscv-common.cc:{ OPT_LEVELS_2_PLUS, OPT_free, NULL, 1 },
./gcc/common/config/aarch64/aarch64-common.cc:{ OPT_LEVELS_2_PLUS, OPT_free, NULL, 1 },
./gcc/common/config/h8300/h8300-common.cc:{ OPT_LEVELS_2_PLUS, OPT_free, NULL, 1 },
./gcc/common/config/i386/i386-common.cc:{ OPT_LEVELS_2_PLUS, OPT_free, NULL, 1 },
./gcc/common/config/sparc/sparc-common.cc:{ OPT_LEVELS_2_PLUS, OPT_free, NULL, 1 },
./gcc/common/config/alpha/alpha-common.cc:{ OPT_LEVELS_2_PLUS, OPT_free, NULL, 1 },

...maybe we should enable it too (in a separate patch) once yours goes in now
that it will actually do something for us?  Thoughts?

I'll note the docs/man page only mention x86, AArch64 and Alpha enabling REE at
-O2 and above, but clearly others have been added since, so if we enable REE at
-O2 and above, we should fix that too.

Peter

Re: [PATCH] rtl-optimization: ppc backend generates unnecessary signed extension.

2023-03-23 Thread Jeff Law via Gcc-patches





On 3/23/23 10:29, Peter Bergner wrote:


>>> https://gcc.gnu.org/PR41742
>>
>> These are not addressed in the trunk patch, because int c is not initialized
>> with registers and for this reason we cannot eliminate them. If we initialize
>> int c then zero extension goes away.
>
> I'm sorry that I don't know how REE works.  Why can't it optimize this?
> I see in the REE dump:
>
> (insn 20 18 22 3 (set (reg:DI 4 4)
>    (zero_extend:DI (reg:QI 4 4 [orig:120 cD.3556+3 ] [120]))) 
> "pr41742.c":6:41 8 {zero_extendqidi2} (nil))
> (call_insn 22 20 41 3 (parallel [
>  (set (reg:DI 3 3)
>   (call (mem:SI (symbol_ref:DI ("memset") [flags 0x41]  
> ) [0 memsetD.1196 S4 A8])
>  (const_int 64 [0x40])))
>  (use (const_int 0 [0]))
>  (clobber (reg:DI 96 lr)) ...
>
> Is there a reason why REE cannot see that our (reg:QI 4) is a param register
> and thus due to our ABI, already correctly sign/zero extended?

I don't think REE has ever considered exploiting ABI constraints. 
Handling that might be a notable improvement on various targets.  It'd 
be a great place to do some experimentation.


jeff

Re: [PATCH] rtl-optimization: ppc backend generates unnecessary signed extension.

2023-03-23 Thread Peter Bergner via Gcc-patches

On 3/23/23 11:32 AM, Jeff Law via Gcc-patches wrote:
> On 3/23/23 10:29, Peter Bergner wrote:
>> I'm sorry that I don't know how REE works.  Why can't it optimize this?
>> I see in the REE dump:
>>
>> (insn 20 18 22 3 (set (reg:DI 4 4)
>>(zero_extend:DI (reg:QI 4 4 [orig:120 cD.3556+3 ] 
>> [120]))) "pr41742.c":6:41 8 {zero_extendqidi2} (nil))
>> (call_insn 22 20 41 3 (parallel [
>>  (set (reg:DI 3 3)
>>   (call (mem:SI (symbol_ref:DI ("memset") [flags 0x41]  
>> ) [0 memsetD.1196 S4 A8])
>>  (const_int 64 [0x40])))
>>  (use (const_int 0 [0]))
>>  (clobber (reg:DI 96 lr)) ...
>>
>> Is there a reason why REE cannot see that our (reg:QI 4) is a param register
>> and thus due to our ABI, already correctly sign/zero extended?
>
> I don't think REE has ever considered exploiting ABI constraints. Handling
> that might be a notable improvement on various targets.  It'd be a great
> place to do some experimentation.

Ok, so sounds like a good follow-on project after this patch is reviewed
and committed (stage1).  Thanks for your input!

Peter
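
The idea discussed in this subthread, that a zero extension of an incoming parameter register does nothing if the register is already extended, can be illustrated with plain integer arithmetic. This is only a sketch: zero_extend_qi_to_di and extension_is_redundant are hypothetical names, and whether the ABI really guarantees an extended register in the PR41742 case is the assumption raised in the thread, not something this code establishes.

```c
#include <stdint.h>

/* Models (zero_extend:DI (reg:QI r)) on a 64-bit register value:
   keep the low 8 bits and clear the upper 56.  */
uint64_t
zero_extend_qi_to_di (uint64_t reg)
{
  return reg & 0xff;
}

/* Nonzero when the extension would leave REG unchanged, i.e. the upper
   bits are already zero and the extension could be dropped.  */
int
extension_is_redundant (uint64_t reg)
{
  return zero_extend_qi_to_di (reg) == reg;
}
```

If the calling convention ensures that every incoming QImode argument satisfies this predicate, a pass that knew about the convention could delete the extend; a pass that does not know has to keep it.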

[PATCH] c: [PR84900] cast of compound literal does not cause the code to become a non-lvalue

2023-03-23 Thread Andrew Pinski via Gcc-patches

The problem here is that after r0-92187-g2ec5deb5c3146c, maybe_lvalue_p
returns false for compound literals, which causes non_lvalue_loc not
to wrap the expression with a NON_LVALUE_EXPR.  Before that revision it
returned true, as it does for all language-specific tree codes.

This fixes that oversight and fixes the testcase to have the cast as
a non-lvalue.

Committed to the trunk as obvious after a bootstrap/test on x86_64-linux-gnu.

PR c/84900

gcc/ChangeLog:

* fold-const.cc (maybe_lvalue_p): Treat COMPOUND_LITERAL_EXPR
as a lvalue.

gcc/testsuite/ChangeLog:

* gcc.dg/compound-literal-cast-lvalue-1.c: New test.
---
 gcc/fold-const.cc | 1 +
 gcc/testsuite/gcc.dg/compound-literal-cast-lvalue-1.c | 9 +
 2 files changed, 10 insertions(+)
 create mode 100644 gcc/testsuite/gcc.dg/compound-literal-cast-lvalue-1.c

diff --git a/gcc/fold-const.cc b/gcc/fold-const.cc
index 02a24c5fe65..5b9982e3651 100644
--- a/gcc/fold-const.cc
+++ b/gcc/fold-const.cc
@@ -2646,6 +2646,7 @@ maybe_lvalue_p (const_tree x)
   case LABEL_DECL:
   case FUNCTION_DECL:
   case SSA_NAME:
+  case COMPOUND_LITERAL_EXPR:
 
   case COMPONENT_REF:
   case MEM_REF:
diff --git a/gcc/testsuite/gcc.dg/compound-literal-cast-lvalue-1.c b/gcc/testsuite/gcc.dg/compound-literal-cast-lvalue-1.c
new file mode 100644
index 000..729bae24316
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/compound-literal-cast-lvalue-1.c
@@ -0,0 +1,9 @@
+/* { dg-do compile } */
+/* { dg-options "-std=c99" } */
+/* PR c/84900; casts from compound literals
+   were not considered a non-lvalue. */
+
+int main() {
+int *p = &(int) (int) {0}; /* { dg-error "lvalue" } */
+return 0;
+}
-- 
2.31.1
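
For contrast with the rejected form in the new testcase, here is a small C99 sketch of the valid case. The function name read_through_literal is made up for illustration. A compound literal is an lvalue, so taking its address is fine; the result of a cast is not an lvalue, which is why the testcase expects an error for the form with the extra cast.

```c
/* A compound literal is an lvalue with automatic storage in the
   enclosing block, so its address may be taken and written through.  */
int
read_through_literal (void)
{
  int *p = &(int) {0};
  *p = 42;
  return *p;
}

/* By contrast, the testcase's form
     int *q = &(int) (int) {0};
   applies a cast to the compound literal; a cast result is not an
   lvalue, so taking its address is expected to be diagnosed.  */
```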

Re: [wwwdocs] Document support for znver4 in gcc-13/changes.html

2023-03-23 Thread Martin Jambor

Hello,

On Wed, Mar 22 2023, Richard Biener wrote:
> On Tue, Mar 21, 2023 at 8:25 PM Martin Jambor  wrote:
>>
>> Hello,
>>
>> is the following item documenting that gcc13 can generate code for Zen 4
>> OK for the changes.html file on the web?
>
> OK.

thanks.

> Note the gcc-12 changes for the upcoming 12.3 need something similar
> as most of the changes were backported.

Like this?

Thanks,

Martin



diff --git a/htdocs/gcc-12/changes.html b/htdocs/gcc-12/changes.html
index d565c217..0854b83b 100644
--- a/htdocs/gcc-12/changes.html
+++ b/htdocs/gcc-12/changes.html
@@ -1144,6 +1144,17 @@ are not listed here).
 Note: GCC 12.3 has not been released yet, so this section is a
 work-in-progress.
 
+Target Specific Changes
+
+x86-64
+
+  GCC now supports AMD CPUs based on the znver4 core
+via -march=znver4.  The switch makes GCC consider
+using 512 bit vectors when auto-vectorizing.
+  
+
+
+
 This is the <a href="https://gcc.gnu.org/bugzilla/buglist.cgi?bug_status=RESOLVED&resolution=FIXED&target_milestone=12.3">list</a>
 of problem reports (PRs) from GCC's bug tracking system that are
 known to be fixed in the 12.3 release. This list might not be

[PATCH] PR tree-optimization/109238 - Ranger cache dominator queries should ignore backedges.

2023-03-23 Thread Andrew MacLeod via Gcc-patches


Detailed info in the PR.

As we walk the DOM tree to calculate ranges, any block with multiple 
predecessors is processed by evaluating and unioning incoming values. 
This catches more complex cases where the dominator node itself may not 
carry range adjustments that we care about.


What was missing is that the "quick check" doesn't propagate any info. If 
the edge we check is dominated by this block (i.e., it's a back edge), then 
no additional useful information can be provided as it just leads back 
to where we currently are.   Only edges which are not dominated by the 
current block need be checked.


The issue arose because this "quick check" mechanism gives up on complex 
cases and returns VARYING... so any backedge would union the real value 
from the dominators with the failed result "VARYING" from that edge, and 
we get VARYING instead of the correct result.


The patch simply checks if the current block dominates the predecessor 
of an edge before launching the query.


Performance impact is negligible: a slight slowdown for the check, a slight 
speedup from doing less work. It's a wash.


Bootstraps on x86_64-pc-linux-gnu with no regressions.

Ok for trunk?

Andrew

PS I have not managed to produce a reduced testcase yet.. If I do I will 
supply it.




commit d54ae54c13276adbfc5b27227a3630ad40002705
Author: Andrew MacLeod 
Date:   Thu Mar 23 10:28:34 2023 -0400

Ranger cache dominator queries should ignore backedges.

When querying dominators for cache values, ignore back edges in
read-only mode.

PR tree-optimization/109238
* gimple-range-cache.cc (ranger_cache::resolve_dom): Ignore
predecessors which this block dominates.

diff --git a/gcc/gimple-range-cache.cc b/gcc/gimple-range-cache.cc
index 7aa6a3698cd..96460ece8f4 100644
--- a/gcc/gimple-range-cache.cc
+++ b/gcc/gimple-range-cache.cc
@@ -1510,6 +1510,11 @@ ranger_cache::resolve_dom (vrange &r, tree name, basic_block bb)
   Value_Range er (TREE_TYPE (name));
   FOR_EACH_EDGE (e, ei, bb->preds)
 {
+  // If the predecessor is dominated by this block, then there is a back
+  // edge, and won't provide anything useful.  We'll actually end up with
+  // VARYING as we will not resolve this node.
+  if (dominated_by_p (CDI_DOMINATORS, e->src, bb))
+	continue;
   edge_range (er, e, name, RFD_READ_ONLY);
   r.union_ (er);
 }
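
The effect of skipping back edges when unioning predecessor ranges can be shown with a toy model. This is a sketch under stated assumptions, not Ranger code: struct range, struct pred_edge and resolve_preds are hypothetical, a range is just an integer interval plus a VARYING flag, and src_dominated_by_bb stands in for the dominated_by_p test in the patch.

```c
#include <stdbool.h>

/* Toy range: either VARYING (nothing known) or the interval [lo, hi].  */
struct range
{
  bool varying;
  int lo, hi;
};

/* Toy incoming edge: the range known on it, and whether its source block
   is dominated by the block being resolved (i.e. it is a back edge).  */
struct pred_edge
{
  struct range r;
  bool src_dominated_by_bb;
};

/* Union of two ranges; VARYING absorbs everything.  */
static struct range
range_union (struct range a, struct range b)
{
  struct range r;
  r.varying = a.varying || b.varying;
  r.lo = a.lo < b.lo ? a.lo : b.lo;
  r.hi = a.hi > b.hi ? a.hi : b.hi;
  return r;
}

/* Union the ranges on all N incoming edges, optionally ignoring back
   edges.  Returns VARYING if no edge contributed.  */
struct range
resolve_preds (const struct pred_edge *preds, int n, bool skip_backedges)
{
  struct range result = { true, 0, 0 };
  bool have = false;

  for (int i = 0; i < n; i++)
    {
      if (skip_backedges && preds[i].src_dominated_by_bb)
        continue;
      result = have ? range_union (result, preds[i].r) : preds[i].r;
      have = true;
    }
  return result;
}
```

With one forward edge carrying [1, 5] and one back edge on which the quick check gave up (VARYING), the union without the skip is VARYING, while with the skip it stays [1, 5].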

Re: Fwd: [V5][PATCH 1/2] Handle component_ref to a structre/union field including flexible array member [PR101832]

2023-03-23 Thread Joseph Myers

On Thu, 23 Mar 2023, Qing Zhao via Gcc-patches wrote:

> gcc/c/ChangeLog:
> 
> PR tree-optimization/101832
> * c-decl.cc (finish_struct): Set TYPE_INCLUDE_FLEXARRAY for
> struct/union type.

The C front-end changes are OK (supposing the original patch has correct 
whitespace, since it seems to be messed up here).

-- 
Joseph S. Myers
jos...@codesourcery.com

Re: [PATCH] PR tree-optimization/109238 - Ranger cache dominator queries should ignore backedges.

2023-03-23 Thread Jakub Jelinek via Gcc-patches

On Thu, Mar 23, 2023 at 02:37:01PM -0400, Andrew MacLeod via Gcc-patches wrote:
> PS I have not managed to produce a reduced testcase yet.. If I do I will
> supply it.

Here is one:
/* PR tree-optimization/109238 */
/* { dg-do compile } */
/* { dg-options "-O2 -Wall" } */

void foo (void *) __attribute__((noreturn));
void bar (void *);

void
baz (void *p)
{
  void *c = __builtin_realloc (p, 16);
  if (c)
foo (c);
  for (;;)
    bar (__builtin_realloc (p, 8)); /* { dg-bogus "pointer 'p' may be used after '__builtin_realloc'" } */
}

Better than what I've attached in the PR, because this one actually
doesn't contain a leak.  If the first realloc succeeds, foo can still free it
and exit; if the first realloc fails, bar will be called with the result of
the second realloc and can exit there too.  Oh, it would need a global
variable from the caller to pass p to it in case even the second realloc fails.

Jakub

[OG12][committed] Fortran/OpenMP: Fix 'alloc' and 'from' mapping for allocatable components

2023-03-23 Thread Tobias Burnus


This is about OpenMP's "deep mapping" of allocatable components of derived 
types.

The basic feature is on OG12 (and OG11) but not yet in GCC mainline. The old
submissions are at 
https://gcc.gnu.org/pipermail/gcc-patches/2022-April/593704.html

My plan is to get the whole feature into GCC 14 once trunk has opened (and
after some simpler pending patches have been merged). It requires some
re-diffing to be more digestible.

* * *

OG12: This patch has been committed to the devel/omp/gcc-12 branch as
https://gcc.gnu.org/g:a63735b8034db65a33c359633462accd9d71d3b5

* * *

This patch fixes an issue with 'map(alloc:' and 'map(from:' with
deep mapping of allocatable components - namely:

* For unmapping/copying to the host, the state of unallocated allocatables
  needs to be preserved.
* For mapping to the device ('alloc' and 'from'), we still need to copy
  data to the device to have the array bounds correctly set.

The data pointer (of allocated allocatables) is set as part of allocating
memory on the device ('attach'); thus, this part works.

As described in the patch (cf. comment above the checking function), we
could either copy only the descriptor data (and the NULL for pointers)
or we copy everything (shallowly) which includes this data. As there is
no means to do the former (without changing the refcount), we do the latter.

NOTE: The actual data to which the scalar/array allocatable points to is
not 'to' mapped but only 'alloc'. As that is supposed to be the large data,
copying everything should™ not cause a large performance penalty with real-world
code; it could be even faster than, let's say, copying 5 descriptors separately.

OpenMP spec side: It is not completely clear how the OpenMP spec expects
the copy out to work. Hence, I filed OpenMP Spec Issue #3545.
-
Siemens Electronic Design Automation GmbH; Address: Arnulfstraße 201, 80634 
München; limited liability company (GmbH); Managing Directors: Thomas 
Heurung, Frank Thürauf; Registered office: München; Registration court: 
München, HRB 106955
commit a63735b8034db65a33c359633462accd9d71d3b5
Author: Tobias Burnus 
Date:   Thu Mar 23 18:04:17 2023 +0100

Fortran/OpenMP: Fix 'alloc' and 'from' mapping for allocatable components

Even with 'alloc' and map-entering 'from' mapping, the following should hold.
For explicit mapping, that's already the case; this handles the automatic
deep mapping of allocatable components. Namely:
* On the device, the array bounds (of allocated allocatables) must match the
  host, implying 'to' (or 'tofrom') mapping.
* On map exiting, the copying out shall not destroy the unallocated allocation
  status (nor the pointer address of allocated allocatables).

The latter was not a problem for allocated allocatables as for those a pointer
was GOMP_MAP_ATTACHed; however, for unallocated allocatables, before it copied
back device-allocated memory which might not be nullified.

While 'alloc' was not deep-mapped at all, for map-entering 'from', the array
bounds were not set, making allocated derived-type components inaccessible on
the device (and wrong on the host on copy back).

The solution is, first, to deep-map 'alloc' as well and to copy to the device
even with 'alloc' and (map-entering) 'from'. This copying is only done if there
is a scalar (for the unallocated case) or array allocatable directly in the
derived type and then it is shallowly copied; the data pointed to is then again
only alloc'ed, unless it contains in turn allocatables.

gcc/fortran/

* trans-openmp.cc (gfc_has_alloc_comps): Add 'bool
shallow_alloc_only=false' arg.
(gfc_omp_replace_alloc_by_to_mapping): New, call it.
(gfc_omp_deep_map_kind_p): Return 'true' also for '(present,)alloc'.
(gfc_omp_deep_mapping_item, gfc_omp_deep_mapping_do): On map entering,
replace shallowly 'alloc'/'from' by '(from)to' mapping if there are
allocatable components.

libgomp/

* testsuite/libgomp.fortran/map-alloc-comp-8.f90: New test.
---
 gcc/fortran/ChangeLog.omp  |  10 +
 gcc/fortran/trans-openmp.cc|  96 +++-
 libgomp/ChangeLog.omp  |   4 +
 .../testsuite/libgomp.fortran/map-alloc-comp-8.f90 | 268 +
 4 files changed, 371 insertions(+), 7 deletions(-)

diff --git a/gcc/fortran/ChangeLog.omp b/gcc/fortran/ChangeLog.omp
index f7d1f91f178..e3ab2254215 100644
--- a/gcc/fortran/ChangeLog.omp
+++ b/gcc/fortran/ChangeLog.omp
@@ -1,3 +1,13 @@
+2023-03-23  Tobias Burnus  
+
+	* trans-openmp.cc (gfc_has_alloc_comps): Add 'bool
+	shallow_alloc_only=false' arg.
+	(gfc_omp_replace_alloc_by_to_mapping): New, call it.
+	(gfc_omp_deep_map_kind_p): Return 'true' also for '(present,)alloc'.
+	(gfc_omp_deep_mapping_item, gfc_omp_deep_mapping_do): On map entering,
+	repla

Re: Fwd: [V5][PATCH 2/2] Update documentation to clarify a GCC extension

2023-03-23 Thread Joseph Myers

On Thu, 23 Mar 2023, Qing Zhao via Gcc-patches wrote:

> +Wgnu-variable-sized-type-not-at-end
> +C C++ Var(warn_variable_sized_type_not_at_end) Warning
> +Warn about structures or unions with C99 flexible array members are not
> +at the end of a structure.

I think there's at least one word missing here, e.g. "that" before "are".

> +Please use warning option  @option{-Wgnu-variable-sized-type-not-at-end} to
> +identify all such cases in the source code and modify them.  This extension
> +will be deprecated from gcc in the next release.

We don't generally say "in the next release" in the manual (or "deprecated 
from gcc").  Maybe it *is* deprecated, maybe it will be *removed*, or will 
*start to warn by default*, in some specified version number (giving a 
version number seems better than "next release"), but "will be deprecated" 
is odd.

-- 
Joseph S. Myers
jos...@codesourcery.com

[PATCH RFC] c-family: -Wsequence-point and COMPONENT_REF [PR107163]

2023-03-23 Thread Jason Merrill via Gcc-patches

Tested x86_64-pc-linux-gnu.  Jakub, does this make sense to you?  Do we have a
way of testing for compile-hog regressions?

-- 8< --

The patch for PR91415 fixed -Wsequence-point to treat shifts and ARRAY_REF
as sequenced in C++17, and COMPONENT_REF as well.  But this is unnecessary
for COMPONENT_REF, since the RHS is just a FIELD_DECL with no actual
evaluation, and in this testcase handling COMPONENT_REF as sequenced blows
up fast in a deep inheritance tree.

PR c++/107163

gcc/c-family/ChangeLog:

* c-common.cc (verify_tree): Don't use sequenced handling
for COMPONENT_REF.
---
 gcc/c-family/c-common.cc | 1 -
 1 file changed, 1 deletion(-)

diff --git a/gcc/c-family/c-common.cc b/gcc/c-family/c-common.cc
index bfb950e56db..a803cf94c68 100644
--- a/gcc/c-family/c-common.cc
+++ b/gcc/c-family/c-common.cc
@@ -2154,7 +2154,6 @@ verify_tree (tree x, struct tlist **pbefore_sp, struct tlist **pno_sp,
 
 case LSHIFT_EXPR:
 case RSHIFT_EXPR:
-case COMPONENT_REF:
 case ARRAY_REF:
   if (cxx_dialect >= cxx17)
goto sequenced_binary;

base-commit: 4872e46e080c6695dfe1f9dc9db26b4703bc348c
-- 
2.31.1

[COMMITTED] testsuite: Xfail gcc.dg/tree-ssa/ssa-fre-100.c for ! natural_alignment_32

2023-03-23 Thread Hans-Peter Nilsson via Gcc-patches

Tested native x86_64-linux and cris-elf.  The "recent patch
to gcc.dg/tree-ssa/pr100359.c" refers to r13-6838.
Committed as obvious after that commit.
-- >8 --
The test gcc.dg/tree-ssa/ssa-fre-100.c fails the
scan-tree-dump-not fre1 "baz" for at least m68k-linux,
pru-elf, and cris-elf according to posts on gcc-testresults.

GCC requires int-size-alignment for a target to see through
the "int *" dereference and perform value-numbering.  See
comments in PR91419 and also the recent patch to
gcc.dg/tree-ssa/pr100359.c  This is a flaw in gcc rather
than the target, so prefer an xfail rather than skipping
the test.

* gcc.dg/tree-ssa/ssa-fre-100.c: XFAIL for ! natural_alignment_32.
---
 gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-100.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-100.c b/gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-100.c
index ead76548f3df..1b6a3a398a4e 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-100.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-100.c
@@ -22,4 +22,4 @@ void foo (int *p, int n)
   while (--n);
 }
 
-/* { dg-final { scan-tree-dump-not "baz" "fre1" } } */
+/* { dg-final { scan-tree-dump-not "baz" "fre1" { xfail { ! natural_alignment_32 } } } } */
-- 
2.30.2

Re: [PATCH RFC] c-family: -Wsequence-point and COMPONENT_REF [PR107163]

2023-03-23 Thread Jakub Jelinek via Gcc-patches

On Thu, Mar 23, 2023 at 04:35:07PM -0400, Jason Merrill wrote:
> Tested x86_64-pc-linux-gnu.  Jakub, does this make sense to you?  Do we have a
> way of testing for compile-hog regressions?
> 
> -- 8< --
> 
> The patch for PR91415 fixed -Wsequence-point to treat shifts and ARRAY_REF
> as sequenced in C++17, and COMPONENT_REF as well.  But this is unnecessary
> for COMPONENT_REF, since the RHS is just a FIELD_DECL with no actual
> evaluation, and in this testcase handling COMPONENT_REF as sequenced blows
> up fast in a deep inheritance tree.
> 
>   PR c++/107163
> 
> gcc/c-family/ChangeLog:
> 
>   * c-common.cc (verify_tree): Don't use sequenced handling
>   for COMPONENT_REF.

When we touch this for COMPONENT_REF, shouldn't we then handle it as
unary, given that the second operand is a FIELD_DECL and the third/fourth
will likely be NULL, and even if not, they aren't user expressions that
should be inspected?
So, instead of doing this, do:
case COMPONENT_REF:
  x = TREE_OPERAND (x, 0);
  writer = 0;
  goto restart;
?

As for compile-hog, it depends on how long the test takes to compile before
and after the fix.  If before the fix it can exceed the normal timeout on
reasonably fast machines and after the fix it takes a few seconds, great.  If
after the fix it would take longer but still not horribly long, one way to do
it is to guard the test with the run_expensive_tests effective target.  Another
way is to keep the test smaller in complexity normally, add
// { dg-additional-options "-DEXPENSIVE" { target run_expensive_tests } } 
and make it more complex under #ifdef EXPENSIVE.
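
A minimal sketch of that second option follows.  It is a hypothetical
testcase, not the one from the PR: the names Base, Derived, read_twice and
both DEPTH values are made up for illustration, and the depths that actually
separate "fast" from "hog" would have to be measured.

```cpp
// Hypothetical testcase sketch; names and depths are illustrative only.
// { dg-do compile }
// { dg-additional-options "-Wsequence-point" }
// { dg-additional-options "-DEXPENSIVE" { target run_expensive_tests } }

#ifdef EXPENSIVE
# define DEPTH 400   // deeper hierarchy only when expensive tests are enabled
#else
# define DEPTH 50    // cheap default so ordinary test runs stay fast
#endif

struct Base { int i; };

// Each level derives from the previous one, so reaching Base::i from the
// most-derived type goes through a long chain of base subobjects.
template <int N> struct Derived : Derived<N - 1> { };
template <> struct Derived<0> : Base { };

// Member accesses on the deep hierarchy, the kind of expression that
// -Wsequence-point has to analyze.
int read_twice (Derived<DEPTH> &d)
{
  d.i = 42;
  return d.i + d.i;
}
```

The same source then serves both as a quick regression test and, with
-DEXPENSIVE, as the heavier compile-time check.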

Jakub

[PATCH 1/2] c++: improve "NTTP argument considered unused" fix [PR53164, PR105848]

2023-03-23 Thread Patrick Palka via Gcc-patches

r13-995-g733a792a2b2e16 worked around the problem of FUNCTION_DECL
template arguments not always getting marked as odr-used by redundantly
calling mark_used on the substituted ADDR_EXPR callee of a CALL_EXPR.
This is just a narrow workaround however, since using a FUNCTION_DECL as
a template argument alone should constitute an odr-use; we shouldn't
need to subsequently e.g. call the function or take its address.

This patch fixes this in a more general way at template specialization
time by walking the template arguments of the specialization and calling
mark_used on all entities used within.  As before, the call to mark_used
is at worst a no-op, but it compensates for the situation where we end up
forming a specialization from a template context in which mark_used is
inhibited.  Another approach would be to call mark_used whenever we
substitute a TEMPLATE_PARM_INDEX, but that would result in many more
redundant calls to mark_used compared to this approach.
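
As a rough illustration of the idea only (not of the GCC implementation), the
sketch below walks a toy argument tree once per node and marks every
declaration it finds, doing nothing while still in a template context.  Node,
mark_args_used and the fields used here are hypothetical stand-ins; the real
code operates on GCC trees via cp_walk_tree_without_duplicates and mark_used.

```cpp
#include <unordered_set>
#include <vector>

// Toy stand-in for a tree node; every name here is hypothetical.
struct Node {
  bool is_decl = false;            // plays the role of DECL_P
  bool used = false;               // set by the mark_used stand-in
  int mark_calls = 0;              // how often this node was marked
  std::vector<Node *> operands;    // sub-expressions
};

// Visit each node reachable from ROOT exactly once and mark declarations as
// used.  While the arguments are still dependent there is nothing to mark.
void mark_args_used (Node *root, bool in_template_context)
{
  if (in_template_context)
    return;

  std::unordered_set<Node *> seen;
  std::vector<Node *> stack{root};
  while (!stack.empty ())
    {
      Node *n = stack.back ();
      stack.pop_back ();
      if (!n || !seen.insert (n).second)
        continue;                  // null or already visited
      if (n->is_decl)
        {
          n->used = true;
          ++n->mark_calls;
        }
      for (Node *op : n->operands)
        stack.push_back (op);
    }
}
```

Visiting each node only once is what keeps the number of redundant marking
calls low even when the same entity appears in several arguments.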

Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK for
trunk?

PR c++/53164
PR c++/105848

gcc/cp/ChangeLog:

* pt.cc (instantiate_class_template): Call
mark_template_arguments_used.
(tsubst_copy_and_build) <case CALL_EXPR>: Revert r13-995 change.
(mark_template_arguments_used): Define.
(instantiate_template): Call mark_template_arguments_used.

gcc/testsuite/ChangeLog:

* g++.dg/template/fn-ptr3a.C: New test.
* g++.dg/template/fn-ptr4.C: New test.
---
 gcc/cp/pt.cc | 51 
 gcc/testsuite/g++.dg/template/fn-ptr3a.C | 25 
 gcc/testsuite/g++.dg/template/fn-ptr4.C  | 14 +++
 3 files changed, 74 insertions(+), 16 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/template/fn-ptr3a.C
 create mode 100644 gcc/testsuite/g++.dg/template/fn-ptr4.C

diff --git a/gcc/cp/pt.cc b/gcc/cp/pt.cc
index 7e4a8de0c8b..9b3cc1c 100644
--- a/gcc/cp/pt.cc
+++ b/gcc/cp/pt.cc
@@ -220,6 +220,7 @@ static tree make_argument_pack (tree);
 static tree enclosing_instantiation_of (tree tctx);
 static void instantiate_body (tree pattern, tree args, tree d, bool nested);
 static tree maybe_dependent_member_ref (tree, tree, tsubst_flags_t, tree);
+static void mark_template_arguments_used (tree);
 
 /* Make the current scope suitable for access checking when we are
processing T.  T can be FUNCTION_DECL for instantiated function
@@ -12142,6 +12143,9 @@ instantiate_class_template (tree type)
   cp_unevaluated_operand = 0;
   c_inhibit_evaluation_warnings = 0;
 }
+
+  mark_template_arguments_used (INNERMOST_TEMPLATE_ARGS (args));
+
   /* Use #pragma pack from the template context.  */
   saved_maximum_field_alignment = maximum_field_alignment;
   maximum_field_alignment = TYPE_PRECISION (pattern);
@@ -21173,22 +21177,10 @@ tsubst_copy_and_build (tree t,
  }
 
/* Remember that there was a reference to this entity.  */
-   if (function != NULL_TREE)
- {
-   tree inner = function;
-   if (TREE_CODE (inner) == ADDR_EXPR
-   && TREE_CODE (TREE_OPERAND (inner, 0)) == FUNCTION_DECL)
- /* We should already have called mark_used when taking the
-address of this function, but do so again anyway to make
-sure it's odr-used: at worst this is a no-op, but if we
-obtained this FUNCTION_DECL as part of ahead-of-time overload
-resolution then that call to mark_used wouldn't have marked it
-odr-used yet (53164).  */
- inner = TREE_OPERAND (inner, 0);
-   if (DECL_P (inner)
-   && !mark_used (inner, complain) && !(complain & tf_error))
- RETURN (error_mark_node);
- }
+   if (function != NULL_TREE
+   && DECL_P (function)
+   && !mark_used (function, complain) && !(complain & tf_error))
+ RETURN (error_mark_node);
 
if (!maybe_fold_fn_template_args (function, complain))
  return error_mark_node;
@@ -21883,6 +21875,31 @@ check_instantiated_args (tree tmpl, tree args, tsubst_flags_t complain)
   return result;
 }
 
+/* Call mark_used on each entity within the template arguments ARGS of some
+   template specialization, to ensure that each such entity is considered
+   odr-used regardless of whether the specialization was first formed in a
+   template context.
+
+   This function assumes push_to_top_level has been called beforehand, and
+   that processing_template_decl has been set iff the template arguments
+   are dependent.  */
+
+static void
+mark_template_arguments_used (tree args)
+{
+  gcc_checking_assert (TMPL_ARGS_DEPTH (args) == 1);
+
+  if (processing_template_decl)
+return;
+
+  auto mark_used_r = [](tree *tp, int *, void *) {
+if (DECL_P (*tp))
+  mark_used (*tp, tf_none);
+return NULL_TREE;
+  };
+  cp_walk_tree_without_duplicates (&args, mark_used_r, nullptr);
+}
+
 /* We're out o

[PATCH 2/2] c++: duplicate "use of deleted fn" diagnostic [PR106880]

2023-03-23 Thread Patrick Palka via Gcc-patches

Here we're issuing a duplicate diagnostic for the use of the deleted
foo, first from the CALL_EXPR case of tsubst_copy_and_build (which
doesn't exit early upon failure), and again from build_over_call
when rebuilding the substituted CALL_EXPR.

We can fix this by exiting early upon failure of the first call, but
this first call should always be redundant since build_over_call (or
another subroutine of finish_call_expr) ought to reliably call mark_used
for a suitable DECL_P callee anyway.

So this patch just gets rid of the first call to mark_used.

Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK for
trunk?

PR c++/106880

gcc/cp/ChangeLog:

* pt.cc (tsubst_copy_and_build) <case CALL_EXPR>: Don't call
mark_used.

gcc/testsuite/ChangeLog:

* g++.dg/cpp0x/deleted16.C: New test.
---
 gcc/cp/pt.cc   |  6 --
 gcc/testsuite/g++.dg/cpp0x/deleted16.C | 11 +++
 2 files changed, 11 insertions(+), 6 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/cpp0x/deleted16.C

diff --git a/gcc/cp/pt.cc b/gcc/cp/pt.cc
index 9b3cc1c..060d2d38504 100644
--- a/gcc/cp/pt.cc
+++ b/gcc/cp/pt.cc
@@ -21176,12 +21176,6 @@ tsubst_copy_and_build (tree t,
  }
  }
 
-   /* Remember that there was a reference to this entity.  */
-   if (function != NULL_TREE
-   && DECL_P (function)
-   && !mark_used (function, complain) && !(complain & tf_error))
- RETURN (error_mark_node);
-
if (!maybe_fold_fn_template_args (function, complain))
  return error_mark_node;
 
diff --git a/gcc/testsuite/g++.dg/cpp0x/deleted16.C b/gcc/testsuite/g++.dg/cpp0x/deleted16.C
new file mode 100644
index 000..93cfb51eb3d
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp0x/deleted16.C
@@ -0,0 +1,11 @@
+// PR c++/106880
+// Verify we don't emit a "use of deleted function" diagnostic twice.
+// { dg-do compile { target c++11 } }
+
+void foo() = delete;
+
+template<class T>
+void f(T t) { foo(t); } // { dg-bogus "deleted function.*deleted function" }
+// { dg-error "deleted function" "" { target *-*-*} .-1 }
+
+template void f(int);
-- 
2.40.0.130.g27d43aaaf5

Re: [PATCH 1/2] c++: improve "NTTP argument considered unused" fix [PR53164, PR105848]

2023-03-23 Thread Patrick Palka via Gcc-patches

On Thu, Mar 23, 2023 at 5:18 PM Patrick Palka  wrote:
>
> r13-995-g733a792a2b2e16 worked around the problem of FUNCTION_DECL
> template arguments not always getting marked as odr-used by redundantly
> calling mark_used on the substituted ADDR_EXPR callee of a CALL_EXPR.
> This is just a narrow workaround however, since using a FUNCTION_DECL as
> a template argument alone should constitute an odr-use; we shouldn't
> need to subsequently e.g. call the function or take its address.
>
> This patch fixes this in a more general way at template specialization
> time by walking the template arguments of the specialization and calling
> mark_used on all entities used within.  As before, the call to mark_used
> is at worst a no-op, but it compensates for the situation where we end up
> forming a specialization from a template context in which mark_used is
> inhibited.  Another approach would be to call mark_used whenever we
> substitute a TEMPLATE_PARM_INDEX, but that would result in many more
> redundant calls to mark_used compared to this approach.

Note we previously discussed this TEMPLATE_PARM_INDEX approach
here https://gcc.gnu.org/pipermail/gcc-patches/2022-June/596257.html
though I never pushed it, since it felt like overkill to me that every single
substituted use of an NTTP would be considered for marking;
perhaps this approach might be preferable?  Yet another approach would
be to do it from tsubst_template_args.

>
> Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK for
> trunk?
>
> PR c++/53164
> PR c++/105848
>
> gcc/cp/ChangeLog:
>
> * pt.cc (instantiate_class_template): Call
> mark_template_arguments_used.
> (tsubst_copy_and_build) <case CALL_EXPR>: Revert r13-995 change.
> (mark_template_arguments_used): Define.
> (instantiate_template): Call mark_template_arguments_used.
>
> gcc/testsuite/ChangeLog:
>
> * g++.dg/template/fn-ptr3a.C: New test.
> * g++.dg/template/fn-ptr4.C: New test.
> ---
>  gcc/cp/pt.cc | 51 
>  gcc/testsuite/g++.dg/template/fn-ptr3a.C | 25 
>  gcc/testsuite/g++.dg/template/fn-ptr4.C  | 14 +++
>  3 files changed, 74 insertions(+), 16 deletions(-)
>  create mode 100644 gcc/testsuite/g++.dg/template/fn-ptr3a.C
>  create mode 100644 gcc/testsuite/g++.dg/template/fn-ptr4.C
>
> diff --git a/gcc/cp/pt.cc b/gcc/cp/pt.cc
> index 7e4a8de0c8b..9b3cc1c 100644
> --- a/gcc/cp/pt.cc
> +++ b/gcc/cp/pt.cc
> @@ -220,6 +220,7 @@ static tree make_argument_pack (tree);
>  static tree enclosing_instantiation_of (tree tctx);
>  static void instantiate_body (tree pattern, tree args, tree d, bool nested);
>  static tree maybe_dependent_member_ref (tree, tree, tsubst_flags_t, tree);
> +static void mark_template_arguments_used (tree);
>
>  /* Make the current scope suitable for access checking when we are
> processing T.  T can be FUNCTION_DECL for instantiated function
> @@ -12142,6 +12143,9 @@ instantiate_class_template (tree type)
>cp_unevaluated_operand = 0;
>c_inhibit_evaluation_warnings = 0;
>  }
> +
> +  mark_template_arguments_used (INNERMOST_TEMPLATE_ARGS (args));
> +
>/* Use #pragma pack from the template context.  */
>saved_maximum_field_alignment = maximum_field_alignment;
>maximum_field_alignment = TYPE_PRECISION (pattern);
> @@ -21173,22 +21177,10 @@ tsubst_copy_and_build (tree t,
>   }
>
> /* Remember that there was a reference to this entity.  */
> -   if (function != NULL_TREE)
> - {
> -   tree inner = function;
> -   if (TREE_CODE (inner) == ADDR_EXPR
> -   && TREE_CODE (TREE_OPERAND (inner, 0)) == FUNCTION_DECL)
> - /* We should already have called mark_used when taking the
> -address of this function, but do so again anyway to make
> -sure it's odr-used: at worst this is a no-op, but if we
> -obtained this FUNCTION_DECL as part of ahead-of-time overload
> -resolution then that call to mark_used wouldn't have marked 
> it
> -odr-used yet (53164).  */
> - inner = TREE_OPERAND (inner, 0);
> -   if (DECL_P (inner)
> -   && !mark_used (inner, complain) && !(complain & tf_error))
> - RETURN (error_mark_node);
> - }
> +   if (function != NULL_TREE
> +   && DECL_P (function)
> +   && !mark_used (function, complain) && !(complain & tf_error))
> + RETURN (error_mark_node);
>
> if (!maybe_fold_fn_template_args (function, complain))
>   return error_mark_node;
> @@ -21883,6 +21875,31 @@ check_instantiated_args (tree tmpl, tree args, 
> tsubst_flags_t complain)
>return result;
>  }
>
> +/* Call mark_used on each entity within the template arguments ARGS of some
> +   template specialization, to ensure that each such entity is considered
> +   odr-used regardless of whether the

[pushed] c++: constexpr PMF conversion [PR105996]

2023-03-23 Thread Jason Merrill via Gcc-patches

Tested x86_64-pc-linux-gnu, applying to trunk.

-- 8< --

Here, we were calling build_reinterpret_cast regardless of whether there was
actually a cast, and that now sets REINTERPRET_CAST_P.  But that
optimization seems dodgy anyway, as it involves NOP_EXPR from one
RECORD_TYPE to another and we try to reserve NOP_EXPR for fundamental types.
And the generated code seems the same, so let's drop it.  And also strip
location wrappers.

PR c++/105996

gcc/cp/ChangeLog:

* typeck.cc (build_ptrmemfunc): Drop 0-offset optimization
and location wrappers.

gcc/testsuite/ChangeLog:

* g++.dg/cpp0x/constexpr-pmf3.C: New test.
---
 gcc/cp/typeck.cc| 13 +
 gcc/testsuite/g++.dg/cpp0x/constexpr-pmf3.C | 13 +
 2 files changed, 18 insertions(+), 8 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/cpp0x/constexpr-pmf3.C

diff --git a/gcc/cp/typeck.cc b/gcc/cp/typeck.cc
index afb956087ce..8b60cbbc167 100644
--- a/gcc/cp/typeck.cc
+++ b/gcc/cp/typeck.cc
@@ -9960,18 +9960,15 @@ build_ptrmemfunc (tree type, tree pfn, int force, bool c_cast_p,
   if (n == error_mark_node)
return error_mark_node;
 
+  STRIP_ANY_LOCATION_WRAPPER (pfn);
+
   /* We don't have to do any conversion to convert a
 pointer-to-member to its own type.  But, we don't want to
 just return a PTRMEM_CST if there's an explicit cast; that
 cast should make the expression an invalid template argument.  */
-  if (TREE_CODE (pfn) != PTRMEM_CST)
-   {
- if (same_type_p (to_type, pfn_type))
-   return pfn;
- else if (integer_zerop (n) && TREE_CODE (pfn) != CONSTRUCTOR)
-   return build_reinterpret_cast (input_location, to_type, pfn, 
-   complain);
-   }
+  if (TREE_CODE (pfn) != PTRMEM_CST
+ && same_type_p (to_type, pfn_type))
+   return pfn;
 
   if (TREE_SIDE_EFFECTS (pfn))
pfn = save_expr (pfn);
diff --git a/gcc/testsuite/g++.dg/cpp0x/constexpr-pmf3.C b/gcc/testsuite/g++.dg/cpp0x/constexpr-pmf3.C
new file mode 100644
index 000..14daea312b7
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp0x/constexpr-pmf3.C
@@ -0,0 +1,13 @@
+// PR c++/105996
+// { dg-do compile { target c++11 } }
+
+struct A {
+  void CB() {}
+};
+struct B : public A { };
+
+using APMF = void (A::*)();
+using BPMF = void (B::*)();
+
+constexpr APMF foo () { return &A::CB; };
+static constexpr BPMF b = foo();

base-commit: 3fbeff66684d95417646aaa22d0a8f1ec9786299
-- 
2.31.1

[PATCH] Fix native MSYS2 build failure [PR108865, PR109188]

2023-03-23 Thread Costas Argyris via Gcc-patches

Patch to fix native Windows MSYS2 package build failures.

The patch has been confirmed by both original reporters in the
PRs, and myself.

cc'd mingw-w64 maintainer for mingw-w64-specific issue.


0001-Fix-native-MSYS2-build-failure-PR108865-PR109188.patch
Description: Binary data

Re: [PATCH] rtl-optimization: ppc backend generates unnecessary signed extension.

2023-03-23 Thread Jeff Law via Gcc-patches





On 3/23/23 10:53, Peter Bergner wrote:

On 3/23/23 11:32 AM, Jeff Law via Gcc-patches wrote:

On 3/23/23 10:29, Peter Bergner wrote:

I'm sorry that I don't know how REE works.  Why can't it optimize this?
I see in the REE dump:

(insn 20 18 22 3 (set (reg:DI 4 4)
(zero_extend:DI (reg:QI 4 4 [orig:120 cD.3556+3 ] [120]))) 
"pr41742.c":6:41 8 {zero_extendqidi2} (nil))
(call_insn 22 20 41 3 (parallel [
  (set (reg:DI 3 3)
   (call (mem:SI (symbol_ref:DI ("memset") [flags 0x41]  
) [0 memsetD.1196 S4 A8])
  (const_int 64 [0x40])))
  (use (const_int 0 [0]))
  (clobber (reg:DI 96 lr)) ...

Is there a reason why REE cannot see that our (reg:QI 4) is a param register
and thus due to our ABI, already correctly sign/zero extended?


I don't think REE has ever considered exploiting ABI constraints. Handling
that might be a notable improvement on various targets.  It'd be a great
place to do some experimentation.


Ok, so sounds like a good follow-on project after this patch is reviewed
and committed (stage1).  Thanks for your input!

Agreed.  I suspect that risc-v will benefit from such work as well.
With that in mind, if y'all start poking at this, please loop in Raphael
(on cc) who's expressed an interest in this space.


Jeff

Re: [PATCH] vect: Check that vector factor is a compile-time constant

2023-03-23 Thread Jeff Law via Gcc-patches





On 3/17/23 10:57, Palmer Dabbelt wrote:



I'm a little bit confused about what the proposal is here: is the idea 
to have a branch based on gcc-13 where we coordinate work before it 
lands on trunk, or a branch based on gcc-13 where we backport 
autovec-related patches once they've landed on trunk?  In my mind those 
are actually two different things and I think they're both useful, maybe 
we should just do both?

I was thinking it was a branch to coordinate backports.  We could also
have a branch to coordinate development before it lands on the trunk.



The former provides a base for those who might want a stable gcc-13 
based compiler, but with RVV support.  The latter is more focused on 
ongoing development.






That implies we need to identify the principals.  I'll suggest Kito,
Juzhe, Michael and myself as the initial list.  I'm certainly open to
others joining.


+Vineet, who's been handling our internal GCC branches.

OK.




Sorry if that throws a bit of a wrench in the works.

No worries at all.



Just for context: in Rivos land we don't have any specific timelines 
around 13, so the goal on our end is just to keep the vectorization 
stuff progressing smoothly as we spin up more engineering resources on 
it.  Our aim is just to get everything on trunk eventually, anything 
else is just a stop-gap and we can work around it (though sharing that 
work is always a win).

We don't have hard time lines (yet), but I can work backwards from
various plans and conclude that Ventana will need a gcc-13 with vector 
backports, hence my original focus on that aspect of the coordination 
problem.


Thanks for raising the need for a development coordination branch.

jeff

Re: [PATCH v1] [RFC] Improve folding for comparisons with zero in tree-ssa-forwprop.

2023-03-23 Thread Jeff Law via Gcc-patches





On 3/20/23 08:01, Manolis Tsamis wrote:

On Fri, Mar 17, 2023 at 10:31 AM Richard Biener
 wrote:


On Thu, Mar 16, 2023 at 4:27 PM Manolis Tsamis  wrote:


For this C testcase:

void g();
void f(unsigned int *a)
{
   if (++*a == 1)
 g();
}

GCC will currently emit a comparison with 1 by using the value
of *a after the increment. This can be improved by comparing
against 0 and using the value before the increment. As a result
there is a potentially shorter dependency chain (no need to wait
for the result of +1) and on targets with compare zero instructions
the generated code is one instruction shorter.
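
To make the equivalence concrete, here is a small sketch of the two forms at
the source level.  The function names are made up for illustration; the
proposed patch performs the rewrite on GIMPLE inside the compiler, not on
source code.

```cpp
// Current form: compare the incremented value against 1.
bool inc_then_compare_one (unsigned int *a)
{
  return ++*a == 1;
}

// Rewritten form: compare the value from before the increment against 0.
// The two agree for every input, including wraparound, because
// old + 1 == 1 (modulo 2^N) holds exactly when old == 0.
bool compare_zero_then_inc (unsigned int *a)
{
  unsigned int old = *a;
  *a = old + 1;
  return old == 0;
}
```

Note that the second form keeps both the old and the new value live at the
same time, which is the register-pressure concern raised in the reply below.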


The downside is we now need two registers and their lifetime overlaps.

Your patch mixes changing / inverting a parameter (which seems unneeded
for the actual change) with preferring compares against zero.



Indeed. I thought that without that change the original names wouldn't properly
describe what the parameter actually does and that's why I've changed it.
I can undo that in the next revision.

Typically the thing to do is send that separately.  If it has no 
functional change, then it can often go in immediately.






What's the reason to specifically prefer compares against zero?  On x86
we have add that sets flags, so ++*a == 0 would be preferred, but
for your sequence we'd need a test reg, reg; branch on zero, so we do
not save any instruction.



My reasoning is that zero is treated preferentially  in most if not
all architectures. Some specifically have zero/non-zero comparisons so
we get one less instruction. X86 doesn't explicitly have that but I
think that test reg, reg may not be always needed depending on the
rest of the code. By what Andrew mentions below there may even be
optimizations for zero in the microarchitecture level.

There's all kinds of low level ways a test against zero is better than a 
test against any other value.  I'm not aware of any architecture where 
the opposite is true.


Note that in this specific case rewriting does cause us to need two 
registers, so we'll want to think about the right time to make this 
transformation.  It may be the case that doing it in gimple is too early.






Because this is still an arch-specific thing I initially tried to make
it arch-dependent by invoking the target's cost functions (e.g. if I
recall correctly aarch64 will return a lower cost for zero
comparisons). But the code turned out complicated and messy so I came
up with this alternative that just treats zero preferentially.

If you have in mind a way that this can be done in a better way I
could try to implement it.

And in general I think you approached this in the preferred way -- it's 
largely a target independent optimization, so let's tackle it in a 
generic way.


Anyway, we'll dive into it once gcc-14 development opens and try to 
figure out the best way forward.


jeff

[PATCH] RISC-V: Optimize load memory data in rv64

2023-03-23 Thread Feng Wang

This patch optimizes loading one byte or halfword from memory on rv64.
Please refer to the following test case for loading one byte.
int sextb32_memory(int* x)
{ return (*x << 24) >> 24; }

The build flags are "-march=rv64g -mabi=lp64d -O2"
The current compilation results are as follows,

slliw a0,a0,0x18
sraiw a0,a0,0x18
ret

The compilation results after picking this patch are as follows,
lb a0,0(a0)
ret
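
As a sanity check on why a single sign-extending byte load is enough, the
sketch below compares the shift form with a direct sign extension of the low
byte.  The helper names are made up for illustration, and the shift is done on
an unsigned value to keep the example free of signed-overflow concerns; on a
little-endian target the low byte is the one at offset 0, which is what
`lb a0,0(a0)` reads.

```cpp
#include <cstdint>

// Shift form from the testcase, with the left shift performed on an unsigned
// value.  The final >> relies on arithmetic right shift of negative values,
// which GCC provides.
int sextb32_shift (const int *x)
{
  std::uint32_t shifted = static_cast<std::uint32_t> (*x) << 24;
  return static_cast<std::int32_t> (shifted) >> 24;
}

// Sign-extend only the low byte, i.e. the value a sign-extending byte load
// of that byte would produce.
int sextb32_byte (const int *x)
{
  return static_cast<std::int8_t> (static_cast<std::uint32_t> (*x) & 0xffu);
}
```

Both helpers return the same value for every input, so the shift pair in the
current output carries no information beyond what the byte load already gives.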

The issue was introduced by the patch
"RISC-V: Avoid zero/sign extend for volatile loads. Fix for 97417."
That patch expands
(set (reg:QI/HI/SI target) (mem:QI/HI/SI (address)))
to
(set (reg:DI temp) (zero_extend:DI (mem:QI/HI/SI (address))))
(set (reg:QI/HI/SI target) (subreg:QI/HI/SI (reg:DI temp) 0))
There is no problem with this transformation for QI and HI.
However, it affects the subsequent combine processing for SI.
So I modified this operation to take effect only for QI and HI.

gcc/ChangeLog:

* config/riscv/riscv.cc (riscv_legitimize_move): Modify length judgment.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rv64-load-byte.c: New test.
* gcc.target/riscv/rv64-load-halfword.c: New test.
---
 gcc/config/riscv/riscv.cc   | 2 +-
 gcc/testsuite/gcc.target/riscv/rv64-load-byte.c | 8 
 gcc/testsuite/gcc.target/riscv/rv64-load-halfword.c | 8 
 3 files changed, 17 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rv64-load-byte.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rv64-load-halfword.c

diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index 1db12091b5a..4b596c7bb5b 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -2074,7 +2074,7 @@ riscv_legitimize_move (machine_mode mode, rtx dest, rtx src)
(set (reg:QI target) (subreg:QI (reg:DI temp) 0))
  with auto-sign/zero extend.  */
   if (GET_MODE_CLASS (mode) == MODE_INT
-  && GET_MODE_SIZE (mode).to_constant () < UNITS_PER_WORD
+  && GET_MODE_SIZE (mode).to_constant () < MIN_UNITS_PER_WORD
   && can_create_pseudo_p ()
   && MEM_P (src))
 {
diff --git a/gcc/testsuite/gcc.target/riscv/rv64-load-byte.c b/gcc/testsuite/gcc.target/riscv/rv64-load-byte.c
new file mode 100644
index 000..929aac79993
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rv64-load-byte.c
@@ -0,0 +1,8 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64g -mabi=lp64d -O2" } */
+/* { dg-skip-if "" { *-*-* } { "-O0" "-Og" } } */
+
+int sextb32_memory(int* x)
+{ return (*x << 24) >> 24; }
+
+/* { dg-final { scan-assembler "lb" } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rv64-load-halfword.c b/gcc/testsuite/gcc.target/riscv/rv64-load-halfword.c
new file mode 100644
index 000..94e1bd7e135
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rv64-load-halfword.c
@@ -0,0 +1,8 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64g -mabi=lp64d -O2" } */
+/* { dg-skip-if "" { *-*-* } { "-O0" "-Og" } } */
+
+int sexth32_memory(int* x)
+{ return (*x << 16) >> 16; }
+
+/* { dg-final { scan-assembler "lh" } } */
-- 
2.17.1

[PATCH] RISC-V: Optimize zbb ins sext.b and sext.h in rv64

2023-03-23 Thread Feng Wang

This patch optimizes the combine processing for sext.b/h on rv64.
Please refer to the following test case,
int sextb32(int x)
{ return (x << 24) >> 24; }

The rtl expression is as follows,
(insn 6 3 7 2 (set (reg:SI 138)
(ashift:SI (subreg/s/u:SI (reg/v:DI 136 [ xD.2271 ]) 0)
(const_int 24 [0x18]))) "sextb.c":2:13 195 {ashlsi3}
 (expr_list:REG_DEAD (reg/v:DI 136 [ xD.2271 ])
(nil)))
(insn 7 6 8 2 (set (reg:SI 137)
(ashiftrt:SI (reg:SI 138)
(const_int 24 [0x18]))) "sextb.c":2:20 196 {ashrsi3}
 (expr_list:REG_DEAD (reg:SI 138)
(nil)))

During the combine phase, they will combine into
(set (reg:SI 137)
(ashiftrt:SI (subreg:SI (ashift:DI (reg:DI 140)
(const_int 24 [0x18])) 0)
(const_int 24 [0x18])))

The optimal combine result is
(set (reg:SI 137)
(sign_extend:SI (subreg:QI (reg:DI 140) 0)))
This can be converted to the sext ins.

Because of the subreg, the current processing can't obtain the
immediate of the left shift.  We need to peel off another layer
of rtl to obtain it.
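
The "peel off one layer" idea can be sketched with a toy expression type.
Everything below (ToyCode, ToyRtx, find_left_shift) is a hypothetical stand-in
rather than GCC's RTL API, and it deliberately ignores machine modes and the
other cases that the real extract_left_shift handles.

```cpp
// Toy stand-ins for RTL codes; all names here are hypothetical.
enum ToyCode { TOY_REG, TOY_SUBREG, TOY_ASHIFT };

struct ToyRtx {
  ToyCode code;
  const ToyRtx *operand;  // inner expression, or nullptr for a register
  int shift_count;        // only meaningful for TOY_ASHIFT
};

// Return the value being shifted left by COUNT, looking through at most one
// SUBREG wrapper; return nullptr when X is not such a shift.
const ToyRtx *find_left_shift (const ToyRtx *x, int count)
{
  if (x->code == TOY_SUBREG && x->operand)
    x = x->operand;                       // peel the subreg layer
  if (x->code == TOY_ASHIFT && x->shift_count == count)
    return x->operand;
  return nullptr;
}
```

Without the peeling step, the lookup stops at the subreg and never sees the
shift count underneath it, which matches the failure described above.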

gcc/ChangeLog:

* combine.cc (extract_left_shift): Add SUBREG case.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/zbb-sext-rv64.c: New test.
---
 gcc/combine.cc |  5 +
 gcc/testsuite/gcc.target/riscv/zbb-sext-rv64.c | 12 
 2 files changed, 17 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/riscv/zbb-sext-rv64.c

diff --git a/gcc/combine.cc b/gcc/combine.cc
index 053879500b7..fb396a3d974 100644
--- a/gcc/combine.cc
+++ b/gcc/combine.cc
@@ -7915,6 +7915,11 @@ extract_left_shift (scalar_int_mode mode, rtx x, int count)
 
   switch (code)
 {
+case SUBREG:
+  x = XEXP (x, 0);
+  if (GET_CODE(x) != ASHIFT)
+break;
+
 case ASHIFT:
   /* This is the shift itself.  If it is wide enough, we will return
 either the value being shifted if the shift count is equal to
diff --git a/gcc/testsuite/gcc.target/riscv/zbb-sext-rv64.c b/gcc/testsuite/gcc.target/riscv/zbb-sext-rv64.c
new file mode 100644
index 000..4086ee56f57
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/zbb-sext-rv64.c
@@ -0,0 +1,12 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64g_zbb -mabi=lp64d -O2" } */
+/* { dg-skip-if "" { *-*-* } { "-O0" "-Og" } } */
+
+int sextb32(int x)
+{ return (x << 24) >> 24; }
+
+int sexth32(int x)
+{ return (x << 16) >> 16; }
+
+/* { dg-final { scan-assembler "sext.b" } } */
+/* { dg-final { scan-assembler "sext.h" } } */
\ No newline at end of file
-- 
2.17.1

RISC-V: Optimize zbb ins sext.b and sext.h in rv64

2023-03-23 Thread juzhe.zh...@rivai.ai

Sounds like you are looking at the redundant extension problem in the RISC-V port.
This is an issue I want to fix, but I haven't found the time to do so.
My first impression is that we need to fix redundant extensions in the "ree" pass.
I am not sure.

Since you are looking at this kind of issue, would you mind looking at this 
issue?
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108016 

Thanks.


juzhe.zh...@rivai.ai

Re: [PATCH] vect: Check that vector factor is a compile-time constant

2023-03-23 Thread Palmer Dabbelt


On Thu, 23 Mar 2023 16:18:20 PDT (-0700), jeffreya...@gmail.com wrote:



On 3/17/23 10:57, Palmer Dabbelt wrote:



I'm a little bit confused about what the proposal is here: is the idea
to have a branch based on gcc-13 where we coordinate work before it
lands on trunk, or a branch based on gcc-13 where we backport
autovec-related patches once they've landed on trunk?  In my mind those
are actually two different things and I think they're both useful, maybe
we should just do both?

I was thinking it was a branch to coordinate backports.  We could also
have a branch to coordinate development before it lands on the trunk.


The former provides a base for those who might want a stable gcc-13
based compiler, but with RVV support.  The latter is more focused on
ongoing development.


Yep, just two different things.


That implies we need to identify the principals.  I'll suggest Kito,
Juzhe, Michael and myself as the initial list.  I'm certainly open to
others joining.


+Vineet, who's been handling our internal GCC branches.

OK.




Sorry if that throws a bit of a wrench in the works.

No worries at all.



Just for context: in Rivos land we don't have any specific timelines
around 13, so the goal on our end is just to keep the vectorization
stuff progressing smoothly as we spin up more engineering resources on
it.  Our aim is just to get everything on trunk eventually, anything
else is just a stop-gap and we can work around it (though sharing that
work is always a win).

We don't have hard time lines (yet), but I can work backwards from
various plans and conclude that Ventana will need a gcc-13 with vector
backports, hence my original focus on that aspect of the coordination
problem.


OK.  We don't have a hard need there, but it'll make life easier so I'm 
happy to just treat it like a real shipping branch if you guys are going 
to as well.


Are you OK just having a single "gcc-13 with RISC-V performance 
backports" branch, or do you want just vector backports?  Our internal 
branch would be all performance-related backports, but no big deal if 
the upstream stuff is vector-only as that's probably going to be 90%+ of 
the churn.



Thanks for raising the need for a development coordination branch.


I guess "need" is kind of strong: IMO it's up to the people actually 
doing the work how to organize the branches.  I'm not writing the code 
here so I'm happy with whatever, just pointing out that there's two 
different things that could be done ;)



jeff

Re: [PATCH] RISC-V: Optimize zbb ins sext.b and sext.h in rv64

2023-03-23 Thread Feng Wang

Hi Juzhe,

Thank you for your reply. I am indeed doing some optimization work right now.
I am very interested in the question you raised, and I will take the time 
to try to optimize it.
I hope to communicate with you and learn more from you in the future.

Thanks.

Re: [PATCH][stage1] Remove conditionals around free()

2023-03-23 Thread NightStrike via Gcc-patches

On Fri, Mar 3, 2023 at 10:14 PM Jerry D via Fortran wrote:

> I am certainly not a C++ expert but it seems to me this all begs for
> automatic finalization where one would not have to invoke free at all.
> I suspect the gfortran frontend is not designed for such things.

+1 for RAII
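
For context on the patch named in the subject line: ISO C defines free (NULL) as a no-op, which is why a guard such as "if (p) free (p);" is redundant. A minimal sketch follows; struct buf and buf_release are hypothetical names used only for illustration and are not taken from the gfortran sources.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical owner of one heap allocation; not a GCC type.  */
struct buf
{
  char *data;
};

/* Release the buffer.  No "if (b->data)" guard is needed because
   free (NULL) does nothing.  Resetting the pointer makes a second
   call harmless as well.  */
static void
buf_release (struct buf *b)
{
  assert (b != NULL);
  free (b->data);
  b->data = NULL;
}
```

Calling buf_release on an already-released or never-allocated buffer is therefore safe, which is the property the unguarded form relies on.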

[GCC14 QUEUE PATCH] RISC-V: Fix RVV register order

2023-03-23 Thread juzhe . zhong

From: Juzhe-Zhong 

This patch fixes the incorrect register allocation order for RVV.
The new register order comes from Kito's original RVV GCC implementation.

Consider this case:
void f (void *base, void *base2, void *out, size_t vl, int n)
{
  vuint64m8_t bindex = __riscv_vle64_v_u64m8 (base + 100, vl);
  for (int i = 0; i < n; i++)
    {
      vbool8_t m = __riscv_vlm_v_b8 (base + i, vl);
      vuint64m8_t v = __riscv_vluxei64_v_u64m8_m (m, base, bindex, vl);
      vuint64m8_t v2 = __riscv_vle64_v_u64m8_tu (v, base2 + i, vl);
      vint8m1_t v3 = __riscv_vluxei64_v_i8m1_m (m, base, v, vl);
      vint8m1_t v4 = __riscv_vluxei64_v_i8m1_m (m, base, v2, vl);
      __riscv_vse8_v_i8m1 (out + 100 * i, v3, vl);
      __riscv_vse8_v_i8m1 (out + 222 * i, v4, vl);
    }
}

Before this patch:
f:
csrr t0,vlenb
slli t1,t0,3
sub sp,sp,t1
addi a5,a0,100
vsetvli zero,a3,e64,m8,ta,ma
vle64.v v24,0(a5)
vs8r.v  v24,0(sp)
ble a4,zero,.L1
mv  a6,a0
add a4,a4,a0
mv  a5,a2
.L3:
vsetvli zero,zero,e64,m8,ta,ma
vl8re64.v   v24,0(sp)
vlm.v   v0,0(a6)
vluxei64.v  v24,(a0),v24,v0.t
addi a6,a6,1
vsetvli zero,zero,e8,m1,tu,ma
vmv8r.v v16,v24
vluxei64.v  v8,(a0),v24,v0.t
vle64.v v16,0(a1)
vluxei64.v  v24,(a0),v16,v0.t
vse8.v  v8,0(a2)
vse8.v  v24,0(a5)
addi a1,a1,1
addi a2,a2,100
addi a5,a5,222
bne a4,a6,.L3
.L1:
csrr t0,vlenb
slli t1,t0,3
add sp,sp,t1
jr  ra

After this patch:
f:
addi a5,a0,100
vsetvli zero,a3,e64,m8,ta,ma
vle64.v v24,0(a5)
ble a4,zero,.L1
mv  a6,a0
add a4,a4,a0
mv  a5,a2
.L3:
vsetvli zero,zero,e64,m8,ta,ma
vlm.v   v0,0(a6)
addi a6,a6,1
vluxei64.v  v8,(a0),v24,v0.t
vsetvli zero,zero,e8,m1,tu,ma
vmv8r.v v16,v8
vluxei64.v  v2,(a0),v8,v0.t
vle64.v v16,0(a1)
vluxei64.v  v1,(a0),v16,v0.t
vse8.v  v2,0(a2)
vse8.v  v1,0(a5)
addi a1,a1,1
addi a2,a2,100
addi a5,a5,222
bne a4,a6,.L3
.L1:
ret

The redundant register spills are eliminated.
However, one more issue remains to be addressed: the redundant move
instruction "vmv8r.v". That will be fixed by a separate patch
(Fine tune RVV machine description RA constraint).
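
The reordering made by the riscv.h hunk in this patch can be pictured with a small standalone sketch. It assumes, based on the comments in the hunk, that hard register numbers 96..127 correspond to v0..v31; the names V_REG_FIRST, V_REG_COUNT, build_vector_alloc_order and alloc_priority are made up for illustration and do not exist in GCC.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption taken from the hunk comments: v0 is hard register 96
   and v31 is hard register 127.  */
enum { V_REG_FIRST = 96, V_REG_COUNT = 32 };

/* Fill ORDER with the vector part of the allocation order after the
   patch: v1..v31 first, then v0 (the mask register) last.  */
static void
build_vector_alloc_order (int order[V_REG_COUNT])
{
  size_t n = 0;
  for (int v = 1; v < V_REG_COUNT; v++)
    order[n++] = V_REG_FIRST + v;
  order[n++] = V_REG_FIRST; /* v0 is tried last */
  assert (n == V_REG_COUNT);
}

/* Return the position of hard register REGNO in ORDER (lower means
   preferred earlier), or -1 if it is not a vector register.  */
static int
alloc_priority (const int order[V_REG_COUNT], int regno)
{
  for (int i = 0; i < V_REG_COUNT; i++)
    if (order[i] == regno)
      return i;
  return -1;
}
```

Under this ordering v0 has the lowest preference, so the allocator reaches for it only when nothing else is free, which leaves it available for mask operands such as the "v0.t" uses in the listings above.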

gcc/ChangeLog:

* config/riscv/riscv.h (enum reg_class): Fix RVV register order.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/base/spill-4.c: Adapt testcase.
* gcc.target/riscv/rvv/base/spill-6.c: Adapt testcase.
* gcc.target/riscv/rvv/base/reg_order-1.c: New test.

---
 gcc/config/riscv/riscv.h  | 13 
 .../gcc.target/riscv/rvv/base/reg_order-1.c   | 20 
 .../gcc.target/riscv/rvv/base/spill-4.c   | 32 +--
 .../gcc.target/riscv/rvv/base/spill-6.c   | 16 +-
 4 files changed, 50 insertions(+), 31 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/reg_order-1.c

diff --git a/gcc/config/riscv/riscv.h b/gcc/config/riscv/riscv.h
index 66fb07d6652..13038a39e5c 100644
--- a/gcc/config/riscv/riscv.h
+++ b/gcc/config/riscv/riscv.h
@@ -553,13 +553,12 @@ enum reg_class
   60, 61, 62, 63,  \
   /* Call-saved FPRs.  */  \
   40, 41, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59,  \
-  /* V24 ~ V31.  */\
-  120, 121, 122, 123, 124, 125, 126, 127,  \
-  /* V8 ~ V23.  */ \
-  104, 105, 106, 107, 108, 109, 110, 111,  \
-  112, 113, 114, 115, 116, 117, 118, 119,  \
-  /* V0 ~ V7.  */  \
-  96, 97, 98, 99, 100, 101, 102, 103,  \
+  /* v1 ~ v31 vector registers.  */\
+  97, 98, 99, 100, 101, 102, 103, 104, 105, 106, 107, 108, 109, 110,   \
+  111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, \
+  124, 125, 126, 127,  \
+  /* The vector mask register.  */ \
+  96,  \
   /* None of the remaining classes have defined call-saved \
  registers.  */\
   64, 65, 66, 67   \
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/reg_order-1.c 
b/gcc/testsuite/gcc.target/

[GCC14 QUEUE PATCH] RISC-V: Fix RVV register order

2023-03-23 Thread juzhe . zhong

From: Juzhe-Zhong 

Co-authored-by: kito-cheng 
Co-authored-by: kito-cheng 

This patch fixes the incorrect register allocation order for RVV.
The new register order comes from Kito's original RVV GCC implementation.

Consider this case:
void f (void *base, void *base2, void *out, size_t vl, int n)
{
  vuint64m8_t bindex = __riscv_vle64_v_u64m8 (base + 100, vl);
  for (int i = 0; i < n; i++)
    {
      vbool8_t m = __riscv_vlm_v_b8 (base + i, vl);
      vuint64m8_t v = __riscv_vluxei64_v_u64m8_m (m, base, bindex, vl);
      vuint64m8_t v2 = __riscv_vle64_v_u64m8_tu (v, base2 + i, vl);
      vint8m1_t v3 = __riscv_vluxei64_v_i8m1_m (m, base, v, vl);
      vint8m1_t v4 = __riscv_vluxei64_v_i8m1_m (m, base, v2, vl);
      __riscv_vse8_v_i8m1 (out + 100 * i, v3, vl);
      __riscv_vse8_v_i8m1 (out + 222 * i, v4, vl);
    }
}

Before this patch:
f:
csrr t0,vlenb
slli t1,t0,3
sub sp,sp,t1
addi a5,a0,100
vsetvli zero,a3,e64,m8,ta,ma
vle64.v v24,0(a5)
vs8r.v  v24,0(sp)
ble a4,zero,.L1
mv  a6,a0
add a4,a4,a0
mv  a5,a2
.L3:
vsetvli zero,zero,e64,m8,ta,ma
vl8re64.v   v24,0(sp)
vlm.v   v0,0(a6)
vluxei64.v  v24,(a0),v24,v0.t
addi a6,a6,1
vsetvli zero,zero,e8,m1,tu,ma
vmv8r.v v16,v24
vluxei64.v  v8,(a0),v24,v0.t
vle64.v v16,0(a1)
vluxei64.v  v24,(a0),v16,v0.t
vse8.v  v8,0(a2)
vse8.v  v24,0(a5)
addi a1,a1,1
addi a2,a2,100
addi a5,a5,222
bne a4,a6,.L3
.L1:
csrr t0,vlenb
slli t1,t0,3
add sp,sp,t1
jr  ra

After this patch:
f:
addi a5,a0,100
vsetvli zero,a3,e64,m8,ta,ma
vle64.v v24,0(a5)
ble a4,zero,.L1
mv  a6,a0
add a4,a4,a0
mv  a5,a2
.L3:
vsetvli zero,zero,e64,m8,ta,ma
vlm.v   v0,0(a6)
addi a6,a6,1
vluxei64.v  v8,(a0),v24,v0.t
vsetvli zero,zero,e8,m1,tu,ma
vmv8r.v v16,v8
vluxei64.v  v2,(a0),v8,v0.t
vle64.v v16,0(a1)
vluxei64.v  v1,(a0),v16,v0.t
vse8.v  v2,0(a2)
vse8.v  v1,0(a5)
addi a1,a1,1
addi a2,a2,100
addi a5,a5,222
bne a4,a6,.L3
.L1:
ret

The redundant register spills are eliminated.
However, one more issue remains to be addressed: the redundant move
instruction "vmv8r.v". That will be fixed by a separate patch
(Fine tune RVV machine description RA constraint).

gcc/ChangeLog:

* config/riscv/riscv.h (enum reg_class): Fix RVV register order.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/base/spill-4.c: Adapt testcase.
* gcc.target/riscv/rvv/base/spill-6.c: Adapt testcase.
* gcc.target/riscv/rvv/base/reg_order-1.c: New test.

Signed-off-by: Ju-Zhe Zhong 
Co-authored-by: kito-cheng 
Co-authored-by: kito-cheng 

---
 gcc/config/riscv/riscv.h  | 13 
 .../gcc.target/riscv/rvv/base/reg_order-1.c   | 20 
 .../gcc.target/riscv/rvv/base/spill-4.c   | 32 +--
 .../gcc.target/riscv/rvv/base/spill-6.c   | 16 +-
 4 files changed, 50 insertions(+), 31 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/reg_order-1.c

diff --git a/gcc/config/riscv/riscv.h b/gcc/config/riscv/riscv.h
index 66fb07d6652..13038a39e5c 100644
--- a/gcc/config/riscv/riscv.h
+++ b/gcc/config/riscv/riscv.h
@@ -553,13 +553,12 @@ enum reg_class
   60, 61, 62, 63,  \
   /* Call-saved FPRs.  */  \
   40, 41, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59,  \
-  /* V24 ~ V31.  */\
-  120, 121, 122, 123, 124, 125, 126, 127,  \
-  /* V8 ~ V23.  */ \
-  104, 105, 106, 107, 108, 109, 110, 111,  \
-  112, 113, 114, 115, 116, 117, 118, 119,  \
-  /* V0 ~ V7.  */  \
-  96, 97, 98, 99, 100, 101, 102, 103,  \
+  /* v1 ~ v31 vector registers.  */\
+  97, 98, 99, 100, 101, 102, 103, 104, 105, 106, 107, 108, 109, 110,   \
+  111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, \
+  124, 125, 126, 127,  \
+  /* The vector mask register.  */ \
+  96,  \
   /* None of the remaining classes have defined call-saved \
  registers.  */\
   64, 65, 66, 67
