date:20230326

[PATCH] RISC-V: Add Z*inx incompatible check in gcc.

2023-03-26 Thread Jiawei

Z*inx conflicts with the floating-point extensions, so add an incompatibility
check for when z*inx and hard_float are both enabled.

gcc/ChangeLog:

* config/riscv/riscv.cc (riscv_option_override): New check.

---
 gcc/config/riscv/riscv.cc | 4 
 1 file changed, 4 insertions(+)

diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index 76eee4a55e9..162ba14d3c7 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -6285,6 +6285,10 @@ riscv_option_override (void)
   && riscv_abi != ABI_LP64 && riscv_abi != ABI_ILP32E)
 error ("z*inx requires ABI ilp32, ilp32e or lp64");
 
+  // Zfinx is conflict with float extensions.
+  if (TARGET_ZFINX && TARGET_HARD_FLOAT)
+error ("z*inx is conflict with float extensions");
+
   /* We do not yet support ILP32 on RV64.  */
   if (BITS_PER_WORD != POINTER_SIZE)
 error ("ABI requires %<-march=rv%d%>", POINTER_SIZE);
-- 
2.25.1
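
For readers who want the shape of the new check outside of GCC, here is a
minimal standalone sketch. It is not GCC code: the function name
zinx_conflict_message and its two boolean parameters are hypothetical
stand-ins for TARGET_ZFINX and TARGET_HARD_FLOAT, and the message text is
only illustrative.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the test added to riscv_option_override:
   returns a diagnostic string when both options are enabled, or NULL
   when the combination is acceptable.  */
static const char *
zinx_conflict_message (bool target_zfinx, bool target_hard_float)
{
  if (target_zfinx && target_hard_float)
    return "z*inx conflicts with floating-point extensions";
  return NULL;
}
```

Only the combination with both flags set yields a message; every other
combination returns NULL.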

[pushed] doc: Remove anachronistic note related to languages built

2023-03-26 Thread Gerald Pfeifer

Jonathan's patch 
  https://gcc.gnu.org/pipermail/gcc-patches/2022-November/604796.html
last November made me have a look for further instances, and indeed there
was another one referring to separate tarballs (which we have not been
shipping for a fair bit).

Since the item above already refers to `gcc -v` we can simply drop the
entire list item.

Pushed.

Gerald
---

This is another instance of what ce51e8439a49 (and originally
05432288d4e5) addressed in a different part. We stopped shipping
granular tarballs years ago.

gcc/ChangeLog:

* doc/install.texi: Remove anachronistic note
related to languages built and separate source tarballs.
---
 gcc/doc/install.texi | 7 ---
 1 file changed, 7 deletions(-)

diff --git a/gcc/doc/install.texi b/gcc/doc/install.texi
index 63fc949b447..15aef1394f4 100644
--- a/gcc/doc/install.texi
+++ b/gcc/doc/install.texi
@@ -3481,13 +3481,6 @@ The output of @samp{gcc -v} for your newly installed @command{gcc}.
 This tells us which version of GCC you built and the options you passed to
 configure.
 
-@item
-Whether you enabled all languages or a subset of them.  If you used a
-full distribution then this information is part of the configure
-options in the output of @samp{gcc -v}, but if you downloaded the
-``core'' compiler plus additional front ends then it isn't apparent
-which ones you built unless you tell us about it.
-
 @item
 If the build was for GNU/Linux, also include:
 @itemize @bullet
-- 
2.39.2

[wwwdocs] Add Ada's GCC13 changelog entry

2023-03-26 Thread Fernando Oleo Blanco via Gcc-patches

Hi all,

a bit belated but just like last year, I've made a patch for the Ada
entry in the changelog. You can find the patch attached to this email.

If I have forgotten anything relevant or if I have done something
incorrectly, please, say so.

Best regards,
Fernando Oleo Blanco

From d273bb1835c1ef23e15d422bed22ca5d333cbdae Mon Sep 17 00:00:00 2001
From: Fernando Oleo Blanco 
Date: Sun, 26 Mar 2023 14:20:36 +0200
Subject: [PATCH 1/1] [PATCH] Add Ada's entry in the v13 changelog

Signed-off-by: Fernando Oleo Blanco 
---
 htdocs/gcc-13/changes.html | 11 ++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/htdocs/gcc-13/changes.html b/htdocs/gcc-13/changes.html
index ff70d2ee..2e25bcf5 100644
--- a/htdocs/gcc-13/changes.html
+++ b/htdocs/gcc-13/changes.html
@@ -160,7 +160,16 @@ a work-in-progress.
 
 New Languages and Language specific improvements
 
-
+Ada
+
+  Traceback support added in RTEMS for the PPC ELF and ARM architectures.
+  Support for versions older than VxWorks 7 has been removed.
+  General improvements to the contracts in the standard libraries.
+  Addition of GNAT.Binary_Search.
+  Further additions and fixes for the Ada 2022 specification.
+  The Pragma SPARK_Mode=>Auto is now accepted. Contract analysis has been further improved.
+  Documentation improvements.
+
 
 C family
 
-- 
2.40.0

[PATCH] c++, coroutines: Stabilize names of promoted slot vars [PR101118].

2023-03-26 Thread Iain Sandoe via Gcc-patches

Tested on x86_64-darwin21, x86-64-linux-gnu
OK for trunk?
Iain

When we need to 'promote' a value (i.e. store it in the coroutine frame) it
is given a frame entry name.  This was based on the DECL_UID for slot vars.
However, when LTO is used, the names from multiple TUs become visible at the
same time, and the DECL_UIDs usually differ between units.  This leads to an
"ODR mismatch" warning for the frame type.

The fix here is to use a counter instead of the DECL_UID which makes a name
that is stable between TUs for each frame layout (one per coroutine func).

Signed-off-by: Iain Sandoe 

PR c++/101118

gcc/cp/ChangeLog:

* coroutines.cc: Add counter for promoted slot vars.
(flatten_await_stmt): Use slot vars counter instead of DECL_UID
to generate the frame entry name for promoted target expression
slot variables.
(morph_fn_to_coro): Reset the slot vars counter at the start of
each coroutine function.
---
 gcc/cp/coroutines.cc | 8 +++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/gcc/cp/coroutines.cc b/gcc/cp/coroutines.cc
index a2189e43db8..359a5bf46ff 100644
--- a/gcc/cp/coroutines.cc
+++ b/gcc/cp/coroutines.cc
@@ -2726,6 +2726,11 @@ struct var_nest_node
   var_nest_node *else_cl;
 };
 
+/* This is used to make a stable, but unique-per-function, sequence number for
+   each TARGET_EXPR slot variable that we 'promote' to a frame entry.  It needs
+   to be stable because the frame type is visible to LTO ODR checking.  */
+static unsigned tmpno = 0;
+
 /* This is called for single statements from the co-await statement walker.
It checks to see if the statement contains any initializers for awaitables
and if any of these capture items by reference.  */
@@ -2889,7 +2894,7 @@ flatten_await_stmt (var_nest_node *n, hash_set<tree> *promoted,
  tree init = t;
  temps_used->add (init);
  tree var_type = TREE_TYPE (init);
- char *buf = xasprintf ("D.%d", DECL_UID (TREE_OPERAND (init, 0)));
+ char *buf = xasprintf ("T%03u", tmpno++);
  tree var = build_lang_decl (VAR_DECL, get_identifier (buf), var_type);
  DECL_ARTIFICIAL (var) = true;
  free (buf);
@@ -4374,6 +4379,7 @@ morph_fn_to_coro (tree orig, tree *resumer, tree *destroyer)
 {
   gcc_checking_assert (orig && TREE_CODE (orig) == FUNCTION_DECL);
 
+  tmpno = 0;
   *resumer = error_mark_node;
   *destroyer = error_mark_node;
   if (!coro_function_valid_p (orig))
-- 
2.37.1 (Apple Git-137.1)
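
The naming scheme in the patch can be tried in isolation with the small
sketch below. It assumes nothing about GCC internals: slot_counter,
begin_coroutine and next_slot_name are hypothetical names that only mirror
the roles of tmpno, the reset in morph_fn_to_coro, and the xasprintf call.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical per-function counter, playing the role of tmpno.  */
static unsigned slot_counter = 0;

/* Call at the start of each (hypothetical) coroutine function so that
   numbering restarts from zero for every frame layout.  */
static void
begin_coroutine (void)
{
  slot_counter = 0;
}

/* Write the next frame-entry name into BUF, using the same "T%03u"
   format string as the patch.  */
static void
next_slot_name (char *buf, size_t len)
{
  assert (buf != NULL && len >= 5);
  snprintf (buf, len, "T%03u", slot_counter++);
}
```

Because the counter restarts for each function, the sequence of names
depends only on the order of promotions within that function and not on
any identifier that differs from one translation unit to the next.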

Re: [PATCH] predict: Don't emit -Wsuggest-attribute=cold warning for functions which already have that attribute [PR105685]

2023-03-26 Thread Jeff Law via Gcc-patches





On 3/25/23 03:53, Jakub Jelinek via Gcc-patches wrote:

Hi!

In the following testcase, we predict baz to have cold
entry regardless of the user supplied attribute (as it unconditionally
calls a cold function), but still issue
a -Wsuggest-attribute=cold warning despite it having that attribute
already.

The following patch avoids that.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2023-03-25  Jakub Jelinek  

PR ipa/105685
* predict.cc (compute_function_frequency): Don't call
warn_function_cold if function already has cold attribute.

* c-c++-common/cold-2.c: New test.

OK
jeff

Re: [PATCH] tree-optimization/109237 - last_stmt is possibly slow

2023-03-26 Thread Jeff Law via Gcc-patches





On 3/22/23 06:29, Richard Biener via Gcc-patches wrote:

Most uses of last_stmt are interested in control transfer stmts
and for the testcase gimple_purge_dead_eh_edges shows up in
the profile.  But last_stmt looks past trailing debug stmts, even though
those would be rejected by GIMPLE's verify_flow_info.  The following
adds possible_ctrl_stmt besides last_stmt which does not look
past trailing debug stmts and adjusts gimple_purge_dead_eh_edges.

I've put checking code into possible_ctrl_stmt to verify that it will not
miss a control statement if the real last stmt is a debug stmt.

The alternative would be to change last_stmt, explicitly introducing
last_nondebug_stmt.  I remember we chickened out and made last_stmt
conservative here, without anticipating the compile-time issues this
creates.  I count 227 last_stmt and 12 last_and_only_stmt uses.

Bootstrapped and tested on x86_64-unknown-linux-gnu.

Any opinions?  I probably lean towards s/last_stmt/last_nondebug_stmt/
in next stage1 and then adding last_stmt and changing some
uses back - though for maintenance that's going to be a
nightmare (or maybe not, a "wrong" last_stmt should be safe to
backport and a last_nondebug_stmt will fail to build).

Sounds quite sensible to me.  227+12 isn't terrible and I bet the vast
majority should be safe for last_nondebug_stmt.





Richard.

PR tree-optimization/109237
* tree-cfg.h (possible_ctrl_stmt): New function returning
the last stmt not skipping debug stmts.
(gimple_purge_dead_eh_edges): Use it.

OK
jeff
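
The difference between the two lookups discussed above can be sketched on
a toy statement list. This is an illustration only: the enum and both
function names are hypothetical, and a basic block is modeled as a plain
array rather than GCC's statement iterators.

```c
#include <assert.h>
#include <stddef.h>

enum stmt_kind { STMT_DEBUG, STMT_ASSIGN, STMT_COND };

/* Models the existing behaviour described for last_stmt: walk backwards
   past trailing debug statements.  Returns the index of the last
   non-debug statement, or -1 if there is none.  The cost grows with the
   number of trailing debug statements.  */
static int
last_nondebug_index (const enum stmt_kind *stmts, int n)
{
  assert (stmts != NULL || n == 0);
  for (int i = n - 1; i >= 0; i--)
    if (stmts[i] != STMT_DEBUG)
      return i;
  return -1;
}

/* Models the proposed cheaper lookup: return the very last statement
   without skipping anything (constant time), or -1 for an empty block.  */
static int
last_raw_index (int n)
{
  return n - 1;
}
```

When a block ends in debug statements the two functions disagree, which is
the case the checking code mentioned in the mail is meant to guard.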

Re: [PATCH] match.pd: Fix up fneg/fadd simplification [PR109230]

2023-03-26 Thread Jeff Law via Gcc-patches





On 3/22/23 04:16, Jakub Jelinek via Gcc-patches wrote:

Hi!

The following testcase is miscompiled on aarch64-linux.  match.pd
has a simplification for addsub, where it negates one of the vectors
viewed as a vector of floating point elements twice as large (effectively
negating every other element) and then does the addition.
But a requirement for that is that the permutation picks the right elements,
in particular 0, nelts+1, 2, nelts+3, 4, nelts+5, ...
The pattern tests this with sel.series_p (0, 2, 0, 2) check, which as
documented verifies that the even elements of the permutation mask are
identity, but doesn't say anything about the others.
The following patch fixes it by also checking that the odd elements
start at nelts + 1 with the same step of 2.

Bootstrapped/regtested on aarch64-linux, x86_64-linux and i686-linux,
ok for trunk?

2023-03-22  Jakub Jelinek  

PR tree-optimization/109230
* match.pd (fneg/fadd simplify): Verify also odd permutation indexes.

* gcc.dg/pr109230.c: New test.

OK
Jeff
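
The permutation requirement described above (0, nelts+1, 2, nelts+3, ...)
can be checked with a small standalone sketch. The helper series_matches
is a simplified, hypothetical model of a series test on a selector array;
it is not the GCC vec_perm_indices implementation, and is_addsub_permutation
is an illustrative name.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified model of a series check: true when
   sel[out_base + k * out_step] == in_base + k * in_step
   for every k that stays inside the selector.  */
static bool
series_matches (const unsigned *sel, size_t nelts,
                size_t out_base, size_t out_step,
                unsigned in_base, unsigned in_step)
{
  assert (sel != NULL && out_step > 0);
  unsigned expected = in_base;
  for (size_t i = out_base; i < nelts; i += out_step, expected += in_step)
    if (sel[i] != expected)
      return false;
  return true;
}

/* Even positions must be the identity and odd positions must start at
   nelts + 1, both with a step of 2.  */
static bool
is_addsub_permutation (const unsigned *sel, size_t nelts)
{
  return series_matches (sel, nelts, 0, 2, 0, 2)
         && series_matches (sel, nelts, 1, 2, (unsigned) nelts + 1, 2);
}
```

With nelts = 4, the selector {0, 5, 2, 7} passes both checks, while the
identity selector {0, 1, 2, 3} passes only the even-element check, which
is the gap the mail describes.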

Re: [PATCH] rtl-optimization/109237 - speedup bb_is_just_return

2023-03-26 Thread Jeff Law via Gcc-patches





On 3/22/23 04:03, Richard Biener via Gcc-patches wrote:

For the testcase bb_is_just_return is on top of the profile, changing
it to walk BB insns backwards puts it off the profile.  That's because
in the forward walk you have to process possibly many debug insns
but in a backward walk you very likely run into control insns first.

Bootstrapped on x86_64-unknown-linux-gnu, testing in progress.

OK?

For the record, the profile was (after the delete_trivially_dead_insns
fix)

Samples: 289K of event 'cycles:u', Event count (approx.): 384226334976
Overhead   Samples  Command  Shared Object Symbol
3.52%  9747  cc1  cc1   [.] bb_is_just_return
#

and after the fix bb_is_just_return has no recorded samples anymore.

Thanks,
Richard.

PR rtl-optimization/109237
* cfgcleanup.cc (bb_is_just_return): Walk insns backwards.

OK.  Sorry if I introduced this hog.

jeff
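
Why the direction of the walk matters can be seen on a toy model. The
sketch below is not the cfgcleanup.cc code: the enum, the function names
and the visited counter are assumptions made for illustration, and a block
counts as "just a return" here when it holds exactly one return plus any
number of debug insns.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum insn_kind { INSN_DEBUG, INSN_RETURN, INSN_OTHER };

/* Forward walk: every leading debug insn is visited before a
   disqualifying insn can be found.  */
static bool
just_return_forward (const enum insn_kind *insns, size_t n, size_t *visited)
{
  assert (visited != NULL);
  bool seen_return = false;
  *visited = 0;
  for (size_t i = 0; i < n; i++)
    {
      ++*visited;
      if (insns[i] == INSN_DEBUG)
        continue;
      if (insns[i] == INSN_RETURN && !seen_return)
        {
          seen_return = true;
          continue;
        }
      return false;
    }
  return seen_return;
}

/* Backward walk: the control insn at the end is seen first, so a
   disqualifying insn near the end stops the walk early.  */
static bool
just_return_backward (const enum insn_kind *insns, size_t n, size_t *visited)
{
  assert (visited != NULL);
  bool seen_return = false;
  *visited = 0;
  for (size_t i = n; i-- > 0; )
    {
      ++*visited;
      if (insns[i] == INSN_DEBUG)
        continue;
      if (insns[i] == INSN_RETURN && !seen_return)
        {
          seen_return = true;
          continue;
        }
      return false;
    }
  return seen_return;
}
```

Both walks give the same answer; for a block made of debug insns followed
by an ordinary insn and a return, the backward walk rejects it after two
insns while the forward walk has to step over every debug insn first.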

Re: [PATCH, committed] Fortran: remove dead code [PR104321]

2023-03-26 Thread Harald Anlauf via Gcc-patches

Hi Paul,

> If you will excuse the British cultural reference, that's a Norwegian Blue 
> alright! Good spot.

ROTFL!

I first had to look up the "Norwegian Blue", and then I remembered.  :)

You're bringing back the fun to gfortran hacking!

Cheers,
Harald

On Sat, 25 Mar 2023 at 19:13, Harald Anlauf via Fortran
<fort...@gcc.gnu.org> wrote:

Dear all,

I've committed the attached patch from the PR that removes
a dead code snippet, see discussion.

Regtested originally by Tobias, and reconfirmed on
x86_64-pc-linux-gnu.

Pushed as r13-6862-gb5fce899dbbd72 .

Thanks,
Harald

 --
"If you can't explain it simply, you don't understand it well enough" - Albert 
Einstein

Re: [PATCH] predict: Don't emit -Wsuggest-attribute=cold warning for functions which already have that attribute [PR105685]

2023-03-26 Thread Jan Hubicka via Gcc-patches

> Hi!
> 
> In the following testcase, we predict baz to have cold
> entry regardless of the user supplied attribute (as it unconditionally
> calls a cold function), but still issue
> a -Wsuggest-attribute=cold warning despite it having that attribute
> already.
> 
> The following patch avoids that.
> 
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
> 
> 2023-03-25  Jakub Jelinek  
> 
>   PR ipa/105685
>   * predict.cc (compute_function_frequency): Don't call
>   warn_function_cold if function already has cold attribute.
> 
>   * c-c++-common/cold-2.c: New test.
> 
> --- gcc/predict.cc.jj 2023-01-02 09:32:38.273055726 +0100
> +++ gcc/predict.cc 2023-03-24 16:54:13.658606215 +0100
> @@ -4033,7 +4033,9 @@ compute_function_frequency (void)
>  }
>  
>node->frequency = NODE_FREQUENCY_UNLIKELY_EXECUTED;
> -  warn_function_cold (current_function_decl);
> +  if (lookup_attribute ("cold", DECL_ATTRIBUTES (current_function_decl))
> +  == NULL)
> +warn_function_cold (current_function_decl);
OK, thanks!
In general we probably want to walk aliases and suggest the warning on
aliases attached to the function, but we get this wrong with other
attributes too, so I will add it to my TODO for next stage1.

Honza
>if (ENTRY_BLOCK_PTR_FOR_FN (cfun)->count.ipa() == profile_count::zero ())
>  return;
>FOR_EACH_BB_FN (bb, cfun)
> --- gcc/testsuite/c-c++-common/cold-2.c.jj 2023-03-24 16:56:07.344000973 +0100
> +++ gcc/testsuite/c-c++-common/cold-2.c 2023-03-24 16:55:58.985119001 +0100
> @@ -0,0 +1,19 @@
> +/* PR ipa/105685 */
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -Wsuggest-attribute=cold" } */
> +
> +extern void foo (char *, char const *, int);
> +
> +__attribute__((cold)) char *
> +bar (int x)
> +{
> +  static char b[42];
> +  foo (b, "foo", x);
> +  return b;
> +}
> +
> +__attribute__((cold)) char *
> +baz (int x)  /* { dg-bogus "function might be candidate for attribute 'cold'" } */
> +{
> +  return bar (x);
> +}
> 
>   Jakub
>
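
The guard added by the patch amounts to "only suggest the attribute when
it is not already present". A standalone sketch of that logic follows; it
models attributes as an array of strings, and has_attribute and
should_suggest_cold are hypothetical names, not GCC functions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical lookup over a list of attribute names.  */
static bool
has_attribute (const char *const *attrs, size_t n, const char *name)
{
  assert (name != NULL);
  for (size_t i = 0; i < n; i++)
    if (strcmp (attrs[i], name) == 0)
      return true;
  return false;
}

/* Suggest "cold" only for functions that are predicted cold and do not
   already carry the attribute.  */
static bool
should_suggest_cold (bool predicted_cold,
                     const char *const *attrs, size_t n)
{
  return predicted_cold && !has_attribute (attrs, n, "cold");
}
```

A function like baz in the test case, which is predicted cold and already
marked cold, therefore produces no suggestion.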

m68k: handle TLS access with offset

2023-03-26 Thread Andreas Schwab

This reinstates FINAL_PRESCAN_INSN, and the calls in handle_move_double,
so that accesses to TLS variables with an offset are properly handled.

gcc:
PR target/106282
* config/m68k/m68k.h (FINAL_PRESCAN_INSN): Define.
* config/m68k/m68k.cc (m68k_final_prescan_insn): Define.
(handle_move_double): Call it before handle_movsi.
* config/m68k/m68k-protos.h: Declare it.

gcc/testsuite:
PR target/106282
* gcc.target/m68k/tls-gd-off.c: New.
* gcc.target/m68k/tls-ie-off.c: New.
* gcc.target/m68k/tls-ld-off.c: New.
* gcc.target/m68k/tls-ld-xtls-off.c: New.
* gcc.target/m68k/tls-le-off.c: New.
* gcc.target/m68k/tls-le-xtls-off.c: New.
* gcc.target/m68k/tls-ld.c: Make pattern less strict.
* gcc.target/m68k/tls-le.c: Likewise.
---
 gcc/config/m68k/m68k-protos.h |  1 +
 gcc/config/m68k/m68k.cc   | 15 +++
 gcc/config/m68k/m68k.h|  3 +++
 .../gcc.target/m68k/{tls-ld.c => tls-gd-off.c}|  7 +++
 .../gcc.target/m68k/{tls-le.c => tls-ie-off.c}|  6 +++---
 .../gcc.target/m68k/{tls-ld.c => tls-ld-off.c}|  8 
 .../m68k/{tls-ld.c => tls-ld-xtls-off.c}  |  8 
 gcc/testsuite/gcc.target/m68k/tls-ld.c|  4 ++--
 .../gcc.target/m68k/{tls-le.c => tls-le-off.c}|  6 +++---
 gcc/testsuite/gcc.target/m68k/tls-le-xtls-off.c   | 13 +
 gcc/testsuite/gcc.target/m68k/tls-le.c|  2 +-
 11 files changed, 52 insertions(+), 21 deletions(-)
 copy gcc/testsuite/gcc.target/m68k/{tls-ld.c => tls-gd-off.c} (52%)
 copy gcc/testsuite/gcc.target/m68k/{tls-le.c => tls-ie-off.c} (62%)
 copy gcc/testsuite/gcc.target/m68k/{tls-ld.c => tls-ld-off.c} (52%)
 copy gcc/testsuite/gcc.target/m68k/{tls-ld.c => tls-ld-xtls-off.c} (57%)
 copy gcc/testsuite/gcc.target/m68k/{tls-le.c => tls-le-off.c} (62%)
 create mode 100644 gcc/testsuite/gcc.target/m68k/tls-le-xtls-off.c

diff --git a/gcc/config/m68k/m68k-protos.h b/gcc/config/m68k/m68k-protos.h
index 60bff796534..724d446af93 100644
--- a/gcc/config/m68k/m68k-protos.h
+++ b/gcc/config/m68k/m68k-protos.h
@@ -84,6 +84,7 @@ extern int emit_move_sequence (rtx *, machine_mode, rtx);
 extern bool m68k_movem_pattern_p (rtx, rtx, HOST_WIDE_INT, bool);
 extern const char *m68k_output_movem (rtx *, rtx, HOST_WIDE_INT, bool);
 extern bool m68k_epilogue_uses (int);
+extern void m68k_final_prescan_insn (rtx_insn *, rtx *, int);
 
 /* Functions from m68k.cc used in constraints.md.  */
 extern rtx m68k_unwrap_symbol (rtx, bool);
diff --git a/gcc/config/m68k/m68k.cc b/gcc/config/m68k/m68k.cc
index 0bff89bc39d..03db2b6a936 100644
--- a/gcc/config/m68k/m68k.cc
+++ b/gcc/config/m68k/m68k.cc
@@ -2550,6 +2550,18 @@ m68k_adjust_decorated_operand (rtx op)
 }
 }
 
+/* Prescan insn before outputing assembler for it.  */
+
+void
+m68k_final_prescan_insn (rtx_insn *insn ATTRIBUTE_UNUSED,
+rtx *operands, int n_operands)
+{
+  int i;
+
+  for (i = 0; i < n_operands; ++i)
+m68k_adjust_decorated_operand (operands[i]);
+}
+
 /* Move X to a register and add REG_EQUAL note pointing to ORIG.
If REG is non-null, use it; generate new pseudo otherwise.  */
 
@@ -3658,6 +3670,7 @@ handle_move_double (rtx operands[2],
 
   /* Normal case: do the two words, low-numbered first.  */
 
+  m68k_final_prescan_insn (NULL, operands, 2);
   handle_movsi (operands);
 
   /* Do the middle one of the three words for long double */
@@ -3668,6 +3681,7 @@ handle_move_double (rtx operands[2],
   if (addreg1)
handle_reg_adjust (addreg1, 4);
 
+  m68k_final_prescan_insn (NULL, middlehalf, 2);
   handle_movsi (middlehalf);
 }
 
@@ -3678,6 +3692,7 @@ handle_move_double (rtx operands[2],
 handle_reg_adjust (addreg1, 4);
 
   /* Do that word.  */
+  m68k_final_prescan_insn (NULL, latehalf, 2);
   handle_movsi (latehalf);
 
   /* Undo the adds we just did.  */
diff --git a/gcc/config/m68k/m68k.h b/gcc/config/m68k/m68k.h
index 6f0bdd8dffa..450c380359c 100644
--- a/gcc/config/m68k/m68k.h
+++ b/gcc/config/m68k/m68k.h
@@ -837,6 +837,9 @@ __transfer_from_trampoline ()   \
   assemble_name ((FILE), (NAME)),  \
   fprintf ((FILE), ",%u\n", (int)(ROUNDED)))
 
+#define FINAL_PRESCAN_INSN(INSN, OPVEC, NOPERANDS) \
+  m68k_final_prescan_insn (INSN, OPVEC, NOPERANDS)
+
 /* On the 68000, we use several CODE characters:
'.' for dot needed in Motorola-style opcode names.
'-' for an operand pushing on the stack:
diff --git a/gcc/testsuite/gcc.target/m68k/tls-ld.c b/gcc/testsuite/gcc.target/m68k/tls-gd-off.c
similarity index 52%
copy from gcc/testsuite/gcc.target/m68k/tls-ld.c
copy to gcc/testsuite/gcc.target/m68k/tls-gd-off.c
index af470c9613a..4af6128ae27 100644
--- a/gcc/testsuite/gcc.target/m68k/tls-ld.c
+++ b/gcc/testsuite/gcc.target/m68k/tls-gd-off.c
@@ -1,14 +1,13 @@
 /* { dg-do compile } */
 /* { dg-skip-if

Re: Re: [PATCH] RISC-V: Optimize zbb ins sext.b and sext.h in rv64

2023-03-26 Thread Feng Wang

On 2023-03-26 02:18  Jeff Law wrote:
>
>
>
>On 3/23/23 20:45, juzhe.zh...@rivai.ai wrote:
>> Sounds like you are looking at redundant extension problem in RISC-V port.
>> This is the issue I want to fix but I don't find the time to do that.
>> My first impression is that we need to fix redundant extension in "ree"
>> PASS.
>> I am not sure.
>It's actually quite a bit more complicated.
>
>Some extension elimination can and probably should be happening in
>gimple. In gimple you have access to type information as well as range
>information.  So you have the opportunity to do things like rewrite the
>IL to use different types when it's safe to do so, or to use range
>information to identify when an object is already properly extended and
>thus eliminate the extension before we expand gimple into RTL.
>
>Once in RTL, you can use forward propagation to eliminate extensions, or
>at least fold them into existing operations.  combine can eliminate
>extensions and it has the ability to track (for example) if the upper
>bits are copies of the sign bit, if they're known zero, etc.  combine is
>also capable of recognizing that a load implicitly extends and using
>that knowledge to eliminate extensions or to discover that a pair of
>shifts are just zero or sign extending a value, etc etc.  combine also
>interacts with simplify-rtx which is used by other passes, so there's a
>chance that work in simplify-rtx can eliminate extensions not just in
>combine, but in other passes as well.
>
>REE is a post-register allocation pass and kind of the last chance to
>eliminate extensions.
>
>So for any given redundant extension, the way to go (IMHO) is to walk
>through the optimizer pipeline to see where it can potentially be
>eliminated.  In general, the earlier in the optimizer pipeline the
>extension can be eliminated, the better.
>
>Jeff 

Hi Jeff, do you think my patch modification is suitable? What else needs to be
improved?
Thanks.

Feng Wang

Re: [PATCH] RTL: Bugfix for wrong code with v16hi compare & mask

2023-03-26 Thread Hongtao Liu via Gcc-patches

On Sun, Mar 26, 2023 at 3:01 AM Jeff Law via Gcc-patches
 wrote:
>
>
>
> On 3/24/23 08:11, pan2.li--- via Gcc-patches wrote:
> > From: Pan Li 
> >
> > Fix the bug of the incorrect code generation for the
> > below code sample.
> >
> > typedef unsigned short __attribute__((__vector_size__ (32))) V;
> > typedef unsigned short u16;
> >
> > void
> > foo (V m, u16 *ret)
> > {
> >   V v = 6 > ((V) { 2049, 8 } & m);
> >   *ret = v[0]; // + a + b + c + d;
> > }
> >
> > Before this patch.
> > addi    sp,sp,-64
> > ld      a5,0(a0)
> > li      a4,528384
> > addi    a4,a4,-2047
> > and     a5,a5,a4
> > // slli a5,a5,48 <- eliminated by mistake
> > // srli a5,a5,48 <- eliminated by mistake
> > sltiu   a5,a5,6
> > negw    a5,a5
> > sh      a5,0(a1)
> >
> > After this patch.
> > addi    sp,sp,-64
> > ld      a5,0(a0)
> > li      a4,528384
> > addi    a4,a4,-2047
> > and     a5,a5,a4
> > slli    a5,a5,48
> > srli    a5,a5,48
> > sltiu   a5,a5,6
> > negw    a5,a5
> > sh      a5,0(a1)
> >
> > The simplify_comparation for the AND operation will
> > try to simplify below RTL code from:
> > (and:DI (subreg:DI (reg:HI 154) 0) (const_int 0x801))
> > to:
> > (subreg:DI (and (reg:HI 154) (const_int 0x801)) 0)
> These look equivalent to me -- assuming they're used as rvalues.
They're equivalent only when WORD_REGISTER_OPERATIONS is in effect; otherwise
the upper bits of the latter are undefined (UD), while those of the former are 0.

 (and (reg:HI 154) (const_int 0x801)) is simplified to (reg:HI 154)
since nonzero_bits (reg:154, HImode) is exactly the same as 0x801.

These two optimizations are fine on their own, but combining them causes
problems. The first optimization relies on WORD_REGISTER_OPERATIONS, but
the second optimizes the AND away, which leaves the upper bits of
(subreg:DI (reg:HI 154) 0) UD, whereas originally they should be 0 after
the AND with (const_int 0x801).
>
>
> >
> > If reg:HI 154 is 0x801 and reg:DI 154 is 0x80801, the RTL will
> > be simplified further to:
> That statement has no meaning.  Each pseudo has one and only one native
> mode and you can only refer to it in that mode.  ie reg:HI 154.  reg:DI
> 154 has no meaning.  You might say that (subreg:DI (reg:HI 154) 0) has
> the value 0x80801, but that's OK.  The subreg says those bits outside
> HImode simply don't matter -- you can not depend on them having any
> particular value.
>
> > (subreg:DI (reg:HI 154) 0)
> I think that's equivalent to (subreg:DI (and:HI (reg:HI 154) (const_int
> 0x801)) 0) when used as an rvalue.
>
> I suspect your problem is elsewhere.
>
> jeff
>


-- 
BR,
Hongtao
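
To make the effect of the dropped slli/srli pair concrete, the tail of the
quoted assembly can be modeled in plain C. This is a sketch under stated
assumptions: lane0_result is a hypothetical name, and the input used in
the note below (lane 0 equal to 0, lane 1 equal to 8) is chosen for
illustration and does not come from the thread. The mask 0x80801 is the
value that "li a4,528384; addi a4,a4,-2047" produces.

```c
#include <stdbool.h>
#include <stdint.h>

/* A5 is the 64-bit register after the AND.  The slli/srli pair
   zero-extends its low 16 bits; sltiu compares against 6; negw and sh
   then store an all-ones or all-zeros halfword.  */
static uint16_t
lane0_result (uint64_t a5, bool keep_zero_extend)
{
  if (keep_zero_extend)
    a5 = (a5 << 48) >> 48;          /* slli a5,a5,48; srli a5,a5,48 */
  uint64_t lt = a5 < 6;             /* sltiu a5,a5,6 */
  return (uint16_t) (0 - lt);       /* negw a5,a5; sh a5,0(a1) */
}
```

With the register holding 0x80000 after the AND (lane 0 is 0, lane 1 is
8), keeping the zero extension gives 0xffff because 0 < 6, whereas
dropping it compares 0x80000 against 6 and gives 0: the bits above the
low halfword leak into the comparison.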

Re: [PATCH] RISC-V: Optimize zbb ins sext.b and sext.h in rv64

2023-03-26 Thread Jeff Law via Gcc-patches





On 3/26/23 19:32, Feng Wang wrote:

On 2023-03-26 02:18  Jeff Law wrote:




On 3/23/23 20:45, juzhe.zh...@rivai.ai wrote:

Sounds like you are looking at redundant extension problem in RISC-V port.
This is the issue I want to fix but I don't find the time to do that.
My first impression is that we need to fix redundant extension in "ree"
PASS.
I am not sure.

It's actually quite a bit more complicated.

Some extension elimination can and probably should be happening in
gimple. In gimple you have access to type information as well as range
information.  So you have the opportunity to do things like rewrite the
IL to use different types when it's safe to do so, or to use range
information to identify when an object is already properly extended and
thus eliminate the extension before we expand gimple into RTL.

Once in RTL, you can use forward propagation to eliminate extensions, or
at least fold them into existing operations.  combine can eliminate
extensions and it has the ability to track (for example) if the upper
bits are copies of the sign bit, if they're known zero, etc.  combine is
also capable of recognizing that a load implicitly extends and using
that knowledge to eliminate extensions or to discover that a pair of
shifts are just zero or sign extending a value, etc etc.  combine also
interacts with simplify-rtx which is used by other passes, so there's a
chance that work in simplify-rtx can eliminate extensions not just in
combine, but in other passes as well.

REE is a post-register allocation pass and kind of the last chance to
eliminate extensions.

So for any given redundant extension, the way to go (IMHO) is to walk
through the optimizer pipeline to see where it can potentially be
eliminated.  In general, the earlier in the optimizer pipeline the
extension can be eliminated, the better.

Jeff

Hi Jeff, do you think my patch modification is suitable? What else needs to be
improved?

I haven't looked at it in any detail.  We're in stage4 right now, so
it's regression bugfixes only going into the tree.  Once gcc-13 branches 
I'll be focused on helping folks move RVV forward, submitting/refining 
various RISC-V patches from Ventana and reviewing other RISC-V related 
patches.


Jeff

[PATCH, rs6000] rs6000: correct vector sign extend built-ins on Big Endian [PR108812]

2023-03-26 Thread HAO CHEN GUI via Gcc-patches

Hi,
  This patch removes the byte reverse operation before vector integer sign
extension on Big Endian. These built-ins are required to sign extend the
rightmost element, so both BE and LE should perform the same operation and the
byte reversal is unnecessary. With this fix the built-ins behave the same on
all compilers. The test case is modified as well.

  The patch passed regression test on Power Linux platforms.

Thanks
Gui Haochen

ChangeLog
rs6000: correct vector sign extend builtins on Big Endian

gcc/
PR target/108812
* config/rs6000/vsx.md (vsignextend_qi_): Remove byte reverse
for Big Endian.
(vsignextend_hi_): Likewise.
(vsignextend_si_v2di): Remove.
* config/rs6000/rs6000-builtins.def (__builtin_altivec_vsignextsw2d):
Set bif-pattern to vsx_sign_extend_si_v2di.

gcc/testsuite/
PR target/108812
* gcc.target/powerpc/p9-sign_extend-runnable.c: Set different expected
vectors for Big Endian.


patch.diff
diff --git a/gcc/config/rs6000/rs6000-builtins.def b/gcc/config/rs6000/rs6000-builtins.def
index f76f54793d7..059a455b388 100644
--- a/gcc/config/rs6000/rs6000-builtins.def
+++ b/gcc/config/rs6000/rs6000-builtins.def
@@ -2699,7 +2699,7 @@
 VSIGNEXTSH2W vsignextend_hi_v4si {}

   const vsll __builtin_altivec_vsignextsw2d (vsi);
-VSIGNEXTSW2D vsignextend_si_v2di {}
+VSIGNEXTSW2D vsx_sign_extend_si_v2di {}

   const vsc __builtin_altivec_vslv (vsc, vsc);
 VSLV vslv {}
diff --git a/gcc/config/rs6000/vsx.md b/gcc/config/rs6000/vsx.md
index 992fbc983be..9e9b33f56ab 100644
--- a/gcc/config/rs6000/vsx.md
+++ b/gcc/config/rs6000/vsx.md
@@ -4941,14 +4941,7 @@ (define_expand "vsignextend_qi_"
 UNSPEC_VSX_SIGN_EXTEND))]
   "TARGET_P9_VECTOR"
 {
-  if (BYTES_BIG_ENDIAN)
-{
-  rtx tmp = gen_reg_rtx (V16QImode);
-  emit_insn (gen_altivec_vrevev16qi2(tmp, operands[1]));
-  emit_insn (gen_vsx_sign_extend_qi_(operands[0], tmp));
-}
-  else
-emit_insn (gen_vsx_sign_extend_qi_(operands[0], operands[1]));
+  emit_insn (gen_vsx_sign_extend_qi_(operands[0], operands[1]));
   DONE;
 })

@@ -4968,14 +4961,7 @@ (define_expand "vsignextend_hi_"
 UNSPEC_VSX_SIGN_EXTEND))]
   "TARGET_P9_VECTOR"
 {
-  if (BYTES_BIG_ENDIAN)
-{
-  rtx tmp = gen_reg_rtx (V8HImode);
-  emit_insn (gen_altivec_vrevev8hi2(tmp, operands[1]));
-  emit_insn (gen_vsx_sign_extend_hi_(operands[0], tmp));
-}
-  else
- emit_insn (gen_vsx_sign_extend_hi_(operands[0], operands[1]));
+  emit_insn (gen_vsx_sign_extend_hi_(operands[0], operands[1]));
   DONE;
 })

@@ -4987,24 +4973,6 @@ (define_insn "vsx_sign_extend_si_v2di"
   "vextsw2d %0,%1"
   [(set_attr "type" "vecexts")])

-(define_expand "vsignextend_si_v2di"
-  [(set (match_operand:V2DI 0 "vsx_register_operand" "=v")
-   (unspec:V2DI [(match_operand:V4SI 1 "vsx_register_operand" "v")]
-UNSPEC_VSX_SIGN_EXTEND))]
-  "TARGET_P9_VECTOR"
-{
-  if (BYTES_BIG_ENDIAN)
-{
-   rtx tmp = gen_reg_rtx (V4SImode);
-
-   emit_insn (gen_altivec_vrevev4si2(tmp, operands[1]));
-   emit_insn (gen_vsx_sign_extend_si_v2di(operands[0], tmp));
-}
-  else
- emit_insn (gen_vsx_sign_extend_si_v2di(operands[0], operands[1]));
-  DONE;
-})
-
 ;; Sign extend DI to TI.  We provide both GPR targets and Altivec targets on
 ;; power10.  On earlier systems, the machine independent code will generate a
 ;; shift left to sign extend the 64-bit value to 128-bit.
diff --git a/gcc/testsuite/gcc.target/powerpc/p9-sign_extend-runnable.c b/gcc/testsuite/gcc.target/powerpc/p9-sign_extend-runnable.c
index fdcad019b96..03c0f1201e4 100644
--- a/gcc/testsuite/gcc.target/powerpc/p9-sign_extend-runnable.c
+++ b/gcc/testsuite/gcc.target/powerpc/p9-sign_extend-runnable.c
@@ -34,7 +34,12 @@ int main ()
   /* test sign extend byte to word */
   vec_arg_qi = (vector signed char) {1, 2, 3, 4, 5, 6, 7, 8,
 -1, -2, -3, -4, -5, -6, -7, -8};
+
+#ifdef __BIG_ENDIAN__
+  vec_expected_wi = (vector signed int) {4, 8, -4, -8};
+#else
   vec_expected_wi = (vector signed int) {1, 5, -1, -5};
+#endif

   vec_result_wi = vec_signexti (vec_arg_qi);

@@ -54,7 +59,12 @@ int main ()
   /* test sign extend byte to double */
   vec_arg_qi = (vector signed char){1, 2, 3, 4, 5, 6, 7, 8,
-1, -2, -3, -4, -5, -6, -7, -8};
+
+#ifdef __BIG_ENDIAN__
+  vec_expected_di = (vector signed long long int){8, -8};
+#else
   vec_expected_di = (vector signed long long int){1, -1};
+#endif

   vec_result_di = vec_signextll(vec_arg_qi);

@@ -72,7 +82,12 @@ int main ()

   /* test sign extend short to word */
   vec_arg_hi = (vector signed short int){1, 2, 3, 4, -1, -2, -3, -4};
+
+#ifdef __BIG_ENDIAN__
+  vec_expected_wi = (vector signed int){2, 4, -2, -4};
+#else
   vec_expected_wi = (vector signed int){1, 3, -1, -3};
+#endif

   vec_result_wi = vec_signexti(vec_arg_hi);

@@ -90,7 +105,12 @@ int main ()

   /* test sign
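
The expected vectors in the test changes above follow from "sign extend
the rightmost element of each group". A portable sketch of that rule is
given below. It is a model, not the rs6000 built-in: sign_extend_rightmost
is a hypothetical helper, and the assumption is that the rightmost (least
significant) element of a group is its first element in little-endian
element order and its last element in big-endian element order.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* For each group of GROUP consecutive source elements, pick the element
   that sits rightmost in the register and sign-extend it into DST.  */
static void
sign_extend_rightmost (const signed char *src, size_t n_src, size_t group,
                       bool big_endian, long long *dst)
{
  assert (src != NULL && dst != NULL);
  assert (group > 0 && n_src % group == 0);
  for (size_t i = 0; i < n_src / group; i++)
    {
      size_t idx = i * group + (big_endian ? group - 1 : 0);
      dst[i] = src[idx];    /* conversion to a wider signed type sign-extends */
    }
}
```

For the byte vector {1, ..., 8, -1, ..., -8} with groups of four, this
yields {4, 8, -4, -8} for big endian and {1, 5, -1, -5} for little endian,
matching the byte-to-word expectations in the test; groups of eight give
{8, -8} and {1, -1} respectively.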

Re: [PATCH] c++, coroutines: Stabilize names of promoted slot vars [PR101118].

2023-03-26 Thread Richard Biener via Gcc-patches

On Sun, Mar 26, 2023 at 6:55 PM Iain Sandoe via Gcc-patches
 wrote:
>
> Tested on x86_64-darwin21, x86-64-linux-gnu
> OK for trunk?
> Iain
>
> When we need to 'promote' a value (i.e. store it in the coroutine frame) it
> is given a frame entry name.  This was based on the DECL_UID for slot vars.
> However, when LTO is used, the names from multiple TUs become visible at the
> same time, and the DECL_UIDs usually differ between units.  This leads to an
> "ODR mismatch" warning for the frame type.
>
> The fix here is to use a counter instead of the DECL_UID which makes a name
> that is stable between TUs for each frame layout (one per coroutine func).

I don't see how this avoids clashes across TUs?  But are those VAR_DECLs not
local anyway?

I suppose -Wodr diagnostics for DECL_ARTIFICIAL vars are a bit on the
edge as well ...

Richard.

> Signed-off-by: Iain Sandoe 
>
> PR c++/101118
>
> gcc/cp/ChangeLog:
>
> * coroutines.cc: Add counter for promoted slot vars.
> (flatten_await_stmt): Use slot vars counter instead of DECL_UID
> to generate the frame entry name for promoted target expression
> slot variables.
> (morph_fn_to_coro): Reset the slot vars counter at the start of
> each coroutine function.
> ---
>  gcc/cp/coroutines.cc | 8 +++-
>  1 file changed, 7 insertions(+), 1 deletion(-)
>
> diff --git a/gcc/cp/coroutines.cc b/gcc/cp/coroutines.cc
> index a2189e43db8..359a5bf46ff 100644
> --- a/gcc/cp/coroutines.cc
> +++ b/gcc/cp/coroutines.cc
> @@ -2726,6 +2726,11 @@ struct var_nest_node
>var_nest_node *else_cl;
>  };
>
> +/* This is used to make a stable, but unique-per-function, sequence number for
> +   each TARGET_EXPR slot variable that we 'promote' to a frame entry.  It needs
> +   to be stable because the frame type is visible to LTO ODR checking.  */
> +static unsigned tmpno = 0;
> +
>  /* This is called for single statements from the co-await statement walker.
> It checks to see if the statement contains any initializers for awaitables
> and if any of these capture items by reference.  */
> @@ -2889,7 +2894,7 @@ flatten_await_stmt (var_nest_node *n, hash_set<tree> *promoted,
>   tree init = t;
>   temps_used->add (init);
>   tree var_type = TREE_TYPE (init);
> - char *buf = xasprintf ("D.%d", DECL_UID (TREE_OPERAND (init, 0)));
> + char *buf = xasprintf ("T%03u", tmpno++);
>   tree var = build_lang_decl (VAR_DECL, get_identifier (buf), var_type);
>   DECL_ARTIFICIAL (var) = true;
>   free (buf);
> @@ -4374,6 +4379,7 @@ morph_fn_to_coro (tree orig, tree *resumer, tree *destroyer)
>  {
>gcc_checking_assert (orig && TREE_CODE (orig) == FUNCTION_DECL);
>
> +  tmpno = 0;
>*resumer = error_mark_node;
>*destroyer = error_mark_node;
>if (!coro_function_valid_p (orig))
> --
> 2.37.1 (Apple Git-137.1)
>

Re: [PATCH] lto/109263 - lto-wrapper and -g0 -ggdb

2023-03-26 Thread Richard Biener via Gcc-patches

On Thu, 23 Mar 2023, Richard Biener wrote:

> The following makes lto-wrapper deal with non-combined debug
> disabling / enabling option combinations properly.  Interestingly
> -gno-dwarf also enables debug.
> 
> Bootstrap / regtest running on x86_64-unknown-linux-gnu.
> 
> OK?  Or do we want to try harder to zap earlier -g0 when later
> -g* appear?

I pushed this to fix the regression; the patch stays valid even
if the patches rejecting negative variants of -ggdb and friends
are approved.

Richard.

>   PR lto/109263
>   * lto-wrapper.cc (run_gcc): Parse alternate debug options
>   as well, they always enable debug.
> ---
>  gcc/lto-wrapper.cc | 10 ++
>  1 file changed, 10 insertions(+)
> 
> diff --git a/gcc/lto-wrapper.cc b/gcc/lto-wrapper.cc
> index fe8c5f6e80d..5186d040ce0 100644
> --- a/gcc/lto-wrapper.cc
> +++ b/gcc/lto-wrapper.cc
> @@ -1564,6 +1564,16 @@ run_gcc (unsigned argc, char *argv[])
> skip_debug = option->arg && !strcmp (option->arg, "0");
> break;
>  
> + case OPT_gbtf:
> + case OPT_gctf:
> + case OPT_gdwarf:
> + case OPT_gdwarf_:
> + case OPT_ggdb:
> + case OPT_gvms:
> +   /* Negative forms, if allowed, enable debug info as well.  */
> +   skip_debug = false;
> +   break;
> +
>   case OPT_dumpdir:
> incoming_dumppfx = dumppfx = option->arg;
> break;
> 

-- 
Richard Biener 
SUSE Software Solutions Germany GmbH, Frankenstrasse 146, 90461 Nuernberg,
Germany; GF: Ivo Totev, Andrew Myers, Andrew McDonald, Boudien Moerman;
HRB 36809 (AG Nuernberg)
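
The effect of the new cases can be illustrated with a small standalone model. This is only a sketch, not the lto-wrapper code: `skip_debug_p` and `is_alt_debug_option` are hypothetical names, options are plain strings rather than decoded option structs, and negative forms such as -gno-dwarf are not modelled. It shows the "last relevant option wins" behaviour, where -g0 disables debug and a later alternate debug option re-enables it.

```cpp
#include <cassert>
#include <cctype>
#include <string>
#include <vector>

// Hypothetical helper: true for the alternate debug options that the patch
// adds cases for (matched here by prefix on the spelled option).
static bool
is_alt_debug_option (const std::string &opt)
{
  for (const char *prefix : {"-gbtf", "-gctf", "-gdwarf", "-ggdb", "-gvms"})
    if (opt.rfind (prefix, 0) == 0)
      return true;
  return false;
}

// Hypothetical model of the option scan: returns true when debug info
// should be skipped.  Later options override earlier ones.
static bool
skip_debug_p (const std::vector<std::string> &opts)
{
  bool skip_debug = false;
  for (const std::string &opt : opts)
    {
      if (is_alt_debug_option (opt))
        {
          // Alternate debug options always enable debug info.
          skip_debug = false;
          continue;
        }
      if (opt.size () >= 2 && opt[0] == '-' && opt[1] == 'g')
        {
          // Plain -g or -g<level>: only level "0" disables debug info.
          std::string level = opt.substr (2);
          bool all_digits = true;
          for (char c : level)
            if (!std::isdigit (static_cast<unsigned char> (c)))
              all_digits = false;
          if (all_digits)
            skip_debug = (level == "0");
        }
    }
  return skip_debug;
}
```

With this model, `{"-g0", "-ggdb"}` yields debug enabled, while `{"-ggdb", "-g0"}` yields debug skipped.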

Re: [PATCH] c++, coroutines: Stabilize names of promoted slot vars [PR101118].

2023-03-26 Thread Iain Sandoe

Hi Richard,
(I’m away from my usual infrastructure, so responses could be slow and testing
things could take a while).

> On 27 Mar 2023, at 12:10, Richard Biener  wrote:
> 
> On Sun, Mar 26, 2023 at 6:55 PM Iain Sandoe via Gcc-patches
>  wrote:
>> 
>> Tested on x86_64-darwin21, x86-64-linux-gnu
>> OK for trunk?
>> Iain
>> 
>> When we need to 'promote' a value (i.e. store it in the coroutine frame) it
>> is given a frame entry name.  This was based on the DECL_UID for slot vars.
>> However, when LTO is used, the names from multiple TUs become visible at the
>> same time, and the DECL_UIDs usually differ between units.  This leads to a
>> "ODR mismatch" warning for the frame type.
>> 
>> The fix here is to use a counter instead of the DECL_UID which makes a name
>> that is stable between TUs for each frame layout (one per coroutine func).
> 
> I don't see how this avoids clashes across TUs?  But are those VAR_DECLs not
> local anyway?

The reported ODR issue is in the frame type (which is a structure): it sees two
frame layouts with the same types for each field but a different name for the
entries that came from the promotion of the slot var (because I used the
DECL_UID to generate the field name).
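
To make the naming difference concrete, here is a minimal standalone sketch. It is not the coroutines.cc code: `uid_name` and `counter_names` are hypothetical helpers, and the UID values used with them are made up. It only contrasts a name derived from a per-TU unique id with one derived from a counter that restarts for each coroutine function.

```cpp
#include <cassert>
#include <cstdio>
#include <string>
#include <vector>

// Old scheme (sketch): the field name embeds a per-TU unique id, standing
// in for DECL_UID.  Two TUs compiling the same coroutine may see different
// ids, so the frame field names differ.
static std::string
uid_name (int decl_uid)
{
  char buf[32];
  std::snprintf (buf, sizeof buf, "D.%d", decl_uid);
  return buf;
}

// New scheme (sketch): the field name embeds a counter that starts at zero
// for each coroutine function, so the same function always yields the same
// sequence of names regardless of what else the TU contains.
static std::vector<std::string>
counter_names (unsigned n_promoted_slots)
{
  std::vector<std::string> names;
  unsigned tmpno = 0;  // reset per function, as in the patch
  for (unsigned i = 0; i < n_promoted_slots; ++i)
    {
      char buf[32];
      std::snprintf (buf, sizeof buf, "T%03u", tmpno++);
      names.push_back (buf);
    }
  return names;
}
```

For example, `uid_name (2741)` and `uid_name (3155)` give different field names for what is logically the same slot, whereas `counter_names (2)` gives "T000" and "T001" in every TU.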

> I suppose -Wodr diagnostics for DECL_ARTIFICIAL vars are a bit on the
> edge as well ...

These promoted vars get DECL_VALUE_EXPRs (and as noted above a name to
assist in debugging) tying them to the frame entry,

... although I do agree that reporting warnings for compiler-internal stuff is
definitely on the edge (ISTR seeing maybe unused reports against such too).

Not sure if we have an easy way to tell that the frame type is an internal one
though.  Perhaps that needs a DECL_ARTIFICIAL, but would that not make it
unavailable for debug?

Iain


> 
> Richard.
> 
>> Signed-off-by: Iain Sandoe 
>> 
>>PR c++/101118
>> 
>> gcc/cp/ChangeLog:
>> 
>>* coroutines.cc: Add counter for promoted slot vars.
>>(flatten_await_stmt): Use slot vars counter instead of DECL_UID
>>to generate the frame entry name for promoted target expression
>>slot variables.
>>(morph_fn_to_coro): Reset the slot vars counter at the start of
>>each coroutine function.
>> ---
>> gcc/cp/coroutines.cc | 8 +++-
>> 1 file changed, 7 insertions(+), 1 deletion(-)
>> 
>> diff --git a/gcc/cp/coroutines.cc b/gcc/cp/coroutines.cc
>> index a2189e43db8..359a5bf46ff 100644
>> --- a/gcc/cp/coroutines.cc
>> +++ b/gcc/cp/coroutines.cc
>> @@ -2726,6 +2726,11 @@ struct var_nest_node
>>   var_nest_node *else_cl;
>> };
>> 
>> +/* This is used to make a stable, but unique-per-function, sequence number for
>> +   each TARGET_EXPR slot variable that we 'promote' to a frame entry.  It needs
>> +   to be stable because the frame type is visible to LTO ODR checking.  */
>> +static unsigned tmpno = 0;
>> +
>> /* This is called for single statements from the co-await statement walker.
>>It checks to see if the statement contains any initializers for awaitables
>>and if any of these capture items by reference.  */
>> @@ -2889,7 +2894,7 @@ flatten_await_stmt (var_nest_node *n, hash_set<tree> *promoted,
>>  tree init = t;
>>  temps_used->add (init);
>>  tree var_type = TREE_TYPE (init);
>> - char *buf = xasprintf ("D.%d", DECL_UID (TREE_OPERAND (init, 0)));
>> + char *buf = xasprintf ("T%03u", tmpno++);
>>  tree var = build_lang_decl (VAR_DECL, get_identifier (buf), var_type);
>>  DECL_ARTIFICIAL (var) = true;
>>  free (buf);
>> @@ -4374,6 +4379,7 @@ morph_fn_to_coro (tree orig, tree *resumer, tree *destroyer)
>> {
>>   gcc_checking_assert (orig && TREE_CODE (orig) == FUNCTION_DECL);
>> 
>> +  tmpno = 0;
>>   *resumer = error_mark_node;
>>   *destroyer = error_mark_node;
>>   if (!coro_function_valid_p (orig))
>> --
>> 2.37.1 (Apple Git-137.1)

[PATCH] RISC-V: Fix PR108270

2023-03-26 Thread juzhe . zhong

From: Juzhe-Zhong 

PR 108270

Fix bug: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108270.

Consider the following testcase:
void f (void * restrict in, void * restrict out, int l, int n, int m)
{
  for (int i = 0; i < l; i++){
    for (int j = 0; j < m; j++){
      for (int k = 0; k < n; k++)
        {
          vint8mf8_t v = __riscv_vle8_v_i8mf8 (in + i + j, 17);
          __riscv_vse8_v_i8mf8 (out + i + j, v, 17);
        }
    }
  }
}

Compile option: -O3

Before this patch:
mv  a7,a2
mv  a6,a0   
mv  t1,a1
mv  a2,a3
vsetivli zero,17,e8,mf8,ta,ma
...

After this patch:
mv  a7,a2
mv  a6,a0
mv  t1,a1
mv  a2,a3
ble a7,zero,.L1
ble a4,zero,.L1
ble a3,zero,.L1
add a1,a0,a4
li  a0,0
vsetivli zero,17,e8,mf8,ta,ma
...

Before this patch, the vsetivli is executed even when none of the loops run,
which can produce a bug when:

int main ()
{
  vsetivli zero, 100,.
  f (in, out, 0, 0, 0);
  asm volatile ("csrr a0,vl":::"memory");

  // Before this patch the a0 is 17. (Wrong).
  // After this patch the a0 is 100. (Correct).
  ...
}
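
The check added by this patch can be pictured with a toy CFG. The following is a simplified sketch, not the riscv-vsetvl.cc code: `block_info` and `all_predecessors` are hypothetical stand-ins, the two booleans replace the real demand/reaching-out state, and it assumes predecessors are collected transitively. Backward fusion is skipped when no predecessor carries any VL/VTYPE information.

```cpp
#include <cassert>
#include <set>
#include <vector>

// Hypothetical, simplified per-block state.
struct block_info
{
  bool local_dem_valid = false;     // the block itself demands a VL/VTYPE
  bool reaching_out_valid = false;  // some VL/VTYPE state reaches block exit
  std::vector<int> preds;           // indices of direct predecessors
};

// Collect all (transitive) predecessors of BB with a simple worklist.
static std::set<int>
all_predecessors (const std::vector<block_info> &cfg, int bb)
{
  std::set<int> seen;
  std::vector<int> work (cfg[bb].preds);
  while (!work.empty ())
    {
      int p = work.back ();
      work.pop_back ();
      if (!seen.insert (p).second)
        continue;  // already visited
      for (int q : cfg[p].preds)
        work.push_back (q);
    }
  return seen;
}

// True when no predecessor of BB has any VL/VTYPE information, in which
// case hoisting a vsetvl into them would only add an unwanted side effect.
static bool
all_empty_predecessor_p (const std::vector<block_info> &cfg, int bb)
{
  for (int p : all_predecessors (cfg, bb))
    if (cfg[p].local_dem_valid || cfg[p].reaching_out_valid)
      return false;
  return true;
}
```

In the testcase above, the blocks holding the three loop guards carry no vector state, so the model returns true for the loop body's block and the vsetivli stays below the guards.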

gcc/ChangeLog:

* config/riscv/riscv-vsetvl.cc 
(vector_infos_manager::all_empty_predecessor_p): New function.
(pass_vsetvl::backward_demand_fusion): Fix bug.
* config/riscv/riscv-vsetvl.h: New function declare.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/vsetvl/imm_bb_prop-1.c: Adapt test.
* gcc.target/riscv/rvv/vsetvl/imm_conflict-3.c: Adapt test.
* gcc.target/riscv/rvv/vsetvl/pr108270.c: New test.

---
 gcc/config/riscv/riscv-vsetvl.cc  | 24 +++
 gcc/config/riscv/riscv-vsetvl.h   |  2 ++
 .../riscv/rvv/vsetvl/imm_bb_prop-1.c  |  2 +-
 .../riscv/rvv/vsetvl/imm_conflict-3.c |  4 ++--
 .../gcc.target/riscv/rvv/vsetvl/pr108270.c| 19 +++
 5 files changed, 48 insertions(+), 3 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/vsetvl/pr108270.c

diff --git a/gcc/config/riscv/riscv-vsetvl.cc b/gcc/config/riscv/riscv-vsetvl.cc
index b5f5301ea43..4948e5d4c5e 100644
--- a/gcc/config/riscv/riscv-vsetvl.cc
+++ b/gcc/config/riscv/riscv-vsetvl.cc
@@ -2361,6 +2361,21 @@ vector_infos_manager::all_same_ratio_p (sbitmap bitdata) const
   return true;
 }
 
+bool
+vector_infos_manager::all_empty_predecessor_p (const basic_block cfg_bb) const
+{
+  hash_set<basic_block> pred_cfg_bbs = get_all_predecessors (cfg_bb);
+  for (const basic_block pred_cfg_bb : pred_cfg_bbs)
+{
+  const auto &pred_block_info = vector_block_infos[pred_cfg_bb->index];
+  if (!pred_block_info.local_dem.valid_or_dirty_p ()
+ && !pred_block_info.reaching_out.valid_or_dirty_p ())
+   continue;
+  return false;
+}
+  return true;
+}
+
 bool
 vector_infos_manager::all_same_avl_p (const basic_block cfg_bb,
  sbitmap bitdata) const
@@ -3118,6 +3133,14 @@ pass_vsetvl::backward_demand_fusion (void)
   if (!backward_propagate_worthwhile_p (cfg_bb, curr_block_info))
continue;
 
+  /* Fix PR108270:
+
+   bb 0 -> bb 1
+We don't need to backward fuse VL/VTYPE info from bb 1 to bb 0
+if bb 1 is not inside a loop and all predecessors of bb 0 are empty. */
+  if (m_vector_manager->all_empty_predecessor_p (cfg_bb))
+   continue;
+
   edge e;
   edge_iterator ei;
   /* Backward propagate to each predecessor.  */
@@ -3131,6 +3154,7 @@ pass_vsetvl::backward_demand_fusion (void)
continue;
  if (e->src->index == ENTRY_BLOCK_PTR_FOR_FN (cfun)->index)
continue;
+
  /* If prop is demand of vsetvl instruction and reaching doesn't demand
 AVL. We don't backward propagate since vsetvl instruction has no
 side effects.  */
diff --git a/gcc/config/riscv/riscv-vsetvl.h b/gcc/config/riscv/riscv-vsetvl.h
index 237381f7026..eec03d35071 100644
--- a/gcc/config/riscv/riscv-vsetvl.h
+++ b/gcc/config/riscv/riscv-vsetvl.h
@@ -450,6 +450,8 @@ public:
   /* Return true if all expression set in bitmap are same ratio.  */
   bool all_same_ratio_p (sbitmap) const;
 
+  bool all_empty_predecessor_p (const basic_block) const;
+
   void release (void);
   void create_bitmap_vectors (void);
   void free_bitmap_vectors (void);
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/vsetvl/imm_bb_prop-1.c b/gcc/testsuite/gcc.target/riscv/rvv/vsetvl/imm_bb_prop-1.c
index cd4ee7dd0d3..ed32a40f5e7 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/vsetvl/imm_bb_prop-1.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/vsetvl/imm_bb_prop-1.c
@@ -29,4 +29,4 @@ void f (int8_t * restrict in, int8_t * restrict out, int n, int cond)
   }
 }
 
-/* { dg-final { scan-assembler-times {vsetivli\s+zero,\s*5,\s*e8,\s*mf8,\s*tu,\s*m[au]} 1 { target { no-opts "-O0" no-opts "-g" no-opts "-funroll-loops" } } } } */
+/* { dg-final { scan-assembler-times 
{vsetivli\s+zero,
