date:20210710

[PATCH take 2] PR tree-optimization/38943: Preserve trapping instructions with -fpreserve-traps

2021-07-10 Thread Roger Sayle

Hi Richard and Eric,
Of course, you're both completely right.  Rather than argue that
-fnon-call-exceptions without -fexceptions (and without
-fdelete-dead-exceptions) has some implicit undocumented semantics,
trapping instructions should be completely orthogonal to exception
handling.

This patch adds a new code generation option -fpreserve-traps, the
(obvious) semantics of which is demonstrated by the expanded test
case below.  The current behaviour of gcc is to eliminate calls
to may_trap_1, may_trap_2, may_trap_3 etc. from foo, but these are
all retained with -fpreserve-traps.
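
The test case itself is not reproduced in this archive. As a rough sketch of
the kind of code affected (the function bodies below are assumptions for
illustration, not the contents of gcc.dg/pr38943.c), consider a call whose
only observable effect is a possible trap:

```c
/* Hypothetical helper: the division traps (e.g. SIGFPE on x86) when
   y == 0.  Its body is an assumption, not taken from the patch.  */
static int
may_trap_1 (int x, int y)
{
  return x / y;
}

/* The result of the call is discarded, so dead code elimination may
   delete the call and with it the potential trap.  The option proposed
   in this patch asks the compiler to keep it.  */
int
foo (int x, int y)
{
  may_trap_1 (x, y);
  return 0;
}
```

With a non-zero divisor foo simply returns 0 either way; the difference
only shows when y is 0 and the division is (or is not) retained.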

Historically, the semantics of -fnon-call-exceptions vs. traps has
been widely misunderstood, with different levels of optimization
producing different outcomes, as shown by the impressive list of PRs
affected by this solution.  Hopefully, this new documentation will
clarify things.

This patch has been tested on x86_64-pc-linux-gnu with a "make
bootstrap" and "make -k check" with no new failures.

Ok for mainline?

2021-07-09  Roger Sayle  
Eric Botcazou  
Richard Biener  

gcc/ChangeLog
PR tree-optimization/38943
PR middle-end/39801
PR middle-end/64711
PR target/70387
PR tree-optimization/94357
* common.opt (fpreserve-traps): New code generation option.
* doc/invoke.texi (-fpreserve-traps): Document new option.
* gimple.c (gimple_has_side_effects): Consider trapping to
be a side-effect when -fpreserve-traps is specified.
(gimple_could_trap_p_1):  Make S argument a "const gimple*".
Preserve constness in call to gimple_asm_volatile_p.
(gimple_could_trap_p): Make S argument a "const gimple*".
* gimple.h (gimple_could_trap_p_1, gimple_could_trap_p):
Update function prototypes.
* ipa-pure-const.c (check_stmt): When preserving traps,
a trapping statement should be considered a side-effect,
so the function is neither const nor pure.

gcc/testsuite/ChangeLog
PR tree-optimization/38943
PR middle-end/39801
PR middle-end/64711
PR target/70387
PR tree-optimization/94357
* gcc.dg/pr38943.c: New test case.

--
Roger Sayle, PhD.
CEO and founder
NextMove Software Limited
Registered in England No. 07588305
Registered Office: Innovation Centre, 320 Cambridge Science Park, Cambridge, 
CB4 0WG

-----Original Message-----
From: Richard Biener  
Sent: 08 July 2021 11:19
To: Roger Sayle ; Eric Botcazou 

Cc: GCC Patches 
Subject: Re: [PATCH] PR tree-optimization/38943: Preserve trapping instructions 
with -fnon-call-exceptions

On Thu, Jul 8, 2021 at 11:54 AM Roger Sayle  wrote:
>
>
> This patch addresses PR tree-optimization/38943 where gcc may optimize 
> away trapping instructions even when -fnon-call-exceptions is specified.
> Interestingly this only affects the C compiler (when -fexceptions is 
> not
> specified) as g++ (or -fexceptions) supports C++-style exception 
> handling, where -fnon-call-exceptions triggers the stmt_could_throw_p 
> machinery.
> Without -fexceptions, trapping instructions aren't always considered 
> visible side-effects.

But -fnon-call-exceptions without -fexceptions doesn't make much sense, does 
it?  I see the testcase behaves correctly when -fexceptions is also specified.

The call vanishes in DCE because stmt_could_throw_p starts with

bool
stmt_could_throw_p (function *fun, gimple *stmt) {
  if (!flag_exceptions)
    return false;

the documentation of -fnon-call-exceptions says

Generate code that allows trapping instructions to throw exceptions.

so either the above check is wrong or -fnon-call-exceptions should imply 
-fexceptions (or we should diagnose missing -fexceptions)

>
> This patch fixes this in two places.  Firstly, gimple_has_side_effects 
> is tweaked such that gimple_could_trap_p is considered a side-effect 
> if the current function can throw non-call exceptions.

But exceptions are not considered side-effects - they are explicit in the IL 
and thus passes are supposed to check for those and preserve dead (externally) 
throwing stmts if not told otherwise (flag_delete_dead_exceptions).

>  And secondly,
> check_stmt in ipa-pure-const.c is tweaked such that a function 
> containing a trapping statement is considered to have a side-effect 
> with -fnon-call-exceptions, and therefore cannot be pure or const.

EH is orthogonal to pure/const, so I think that's wrong.

> Calling gimple_could_trap_p (which previously took a non-const gimple) 
> from gimple_has_side_effects (which takes a const gimple) required 
> improving the const-safety of gimple_could_trap_p (a relatively minor
> tweak) and its prototypes.  Hopefully this is considered a clean-up/ 
> improvement.

Yeah, even an obvious one I think - you can push that independently.

> This patch has been tested on x86_64-pc-linux-gnu with a "make 
> bootstrap" and "make -k check" with no new failures.  This should be 
> relatively safe, as there

Re: [PATCH] PR tree-opt/40210: Fold (bswap(X)>>C1)&C2 to (X>>C3)&C2 in match.pd

2021-07-10 Thread H.J. Lu via Gcc-patches

On Thu, Jul 8, 2021 at 2:51 AM Richard Biener via Gcc-patches
 wrote:
>
> On Thu, Jul 8, 2021 at 9:37 AM Roger Sayle  wrote:
> >
> >
> > Hi Richard,
> > Thanks. Yep, you've correctly diagnosed that the motivation for the
> > get_builtin_precision helper function was that the TREE_TYPE of the
> > argument is affected by argument promotion.  Your suggestion to instead
> > use the TREE_TYPE of the function result is a much nicer solution.
> >
> > I also agree that all of these bswap optimizations make the assumption
> > that BITS_PER_UNIT is 8 (i.e. that bytes are 8-bits), and some that the
> > front-end supports an 8-bit type (i.e. that CHAR_TYPE_SIZE is 8), which
> > can be checked explicitly.
> >
> > Both of these improvements are implemented in the attached revised patch,
> > which has been tested on x86_64-pc-linux-gnu with a "make bootstrap"
> > and "make -k check" with no new failures.
> >
> > Ok for mainline?
>
> OK.
>
> Thanks,
> Richard.
>
> > 2021-07-08  Roger Sayle  
> > Richard Biener  
> >
> > gcc/ChangeLog
> > PR tree-optimization/40210
> > * match.pd (bswap optimizations): Simplify (bswap(x)>>C1)&C2 as
> > (x>>C3)&C2 when possible.  Simplify bswap(x)>>C1 as ((T)x)>>C2
> > when possible.  Simplify bswap(x)&C1 as (x>>C2)&C1 when 0<=C1<=255.
> >
> > gcc/testsuite/ChangeLog
> > PR tree-optimization/40210
> > * gcc.dg/builtin-bswap-13.c: New test.
> > * gcc.dg/builtin-bswap-14.c: New test.
> >
> > Roger
> > --
> >
> > -----Original Message-----
> > From: Richard Biener 
> > Sent: 07 July 2021 08:56
> > To: Roger Sayle 
> > Cc: GCC Patches 
> > Subject: Re: [PATCH] PR tree-opt/40210: Fold (bswap(X)>>C1)&C2 to 
> > (X>>C3)&C2 in match.pd
> >
> > On Tue, Jul 6, 2021 at 9:01 PM Roger Sayle  
> > wrote:
> > >
> > >
> > > All of the optimizations/transformations mentioned in bugzilla for PR
> > > tree-optimization/40210 are already implemented in mainline GCC, with
> > > one exception.  In comment #5, there's a suggestion that
> > > (bswap64(x)>>56)&0xff can be implemented without the bswap as
> > > (unsigned char)x, or equivalently x&0xff.
> > >
> > > This patch implements the above optimization, and closely related
> > > variants.  For any single bit, (bswap(X)>>C1)&1 can be simplified to
> > > (X>>C2)&1, where bit position C2 is the appropriate permutation of C1.
> > > > Similarly, the bswap can be eliminated if the desired set of bits all lie
> > > within the same byte, hence (bswap(x)>>8)&255 can always be optimized,
> > > as can (bswap(x)>>8)&123.
> > >
> > > Previously,
> > >
> > > int foo(long long x) {
> > >   return (__builtin_bswap64(x) >> 56) & 0xff; }
> > >
> > > compiled with -O2 to
> > > > foo:    movq    %rdi, %rax
> > > >         bswap   %rax
> > > >         shrq    $56, %rax
> > > >         ret
> > >
> > > with this patch, it now compiles to
> > > > foo:    movzbl  %dil, %eax
> > > >         ret
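
The identities described above can be checked at the source level. The
following is only a sketch: the function names are made up for illustration,
and it assumes 8-bit bytes, as the thread notes the patterns do. For the
single-bit case on 32-bit values, the permuted position keeps the bit offset
within the byte and mirrors the byte index, which works out to C1 ^ 24.

```c
#include <stdint.h>

/* Form matched by the new pattern: byte-swap, shift, then mask.  */
unsigned
top_byte_of_swapped (uint64_t x)
{
  return (__builtin_bswap64 (x) >> 56) & 0xff;
}

/* Equivalent form without the bswap: the low byte of x.  */
unsigned
low_byte (uint64_t x)
{
  return x & 0xff;
}

/* Single-bit case for 32-bit values: bit C1 of bswap32 (x) is bit C2
   of x, where C2 has the same offset within its byte but the mirrored
   byte index (0<->3, 1<->2).  */
unsigned
permuted_bit_position (unsigned c1)
{
  return c1 ^ 24;
}
```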
> > >
> > > This patch has been tested on x86_64-pc-linux-gnu with a "make
> > > bootstrap" and "make -k check" with no new failures.
> > >
> > > Ok for mainline?
> >
> > I don't like get_builtin_precision too much, did you consider simply using
> >
> > +  (bit_and (convert1? (rshift@0 (convert2? (bswap@3 @1))
> > + INTEGER_CST@2))
> >
> > and TYPE_PRECISION (TREE_TYPE (@3))?  I think while we'll see argument 
> > promotion and thus cannot use @1 to derive the type the return value will 
> > be the original type.
> >
> > Now, I see '8' being used which likely should be CHAR_TYPE_SIZE since you 
> > also use char_type_node.
> >
> > I wonder whether
> >
> > + /* (bswap(x) >> C1) & C2 can sometimes be simplified to (x >> C3) &
> > + C2.  */ (simplify  (bit_and (convert1? (rshift@0 (convert2? (bswap
> > + @1)) INTEGER_CST@2))
> > +  INTEGER_CST@3)
> >
> > and
> >
> > + /* bswap(x) >> C1 can sometimes be simplified to (T)x >> C2.  */
> > + (simplify  (rshift (convert? (bswap @0)) INTEGER_CST@1)
> >
> > can build upon each other, for example by extending the latter to handle 
> > more cases, transforming to ((T)x >> C2) & C3?
> > That might of course be only profitable when the bswap goes away.
> >
> > Thanks,
> > Richard.
> >
> > >
> > >
> > > 2021-07-06  Roger Sayle  
> > >
> > > gcc/ChangeLog
> > > PR tree-optimization/40210
> > > * builtins.c (get_builtin_precision): Helper function to determine
> > > the precision in bits of a built-in function.
> > > * builtins.h (get_builtin_precision): Prototype here.
> > > * match.pd (bswap optimizations): Simplify (bswap(x)>>C1)&C2 as
> > > (x>>C3)&C2 when possible.  Simplify bswap(x)>>C1 as ((T)x)>>C2
> > > when possible.  Simplify bswap(x)&C1 as (x>>C2)&C1 when 
> > > 0<=C1<=255.
> > >
> > > gcc/testsuite/ChangeLog
> > > PR tree-optimization/40210
> > > * gcc.dg/builtin-bswap-13.c: New test.
> > > * gcc.dg/builtin-bswap-14.c: New test.
> > >
> > > Roger
> > > --
> > > Roger Sayle
> > > NextMove Software
> > > C

Re: [COMMITTED] Fix relation query of equivalences.

2021-07-10 Thread H.J. Lu via Gcc-patches

On Thu, Jun 24, 2021 at 10:36 AM Andrew MacLeod via Gcc-patches
 wrote:
>
>
> When looking for relations between equivalencies, a typo was causing the
> same bitmap to be checked for both operands, instead of the correct one
> for each.   This caused us to never notice relations between equivalences.
>
> I also noticed that under some circumstances the relation dump would
> call blocks which were NULL and trap.  Also fixed.
>
> bootstraps on x86_64-pc-linux-gnu with no regressions.  pushed.
>
> Andrew
>

This caused:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101335

-- 
H.J.

[PATCH 1/2] analyzer: refactor callstring to work with pairs of supernodes [GSoC]

2021-07-10 Thread Ankur Saini via Gcc-patches

2021-07-3  Ankur Saini  

* gcc/analyzer/call-string.cc: refactor callstring to work with pairs of 
supernodes instead of superedges
* gcc/analyzer/call-string.h: make callstring work with pairs of 
supernodes
* gcc/analyzer/program-point.cc: refactor program point to work with 
new call-string format
---
 gcc/analyzer/call-string.cc   | 93 +--
 gcc/analyzer/call-string.h| 20 +---
 gcc/analyzer/program-point.cc |  9 ++--
 3 files changed, 74 insertions(+), 48 deletions(-)

diff --git a/gcc/analyzer/call-string.cc b/gcc/analyzer/call-string.cc
index 9f4f77ab3a9..50dfb9e8c7c 100644
--- a/gcc/analyzer/call-string.cc
+++ b/gcc/analyzer/call-string.cc
@@ -48,10 +48,10 @@ along with GCC; see the file COPYING3.  If not see
 /* call_string's copy ctor.  */
 
 call_string::call_string (const call_string &other)
-: m_return_edges (other.m_return_edges.length ())
+: m_supernodes (other.m_supernodes.length ())
 {
-  for (const return_superedge *e : other.m_return_edges)
-m_return_edges.quick_push (e);
+  for (const std::pair *e : 
other.m_supernodes)
+m_supernodes.quick_push (e);
 }
 
 /* call_string's assignment operator.  */
@@ -60,12 +60,12 @@ call_string&
 call_string::operator= (const call_string &other)
 {
   // would be much simpler if we could rely on vec<> assignment op
-  m_return_edges.truncate (0);
-  m_return_edges.reserve (other.m_return_edges.length (), true);
-  const return_superedge *e;
+  m_supernodes.truncate (0);
+  m_supernodes.reserve (other.m_supernodes.length (), true);
+  const std::pair *e;
   int i;
-  FOR_EACH_VEC_ELT (other.m_return_edges, i, e)
-m_return_edges.quick_push (e);
+  FOR_EACH_VEC_ELT (other.m_supernodes, i, e)
+m_supernodes.quick_push (e);
   return *this;
 }
 
@@ -74,12 +74,12 @@ call_string::operator= (const call_string &other)
 bool
 call_string::operator== (const call_string &other) const
 {
-  if (m_return_edges.length () != other.m_return_edges.length ())
+  if (m_supernodes.length () != other.m_supernodes.length ())
 return false;
-  const return_superedge *e;
+  const std::pair *e;
   int i;
-  FOR_EACH_VEC_ELT (m_return_edges, i, e)
-if (e != other.m_return_edges[i])
+  FOR_EACH_VEC_ELT (m_supernodes, i, e)
+if (e != other.m_supernodes[i])
   return false;
   return true;
 }
@@ -91,15 +91,15 @@ call_string::print (pretty_printer *pp) const
 {
   pp_string (pp, "[");
 
-  const return_superedge *e;
+  const std::pair *e;
   int i;
-  FOR_EACH_VEC_ELT (m_return_edges, i, e)
+  FOR_EACH_VEC_ELT (m_supernodes, i, e)
 {
   if (i > 0)
pp_string (pp, ", ");
   pp_printf (pp, "(SN: %i -> SN: %i in %s)",
-e->m_src->m_index, e->m_dest->m_index,
-function_name (e->m_dest->m_fun));
+e->first->m_index, e->second->m_index,
+function_name (e->second->m_fun));
 }
 
   pp_string (pp, "]");
@@ -109,22 +109,22 @@ call_string::print (pretty_printer *pp) const
[{"src_snode_idx" : int,
  "dst_snode_idx" : int,
  "funcname" : str},
- ...for each return_superedge in the callstring].  */
+ ...for each std::pair in the 
callstring].  */
 
 json::value *
 call_string::to_json () const
 {
   json::array *arr = new json::array ();
 
-  for (const return_superedge *e : m_return_edges)
+  for (const std::pair *e : m_supernodes)
 {
   json::object *e_obj = new json::object ();
   e_obj->set ("src_snode_idx",
- new json::integer_number (e->m_src->m_index));
+ new json::integer_number (e->first->m_index));
   e_obj->set ("dst_snode_idx",
- new json::integer_number (e->m_dest->m_index));
+ new json::integer_number (e->second->m_index));
   e_obj->set ("funcname",
- new json::string (function_name (e->m_dest->m_fun)));
+ new json::string (function_name (e->second->m_fun)));
   arr->append (e_obj);
 }
 
@@ -137,7 +137,7 @@ hashval_t
 call_string::hash () const
 {
   inchash::hash hstate;
-  for (const return_superedge *e : m_return_edges)
+  for (const std::pair *e : m_supernodes)
 hstate.add_ptr (e);
   return hstate.end ();
 }
@@ -152,22 +152,40 @@ call_string::push_call (const supergraph &sg,
   gcc_assert (call_sedge);
   const return_superedge *return_sedge = call_sedge->get_edge_for_return (sg);
   gcc_assert (return_sedge);
-  m_return_edges.safe_push (return_sedge);
+  const std::pair *e = new 
(std::pair)
+{ return_sedge->m_src,
+  return_sedge->m_dest};
+  m_supernodes.safe_push (e);
+}
+
+void
+call_string::push_call (const supernode *src,
+  const supernode *dest)
+{
+  const std::pair *e = new 
(std::pair)
+{ src,
+  dest};
+  m_supernodes.safe_push (e);
+}
+
+const std::pair
+call_string::pop ()
+{
+  return *m_supernodes.pop();
 }
 
 /* Count the number of times the top-most call site appears in the
stack.  */
-
 int
 call_string::cal

[committed] Fix failure of asm goto tests on hppa

2021-07-10 Thread John David Anglin

The following change fixes failure of gcc.dg/torture/pr100329.c and 
gcc.dg/torture/pr100519.c on hppa.

Committed to trunk.

Dave

Require target lra for tests using asm goto

gcc/testsuite/ChangeLog:

* gcc.dg/torture/pr100329.c: Require target lra.
* gcc.dg/torture/pr100519.c: Likewise.

diff --git a/gcc/testsuite/gcc.dg/torture/pr100329.c 
b/gcc/testsuite/gcc.dg/torture/pr100329.c
index b90700dd5f0..2a4331ba712 100644
--- a/gcc/testsuite/gcc.dg/torture/pr100329.c
+++ b/gcc/testsuite/gcc.dg/torture/pr100329.c
@@ -1,4 +1,4 @@
-/* { dg-do compile } */
+/* { dg-do compile { target lra } } */
 /* { dg-additional-options "--param tree-reassoc-width=2" } */

 unsigned int a0;
diff --git a/gcc/testsuite/gcc.dg/torture/pr100519.c 
b/gcc/testsuite/gcc.dg/torture/pr100519.c
index faf6e240e08..89dff668a97 100644
--- a/gcc/testsuite/gcc.dg/torture/pr100519.c
+++ b/gcc/testsuite/gcc.dg/torture/pr100519.c
@@ -1,4 +1,4 @@
-/* { dg-do compile } */
+/* { dg-do compile { target lra } } */
 /* { dg-additional-options "--param tree-reassoc-width=2" } */

 unsigned int foo_a1, foo_a2;

[PATCH] ipa-devirt: check precision mismatch of enum values [PR101396]

2021-07-10 Thread Xi Ruoyao via Gcc-patches

We are comparing enum values (in wide_int) to check ODR violation.
However, if we compare two wide_int values with different precision,
we'll trigger an assert, leading to ICE.  With enum-base introduced
in C++11, it's easy to sink into this situation.

To fix the issue, we need to explicitly check this kind of mismatch,
and emit a proper warning message if there is such one.
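
As a simplified illustration of the idea (plain C with a made-up struct,
not GCC's wide_int API or the actual ipa-devirt.c code), the width is
compared before the value so that values of different widths are never
compared directly:

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-in for one enumerator read from an LTO section.  */
struct enum_val
{
  const char *name;
  unsigned precision;        /* width in bits of the underlying type */
  unsigned long long value;
};

/* Return true if the two enumerators are inconsistent.  The precision
   check comes before the value check, mirroring the order used in the
   patch below.  */
bool
enum_val_mismatch_p (const struct enum_val *a, const struct enum_val *b)
{
  return strcmp (a->name, b->name) != 0
         || a->precision != b->precision
         || a->value != b->value;
}
```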

Bootstrapped & regtested on x86_64-linux-gnu.  Ok for trunk?

gcc/

PR ipa/101396
* ipa-devirt.c (ipa_odr_read_section): Compare the precision of
  enum values, and emit a warning if they mismatch.

gcc/testsuite/

PR ipa/101396
* g++.dg/lto/pr101396_0.C: New test.
* g++.dg/lto/pr101396_1.C: New test.
---
 gcc/ipa-devirt.c  |  9 +
 gcc/testsuite/g++.dg/lto/pr101396_0.C | 12 
 gcc/testsuite/g++.dg/lto/pr101396_1.C | 10 ++
 3 files changed, 31 insertions(+)
 create mode 100644 gcc/testsuite/g++.dg/lto/pr101396_0.C
 create mode 100644 gcc/testsuite/g++.dg/lto/pr101396_1.C

diff --git a/gcc/ipa-devirt.c b/gcc/ipa-devirt.c
index 8cd1100aba9..8deec75b2df 100644
--- a/gcc/ipa-devirt.c
+++ b/gcc/ipa-devirt.c
@@ -4193,6 +4193,8 @@ ipa_odr_read_section (struct lto_file_decl_data 
*file_data, const char *data,
  if (do_warning != -1 || j >= this_enum.vals.length ())
continue;
  if (strcmp (id, this_enum.vals[j].name)
+ || (val.get_precision() !=
+ this_enum.vals[j].val.get_precision())
  || val != this_enum.vals[j].val)
{
  warn_name = xstrdup (id);
@@ -4260,6 +4262,13 @@ ipa_odr_read_section (struct lto_file_decl_data 
*file_data, const char *data,
"name %qs differs from name %qs defined"
" in another translation unit",
this_enum.vals[j].name, warn_name);
+ else if (this_enum.vals[j].val.get_precision() !=
+  warn_value.get_precision())
+   inform (this_enum.vals[j].locus,
+   "name %qs is defined as %u-bit while another "
+   "translation unit defines it as %u-bit",
+   warn_name, this_enum.vals[j].val.get_precision(),
+   warn_value.get_precision());
  /* FIXME: In case there is easy way to print wide_ints,
 perhaps we could do it here instead of overflow check.  */
  else if (wi::fits_shwi_p (this_enum.vals[j].val)
diff --git a/gcc/testsuite/g++.dg/lto/pr101396_0.C 
b/gcc/testsuite/g++.dg/lto/pr101396_0.C
new file mode 100644
index 000..b7a2947a880
--- /dev/null
+++ b/gcc/testsuite/g++.dg/lto/pr101396_0.C
@@ -0,0 +1,12 @@
+/* { dg-lto-do link } */
+
+enum A : __UINT32_TYPE__ { // { dg-lto-warning "6: type 'A' violates the 
C\\+\\+ One Definition Rule" }
+  a, // { dg-lto-note "3: name 'a' is defined as 32-bit while another 
translation unit defines it as 64-bit" }
+  b,
+  c
+};
+
+int main()
+{
+  return (int) A::a;
+}
diff --git a/gcc/testsuite/g++.dg/lto/pr101396_1.C 
b/gcc/testsuite/g++.dg/lto/pr101396_1.C
new file mode 100644
index 000..a6d032d694d
--- /dev/null
+++ b/gcc/testsuite/g++.dg/lto/pr101396_1.C
@@ -0,0 +1,10 @@
+enum A : __UINT64_TYPE__ { // { dg-lto-note "6: an enum with different value 
name is defined in another translation unit" }
+  a, // { dg-lto-note "3: mismatching definition" }
+  b,
+  c
+};
+
+int f(enum A x)
+{
+  return (int) x;
+}
-- 
2.32.0

Re: rs6000: Generate an lxvp instead of two adjacent lxv instructions

2021-07-10 Thread segher

On Fri, Jul 09, 2021 at 06:14:49PM -0500, Peter Bergner wrote:
> Ok, I removed the consecutive_mem_locations() function from the previous
> patch and just call adjacent_mem_locations() directly now.  I also moved
> rs6000_split_multireg_move() to later in the file to fix the declaration
> issue.  However, since rs6000_split_multireg_move() is where the new code
> was added to emit the lxvp's, it can be hard to see what I changed because
> of the move.  I'll note that all of my changes are restricted to within the
> 
>   if (GET_CODE (src) == UNSPEC)
> {
>   gcc_assert (XINT (src, 1) == UNSPEC_MMA_ASSEMBLE);
> ...
>   }
> 
> ...code section.  Does this look better?  I'm currently running bootstraps
> and regtests on LE and BE.

It is very hard to see the differences now.  Don't fold the changes into
one patch, just have the code movement in a separate trivial patch, and
then the actual changes as a separate patch?  That way it is much easier
to review :-)

> +   unsigned subreg =
> + (WORDS_BIG_ENDIAN) ? i : (nregs - reg_mode_nregs - i);

This is not new code, but it caught my eye, so just for the record: the
"=" should start a new line:
  unsigned subreg
= WORDS_BIG_ENDIAN ? i : (nregs - reg_mode_nregs - i);
(and don't put parens around random words please :-) ).


> +   int nvecs = XVECLEN (src, 0);
> +   for (int i = 0; i < nvecs; i++)
> + {
> +   rtx opnd;

Just "op" (and "op2") please?  If you use long names you might as well
just spell "operand" :-)

> +   if (WORDS_BIG_ENDIAN)
> + opnd = XVECEXP (src, 0, i);
> +   else
> + opnd = XVECEXP (src, 0, nvecs - i - 1);

Put this together with the case below as well?  Probably keep the
WORDS_BIG_ENDIAN test as the outer "if"?

> +   /* If we are loading an even VSX register and the memory location
> +  is adjacent to the next register's memory location (if any),
> +  then we can load them both with one LXVP instruction.  */
> +   if ((regno & 1) == 0)
> + {
> +   if (WORDS_BIG_ENDIAN)
> + {
> +   rtx opnd2 = XVECEXP (src, 0, i + 1);
> +   if (adjacent_mem_locations (opnd, opnd2) == opnd)
> + {
> +   opnd = adjust_address (opnd, OOmode, 0);
> +   /* Skip the next register, since we're going to
> +  load it together with this register.  */
> +   i++;
> + }
> + }
> +   else
> + {
> +   rtx opnd2 = XVECEXP (src, 0, nvecs - i - 2);
> +   if (adjacent_mem_locations (opnd2, opnd) == opnd2)
> + {
> +   opnd = adjust_address (opnd2, OOmode, 0);
> +   /* Skip the next register, since we're going to
> +  load it together with this register.  */
> +   i++;
> + }
> + }
> + }

I think it is fine now, but please factor the patch and repost.  Thanks!


Segher

[PATCH] move the (a-b) CMP 0 ? (a-b) : (b-a) optimization from fold_cond_expr_with_comparison to match

2021-07-10 Thread apinski--- via Gcc-patches

From: Andrew Pinski 

This patch moves the (a-b) CMP 0 ? (a-b) : (b-a) optimization
from fold_cond_expr_with_comparison to match.
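
For orientation, here are two of the source forms involved, written out in
C. The function names are illustrative only (the real coverage is in the
new phi-opt-25.c test), and the inputs are assumed small enough that the
subtractions do not overflow.

```c
/* (a - b) >= 0 ? (a - b) : (b - a), which for signed types is the
   same as abs (a - b).  */
int
select_abs (int a, int b)
{
  int c = a - b;
  return c >= 0 ? c : b - a;
}

/* (a - b) == 0 ? (a - b) : (b - a): both arms agree with b - a,
   because a - b == 0 implies b - a == 0.  */
int
select_eq (int a, int b)
{
  int c = a - b;
  return c == 0 ? c : b - a;
}
```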

OK? Bootstrapped and tested on x86_64-linux-gnu.

gcc/ChangeLog:

* match.pd ((A-B) CMP 0 ? (A-B) : (B - A)):
New patterns.

gcc/testsuite/ChangeLog:

* gcc.dg/tree-ssa/phi-opt-25.c: New test.
---
 gcc/match.pd   | 48 --
 gcc/testsuite/gcc.dg/tree-ssa/phi-opt-25.c | 45 
 2 files changed, 90 insertions(+), 3 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/phi-opt-25.c

diff --git a/gcc/match.pd b/gcc/match.pd
index 30680d488ab..aa88381fdcb 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -4040,9 +4040,7 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
   (cnd (logical_inverted_value truth_valued_p@0) @1 @2)
   (cnd @0 @2 @1)))
 
-/* abs/negative simplifications moved from fold_cond_expr_with_comparison,
-   Need to handle (A - B) case as fold_cond_expr_with_comparison does.
-   Need to handle UN* comparisons.
+/* abs/negative simplifications moved from fold_cond_expr_with_comparison.
 
None of these transformations work for modes with signed
zeros.  If A is +/-0, the first two transformations will
@@ -4098,6 +4096,50 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
(convert (negate (absu:utype @0
(negate (abs @0)
  )
+
+ /* (A - B) == 0 ? (A - B) : (B - A)same as (B - A) */
+ (for cmp (eq uneq)
+  (simplify
+   (cnd (cmp (minus@0 @1 @2) zerop) @0 (minus@3 @2 @1))
+(if (!HONOR_SIGNED_ZEROS (type))
+ @3))
+  (simplify
+   (cnd (cmp (minus@0 @1 @2) zerop) integer_zerop (minus@3 @2 @1))
+(if (!HONOR_SIGNED_ZEROS (type))
+ @3))
+  (simplify
+   (cnd (cmp @1 @2) integer_zerop (minus@3 @2 @1))
+(if (!HONOR_SIGNED_ZEROS (type))
+ @3))
+ )
+ /* (A - B) != 0 ? (A - B) : (B - A)same as (A - B) */
+ (for cmp (ne ltgt)
+  (simplify
+   (cnd (cmp (minus@0 @1 @2) zerop) @0 (minus @2 @1))
+(if (!HONOR_SIGNED_ZEROS (type))
+ @0))
+ )
+ /* (A - B) >=/> 0 ? (A - B) : (B - A)same as abs (A - B) */
+ (for cmp (ge gt)
+  (simplify
+   (cnd (cmp (minus@0 @1 @2) zerop) @0 (minus @2 @1))
+(if (!HONOR_SIGNED_ZEROS (type)
+&& !TYPE_UNSIGNED (type))
+ (abs @0
+ /* (A - B) <=/< 0 ? (A - B) : (B - A)same as -abs (A - B) */
+ (for cmp (le lt)
+  (simplify
+   (cnd (cmp (minus@0 @1 @2) zerop) @0 (minus @2 @1))
+(if (!HONOR_SIGNED_ZEROS (type)
+&& !TYPE_UNSIGNED (type))
+ (if (ANY_INTEGRAL_TYPE_P (type)
+ && !TYPE_OVERFLOW_WRAPS (type))
+  (with {
+   tree utype = unsigned_type_for (type);
+   }
+   (convert (negate (absu:utype @0
+   (negate (abs @0)
+ )
 )
 
 /* -(type)!A -> (type)A - 1.  */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/phi-opt-25.c 
b/gcc/testsuite/gcc.dg/tree-ssa/phi-opt-25.c
new file mode 100644
index 000..0f0e3170f8d
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/phi-opt-25.c
@@ -0,0 +1,45 @@
+/* { dg-options "-O2 -fno-signed-zeros -fdump-tree-phiopt" } */
+int minus1(int a, int b)
+{
+  int c = a - b;
+  if (c == 0) c = b - a;
+  return c;
+}
+int minus2(int a, int b)
+{
+  int c = a - b;
+  if (c != 0) c = b - a;
+  return c;
+}
+int minus3(int a, int b)
+{
+  int c = a - b;
+  if (c == 0) c = 0;
+  else c = b - a;
+  return c;
+}
+int minus4(int a, int b)
+{
+  int c;
+  if (a == b) c = 0;
+  else
+c = b - a;
+  return c;
+}
+int abs0(int a, int b)
+{
+  int c = a - b;
+  if (c <= 0) c = b - a;
+  return c;
+}
+int negabs(int a, int b)
+{
+  int c = a - b;
+  if (c >= 0) c = b - a;
+  return c;
+}
+
+/* The above should be optimized at phiopt1 except for negabs which has to wait
+  until phiopt2 as -abs is not acceptable in early phiopt.  */
+/* { dg-final { scan-tree-dump-times "if" 1  "phiopt1"  } } */
+/* { dg-final { scan-tree-dump-not "if" "phiopt2" } } */
-- 
2.27.0

Re: [PATCH 0/2] RISC-V: Add ldr/str instruction for T-HEAD.

2021-07-10 Thread ALO via Gcc-patches

Hi,

Ping.

@Jim @kito

— Jojo
On 9 Jul 2021 at 9:30 AM +0800, ALO wrote:
> Hi,
>   Ping.
>
> — Jojo
> On 29 Jun 2021 at 4:11 PM +0800, Jojo R wrote:
> > T-HEAD adds some custom ISA extensions to its cores.
> > The patches add support for ldr/str insns, which resemble Arm's LDR insn:
> > the memory model is a base register indexed by an (optionally scaled) register.
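
The addressing mode described above can be sketched in C as follows. This
is only an illustration of "base plus optionally scaled index"; the function
names are made up, and nothing here reflects the actual T-HEAD mnemonics or
encodings, which are not shown in this message.

```c
#include <stdint.h>

/* Effective address of a register-indexed access:
   base + (index << shift).  A shift of 0 means the index is unscaled.  */
uintptr_t
indexed_address (uintptr_t base, uintptr_t index, unsigned shift)
{
  return base + (index << shift);
}

/* Source-level access that a compiler could map onto a single indexed
   load: the index is implicitly scaled by sizeof (int32_t).  */
int32_t
load_indexed (const int32_t *base, long index)
{
  return base[index];
}
```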
