Hi!

On 2022-10-21T00:44:30+0200, Aldy Hernandez <al...@redhat.com> wrote:
> On Thu, Oct 20, 2022 at 9:22 PM Thomas Schwinge <tho...@codesourcery.com> wrote:
>> "Add 'gcc.dg/tree-ssa/pr107195-3.c' [PR107195]" attached?
>
> I see 7 different tests in this patch.  Did the 6 that pass, fail
> before my patch for PR107195 and are now working?   Cause unless
> that's the case, they shouldn't be in a test named pr107195-3.c, but
> somewhere else.

That's correct; I should've mentioned that I had verified this.  With the
code changes of commit r13-3217-gc4d15dddf6b9eacb36f535807ad2ee364af46e04
"[PR107195] Set range to zero when nonzero mask is 0" reverted, we get:

    PASS: gcc.dg/tree-ssa/pr107195-3.c (test for excess errors)
    FAIL: gcc.dg/tree-ssa/pr107195-3.c scan-tree-dump-times dom3 "gimple_call <foo1," 1
    FAIL: gcc.dg/tree-ssa/pr107195-3.c scan-tree-dump-times dom3 "gimple_call <foo2," 1
    FAIL: gcc.dg/tree-ssa/pr107195-3.c scan-tree-dump-times dom3 "gimple_call <foo3," 1
    FAIL: gcc.dg/tree-ssa/pr107195-3.c scan-tree-dump-times dom3 "gimple_call <foo4," 1
    FAIL: gcc.dg/tree-ssa/pr107195-3.c scan-tree-dump-times dom3 "gimple_call <foo5," 1
    FAIL: gcc.dg/tree-ssa/pr107195-3.c scan-tree-dump-times dom3 "gimple_call <foo6," 1

..., and in 'pr107195-3.c.196t.dom3' we instead see two calls of each
'foo[...]' function.

That's with this...

> I see there's one XFAILed test in your patch

... XFAILed test case removed, see the attached
"Add 'gcc.dg/tree-ssa/pr107195-3.c' [PR107195]";
OK now to push that version?


> and this certainly
> doesn't look like something that has anything to do with the patch I
> submitted.  Perhaps you could open a PR with an enhancement request
> for this one?
>
> That being said...
>
> /* { dg-additional-options -O1 } */
> extern int
> __attribute__((const))
> foo4b (int);
>
> int f4b (unsigned int r)
> {
>   if (foo4b (r))
>     r *= 8U;
>
>   if ((r / 2U) & 2U)
>     r += foo4b (r);
>
>   return r;
> }
> /* { dg-final { scan-tree-dump-times {gimple_call <foo4b,} 1 dom3 { xfail *-*-* } } } */
>
> At -O2, this is something PRE is doing,  so GCC already handles this.
> However, you are suggesting this isn't handled at -O1 and should be??

My thinking was that this optimization does work for 'r >> 1', but it
doesn't work for 'r / 2'.
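
To illustrate why I'd expect the two forms to behave the same, here is a
minimal sketch of my own (not part of the attached patch).  The names
'cond_div', 'cond_shift', and 'self_check' are made up for this email.  It
only checks, for a few 'unsigned' values, that the two conditions agree and
that neither can hold after 'r *= 8U':

```c
#include <assert.h>

/* The condition from 'f4b', written with unsigned division.  */
static int cond_div (unsigned int r)
{
  return ((r / 2U) & 2U) != 0;
}

/* The same condition written with a right shift, as in 'f4'
   (but on 'unsigned int' here).  */
static int cond_shift (unsigned int r)
{
  return ((r >> 1) & 2U) != 0;
}

/* Hypothetical self-check over a few sample values.  */
static void self_check (void)
{
  static const unsigned int samples[] = { 0U, 1U, 4U, 5U, 6U, 0xffffffffU };

  for (unsigned int i = 0; i < sizeof samples / sizeof samples[0]; i++)
    {
      unsigned int r = samples[i];
      unsigned int m = r * 8U; /* The low three bits are now zero.  */

      /* For 'unsigned', '/ 2U' and '>> 1' compute the same value.  */
      assert (cond_div (r) == cond_shift (r));

      /* After the multiplication, bit 1 of 'm / 2U' is always zero.  */
      assert (!cond_div (m));
      assert (!cond_shift (m));
    }
}
```

This is just sample-based evidence, of course, not a statement about what
DOM or ranger actually do internally.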

> None of the VRPs run at -O1 so ranger-vrp won't even get a chance.
> However, DOM runs at -O1 and it uses ranger to do simple copy
> propagation and some jump threading...so technically we could do
> something...
>
> DOM should be able to thread from the r *= 8U to the return because
> the nonzero mask (known zeros) after the multiplication is 0xfffffff8,
> which it could use to solve the second conditional as false.  This
> would leave us with:
>
> if (foo4b (r))
>   {
>     r *= 8U;
>     return r;
>   }
> else
>   {
>     if ((r / 2U) & 2U)
>       r += foo4b (r);
>   }
>
> ...which exposes the fact that the second call to foo4b() has the same
> "r" as the first one, so it could be folded.  I don't know whose job
> it is to notice that two const calls have the same arguments, but ISTM
> that if we thread the above correctly, someone should be able to clean
> this up.  No clue whether this happens at -O1.
>
> However... we're not threading this.  It looks like we're not keeping
> track of nonzero bits (known zeros) through the division.  The
> multiplication gives us 0xfffffff8 and we should be able to divide
> that by 2 and get 0x7ffffffc which solves the second conditional to 0.
>
> So...maybe DOM+ranger could set things up for another pass to clean this up?
>
> Either way, you could open an enhancement request, if anything to keep
> the nonzero mask up to date through the division.

I've thus filed <https://gcc.gnu.org/PR107342>
"Optimization opportunity where integer '/' corresponds to '>>'" for
continuing that investigation.
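
For reference, the mask arithmetic you describe can be sketched as follows.
This is only my own illustration, under the assumption that a nonzero mask
has a 1 for every bit that may be set; the helpers 'mask_mul_pow2',
'mask_udiv_pow2', and 'known_zero' are hypothetical and are not GCC
functions:

```c
#include <assert.h>
#include <stdint.h>

/* Nonzero mask after an unsigned multiplication by '1 << log2':
   every possibly-set bit moves up by 'log2' positions.  */
static uint32_t mask_mul_pow2 (uint32_t mask, unsigned int log2)
{
  return mask << log2;
}

/* Nonzero mask after an unsigned division by '1 << log2':
   every possibly-set bit moves down by 'log2' positions.  */
static uint32_t mask_udiv_pow2 (uint32_t mask, unsigned int log2)
{
  return mask >> log2;
}

/* Returns 1 if 'value & bits' is known to be zero given 'mask'.  */
static int known_zero (uint32_t mask, uint32_t bits)
{
  return (mask & bits) == 0;
}

/* Walk through 'r *= 8U; if ((r / 2U) & 2U)' starting from an
   unknown 'r' (all bits possibly set).  */
static void self_check (void)
{
  uint32_t m = mask_mul_pow2 (0xffffffffU, 3);
  assert (m == 0xfffffff8U);

  m = mask_udiv_pow2 (m, 1);
  assert (m == 0x7ffffffcU);

  /* Bit 1 is known zero, so the second conditional is false.  */
  assert (known_zero (m, 2U));
}
```

That is, if the mask were carried through the division like it is through
the shift, the second conditional could be resolved in the 'r / 2U' case,
too.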


Regards
 Thomas


From e55e8569201c482507550eb56ff16aa3bbb48676 Mon Sep 17 00:00:00 2001
From: Thomas Schwinge <tho...@codesourcery.com>
Date: Mon, 17 Oct 2022 09:10:03 +0200
Subject: [PATCH] Add 'gcc.dg/tree-ssa/pr107195-3.c' [PR107195]

... to display optimization performed as of recent
commit r13-3217-gc4d15dddf6b9eacb36f535807ad2ee364af46e04
"[PR107195] Set range to zero when nonzero mask is 0".

	PR tree-optimization/107195
	gcc/testsuite/
	* gcc.dg/tree-ssa/pr107195-3.c: New.
---
 gcc/testsuite/gcc.dg/tree-ssa/pr107195-3.c | 112 +++++++++++++++++++++
 1 file changed, 112 insertions(+)
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/pr107195-3.c

diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr107195-3.c b/gcc/testsuite/gcc.dg/tree-ssa/pr107195-3.c
new file mode 100644
index 00000000000..eba4218b3c9
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr107195-3.c
@@ -0,0 +1,112 @@
+/* Inspired by 'libgomp.oacc-c-c++-common/nvptx-sese-1.c'.  */
+
+/* { dg-additional-options -O1 } */
+/* { dg-additional-options -fdump-tree-dom3-raw } */
+
+
+extern int
+__attribute__((const))
+foo1 (int);
+
+int f1 (int r)
+{
+  if (foo1 (r)) /* If this first 'if' holds...  */
+    r *= 2; /* ..., 'r' now has a zero-value lower-most bit...  */
+
+  if (r & 1) /* ..., so this second 'if' can never hold...  */
+    { /* ..., so this is unreachable.  */
+      /* In contrast, if the first 'if' does not hold ('foo1 (r) == 0'), the
+	 second 'if' may hold, but we know ('foo1' being 'const') that
+	 'foo1 (r) == 0', so don't have to re-evaluate it here: */
+      r += foo1 (r);
+    }
+
+  return r;
+}
+/* Thus, if optimizing, we only ever expect one call of 'foo1'.
+   { dg-final { scan-tree-dump-times {gimple_call <foo1,} 1 dom3 } } */
+
+
+extern int
+__attribute__((const))
+foo2 (int);
+
+int f2 (int r)
+{
+  if (foo2 (r))
+    r *= 8;
+
+  if (r & 7)
+    r += foo2 (r);
+
+  return r;
+}
+/* { dg-final { scan-tree-dump-times {gimple_call <foo2,} 1 dom3 } } */
+
+
+extern int
+__attribute__((const))
+foo3 (int);
+
+int f3 (int r)
+{
+  if (foo3 (r))
+    r <<= 4;
+
+  if ((r & 64) && ((r & 8) || (r & 4) || (r & 2) || (r & 1)))
+    r += foo3 (r);
+
+  return r;
+}
+/* { dg-final { scan-tree-dump-times {gimple_call <foo3,} 1 dom3 } } */
+
+
+extern int
+__attribute__((const))
+foo4 (int);
+
+int f4 (int r)
+{
+  if (foo4 (r))
+    r *= 8;
+
+  if ((r >> 1) & 2)
+    r += foo4 (r);
+
+  return r;
+}
+/* { dg-final { scan-tree-dump-times {gimple_call <foo4,} 1 dom3 } } */
+
+
+extern int
+__attribute__((const))
+foo5 (int);
+
+int f5 (int r) /* Works for both 'signed' and 'unsigned'.  */
+{
+  if (foo5 (r))
+    r *= 2;
+
+  if ((r % 2) != 0)
+    r += foo5 (r);
+
+  return r;
+}
+/* { dg-final { scan-tree-dump-times {gimple_call <foo5,} 1 dom3 } } */
+
+
+extern int
+__attribute__((const))
+foo6 (int);
+
+int f6 (unsigned int r) /* 'unsigned' is important here.  */
+{
+  if (foo6 (r))
+    r *= 2;
+
+  if ((r % 2) == 1)
+    r += foo6 (r);
+
+  return r;
+}
+/* { dg-final { scan-tree-dump-times {gimple_call <foo6,} 1 dom3 } } */
-- 
2.25.1
