Hi!
On 2022-10-21T00:44:30+0200, Aldy Hernandez <[email protected]> wrote:
> On Thu, Oct 20, 2022 at 9:22 PM Thomas Schwinge <[email protected]>
> wrote:
>> "Add 'gcc.dg/tree-ssa/pr107195-3.c' [PR107195]" attached?
>
> I see 7 different tests in this patch. Did the 6 that pass, fail
> before my patch for PR107195 and are now working? Cause unless
> that's the case, they shouldn't be in a test named pr107195-3.c, but
> somewhere else.
That's correct; I should've mentioned that I had verified this. With the
code changes of commit r13-3217-gc4d15dddf6b9eacb36f535807ad2ee364af46e04
"[PR107195] Set range to zero when nonzero mask is 0" reverted, we get:
PASS: gcc.dg/tree-ssa/pr107195-3.c (test for excess errors)
FAIL: gcc.dg/tree-ssa/pr107195-3.c scan-tree-dump-times dom3 "gimple_call
<foo1," 1
FAIL: gcc.dg/tree-ssa/pr107195-3.c scan-tree-dump-times dom3 "gimple_call
<foo2," 1
FAIL: gcc.dg/tree-ssa/pr107195-3.c scan-tree-dump-times dom3 "gimple_call
<foo3," 1
FAIL: gcc.dg/tree-ssa/pr107195-3.c scan-tree-dump-times dom3 "gimple_call
<foo4," 1
FAIL: gcc.dg/tree-ssa/pr107195-3.c scan-tree-dump-times dom3 "gimple_call
<foo5," 1
FAIL: gcc.dg/tree-ssa/pr107195-3.c scan-tree-dump-times dom3 "gimple_call
<foo6," 1
..., and in 'pr107195-3.c.196t.dom3' instead see two calls of each
'foo[...]' function.
That's with this...
> I see there's one XFAILed test in your patch
... XFAILed test case removed, see the attached
"Add 'gcc.dg/tree-ssa/pr107195-3.c' [PR107195]";
OK now to push that version?
> and this certainly
> doesn't look like something that has anything to do with the patch I
> submitted. Perhaps you could open a PR with an enhancement request
> for this one?
>
> That being said...
>
> /* { dg-additional-options -O1 } */
> extern int
> __attribute__((const))
> foo4b (int);
>
> int f4b (unsigned int r)
> {
> if (foo4b (r))
> r *= 8U;
>
> if ((r / 2U) & 2U)
> r += foo4b (r);
>
> return r;
> }
> /* { dg-final { scan-tree-dump-times {gimple_call <foo4b,} 1 dom3 {
> xfail *-*-* } } } */
>
> At -O2, this is something PRE is doing, so GCC already handles this.
> However, you are suggesting this isn't handled at -O1 and should be??
My thinking was that this optimization does work for 'r >> 1', but it
doesn't work for 'r / 2'.
> None of the VRPs run at -O1 so ranger-vrp won't even get a chance.
> However, DOM runs at -O1 and it uses ranger to do simple copy
> propagation and some jump threading...so technically we could do
> something...
>
> DOM should be able to thread from the r *= 8U to the return because
> the nonzero mask (known zeros) after the multiplication is 0xfffffff8,
> which it could use to solve the second conditional as false. This
> would leave us with:
>
> if (foo4b (r))
> {
> r *= 8U;
> return r;
> }
> else
> {
> if ((r / 2U) & 2U)
> r += foo4b (r);
> }
>
> ...which exposes the fact that the second call to foo4b() has the same
> "r" as the first one, so it could be folded. I don't know whose job
> it is to notice that two const calls have the same arguments, but ISTM
> that if we thread the above correctly, someone should be able to clean
> this up. No clue whether this happens at -O1.
>
> However... we're not threading this. It looks like we're not keeping
> track of nonzero bits (known zeros) through the division. The
> multiplication gives us 0xfffffff8 and we should be able to divide
> that by 2 and get 0x7ffffffc which solves the second conditional to 0.
>
> So...maybe DOM+ranger could set things up for another pass to clean this up?
>
> Either way, you could open an enhancement request, if anything to keep
> the nonzero mask up to date through the division.
I've thus filed <https://gcc.gnu.org/PR107342>
"Optimization opportunity where integer '/' corresponds to '>>'" for
continuing that investigation.
Grüße
Thomas
-----------------
Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634
München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas
Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht
München, HRB 106955
>From e55e8569201c482507550eb56ff16aa3bbb48676 Mon Sep 17 00:00:00 2001
From: Thomas Schwinge <[email protected]>
Date: Mon, 17 Oct 2022 09:10:03 +0200
Subject: [PATCH] Add 'gcc.dg/tree-ssa/pr107195-3.c' [PR107195]
... to display optimization performed as of recent
commit r13-3217-gc4d15dddf6b9eacb36f535807ad2ee364af46e04
"[PR107195] Set range to zero when nonzero mask is 0".
PR tree-optimization/107195
gcc/testsuite/
* gcc.dg/tree-ssa/pr107195-3.c: New.
---
gcc/testsuite/gcc.dg/tree-ssa/pr107195-3.c | 112 +++++++++++++++++++++
1 file changed, 112 insertions(+)
create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/pr107195-3.c
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr107195-3.c b/gcc/testsuite/gcc.dg/tree-ssa/pr107195-3.c
new file mode 100644
index 00000000000..eba4218b3c9
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr107195-3.c
@@ -0,0 +1,112 @@
+/* Inspired by 'libgomp.oacc-c-c++-common/nvptx-sese-1.c'. */
+
+/* { dg-additional-options -O1 } */
+/* { dg-additional-options -fdump-tree-dom3-raw } */
+
+
+extern int
+__attribute__((const))
+foo1 (int);
+
+int f1 (int r)
+{
+ if (foo1 (r)) /* If this first 'if' holds... */
+ r *= 2; /* ..., 'r' now has a zero-value lower-most bit... */
+
+ if (r & 1) /* ..., so this second 'if' can never hold... */
+ { /* ..., so this is unreachable. */
+ /* In constrast, if the first 'if' does not hold ('foo1 (r) == 0'), the
+ second 'if' may hold, but we know ('foo1' being 'const') that
+ 'foo1 (r) == 0', so don't have to re-evaluate it here: */
+ r += foo1 (r);
+ }
+
+ return r;
+}
+/* Thus, if optimizing, we only ever expect one call of 'foo1'.
+ { dg-final { scan-tree-dump-times {gimple_call <foo1,} 1 dom3 } } */
+
+
+extern int
+__attribute__((const))
+foo2 (int);
+
+int f2 (int r)
+{
+ if (foo2 (r))
+ r *= 8;
+
+ if (r & 7)
+ r += foo2 (r);
+
+ return r;
+}
+/* { dg-final { scan-tree-dump-times {gimple_call <foo2,} 1 dom3 } } */
+
+
+extern int
+__attribute__((const))
+foo3 (int);
+
+int f3 (int r)
+{
+ if (foo3 (r))
+ r <<= 4;
+
+ if ((r & 64) && ((r & 8) || (r & 4) || (r & 2) || (r & 1)))
+ r += foo3 (r);
+
+ return r;
+}
+/* { dg-final { scan-tree-dump-times {gimple_call <foo3,} 1 dom3 } } */
+
+
+extern int
+__attribute__((const))
+foo4 (int);
+
+int f4 (int r)
+{
+ if (foo4 (r))
+ r *= 8;
+
+ if ((r >> 1) & 2)
+ r += foo4 (r);
+
+ return r;
+}
+/* { dg-final { scan-tree-dump-times {gimple_call <foo4,} 1 dom3 } } */
+
+
+extern int
+__attribute__((const))
+foo5 (int);
+
+int f5 (int r) /* Works for both 'signed' and 'unsigned'. */
+{
+ if (foo5 (r))
+ r *= 2;
+
+ if ((r % 2) != 0)
+ r += foo5 (r);
+
+ return r;
+}
+/* { dg-final { scan-tree-dump-times {gimple_call <foo5,} 1 dom3 } } */
+
+
+extern int
+__attribute__((const))
+foo6 (int);
+
+int f6 (unsigned int r) /* 'unsigned' is important here. */
+{
+ if (foo6 (r))
+ r *= 2;
+
+ if ((r % 2) == 1)
+ r += foo6 (r);
+
+ return r;
+}
+/* { dg-final { scan-tree-dump-times {gimple_call <foo6,} 1 dom3 } } */
--
2.25.1