Hi!

On 2020-04-02T11:12:48+0200, Richard Biener <rguent...@suse.de> wrote:
> On Wed, 1 Apr 2020, Jason Merrill wrote:
>
>> On 4/1/20 9:36 AM, Richard Biener wrote:
>> > This does away with enabling -ffinite-loops at -O2+ for all languages
>> > and instead enables it selectively for C++ only.

> I'm retesting the following [...]

..., which got pushed in commit 75efe9cb1f8938a713ce540dc3b27bc2afcd3fae
"c/94392 - only enable -ffinite-loops for C++".

I pushed the attached in commit 4f6a0888de52a2e523a6fd4235fe7f8193819c3b
'Revert "[nvptx, libgomp] Update pr85381-{2,4}.c test-cases" [PR89713,
PR94392]'.  As can be observed in two nvptx offloading test cases
regressing, 'apparently now again "empty oacc loops are" no longer
"removed before expand"' (quoting myself from the commit log, adapting
Tom's commit log snippet from the reverted commit).

It's not obvious to me how the "finite loop" property discussed/changed
in Richard's commit 75efe9cb1f8938a713ce540dc3b27bc2afcd3fae "c/94392 -
only enable -ffinite-loops for C++" relates to the previously observed
optimization of removing "empty oacc loops [...] before expand" (after
PR89713 commit c29c92c789d93848cc1c929838771bfc68cb272c "PR
tree-optimization/89713 - Assume loop with an exit is finite"), but
examining that in detail is for another day.


Grüße
 Thomas
-----------------
Mentor Graphics (Deutschland) GmbH, Arnulfstraße 201, 80634 München / Germany
Registergericht München HRB 106955, Geschäftsführer: Thomas Heurung, Alexander Walter

From 4f6a0888de52a2e523a6fd4235fe7f8193819c3b Mon Sep 17 00:00:00 2001
From: Thomas Schwinge <tho...@codesourcery.com>
Date: Fri, 3 Apr 2020 10:07:16 +0200
Subject: [PATCH] Revert "[nvptx, libgomp] Update pr85381-{2,4}.c test-cases"
 [PR89713, PR94392]

In response to PR94392 commit 75efe9cb1f8938a713ce540dc3b27bc2afcd3fae
"c/94392 - only enable -ffinite-loops for C++", this reverts PR89713 commit
00908992f2a78f213d227aea8dbab014a1361df0, as apparently now again "empty oacc
loops are" no longer "removed before expand".

	libgomp/
	PR tree-optimization/89713
	PR c/94392
	* testsuite/libgomp.oacc-c-c++-common/pr85381-2.c: Again expect
	'bar.sync'.
	* testsuite/libgomp.oacc-c-c++-common/pr85381-4.c: Likewise.
---
 libgomp/ChangeLog                             |  8 ++++++++
 .../libgomp.oacc-c-c++-common/pr85381-2.c     | 20 ++++++++++++++++++-
 .../libgomp.oacc-c-c++-common/pr85381-4.c     |  5 ++++-
 3 files changed, 31 insertions(+), 2 deletions(-)

diff --git a/libgomp/ChangeLog b/libgomp/ChangeLog
index 6c437930b02f..3716f559aa1c 100644
--- a/libgomp/ChangeLog
+++ b/libgomp/ChangeLog
@@ -1,3 +1,11 @@
+2020-04-03  Thomas Schwinge  <tho...@codesourcery.com>
+
+	PR tree-optimization/89713
+	PR c/94392
+	* testsuite/libgomp.oacc-c-c++-common/pr85381-2.c: Again expect
+	'bar.sync'.
+	* testsuite/libgomp.oacc-c-c++-common/pr85381-4.c: Likewise.
+
 2020-03-31  Tobias Burnus  <tob...@codesourcery.com>
 
 	* target.c (GOMP_target_enter_exit_data): Handle PSET/MAP_POINTER.
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/pr85381-2.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/pr85381-2.c
index 2cb5b95949de..6570c64afff5 100644
--- a/libgomp/testsuite/libgomp.oacc-c-c++-common/pr85381-2.c
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/pr85381-2.c
@@ -15,4 +15,22 @@ main (void)
   return 0;
 }
 
-/* { dg-final { scan-assembler-times "bar.sync" 0 } } */
+/* Todo: Boths bar.syncs can be removed.
+   Atm we generate this dead code inbetween forked and joining:
+
+     mov.u32 %r28, %ntid.y;
+     mov.u32 %r29, %tid.y;
+     add.u32 %r30, %r29, %r29;
+     setp.gt.s32 %r31, %r30, 19;
+     @%r31 bra $L2;
+     add.u32 %r25, %r28, %r28;
+     mov.u32 %r24, %r30;
+   $L3:
+     add.u32 %r24, %r24, %r25;
+     setp.le.s32 %r33, %r24, 19;
+     @%r33 bra $L3;
+   $L2:
+
+   so the loop is not recognized as empty loop (which we detect by seeing if
+   joining immediately follows forked).  */
+/* { dg-final { scan-assembler-times "bar.sync" 2 } } */
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/pr85381-4.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/pr85381-4.c
index e8a433ffc0a5..d955d79718df 100644
--- a/libgomp/testsuite/libgomp.oacc-c-c++-common/pr85381-4.c
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/pr85381-4.c
@@ -21,4 +21,7 @@ main (void)
   return 0;
 }
 
-/* { dg-final { scan-assembler-times "bar.sync" 0 } } */
+/* Atm, %ntid.y is broadcast from one loop to the next, so there are 2 bar.syncs
+   for that (the other two are there for the same reason as in pr85381-2.c).
+   Todo: Recompute %ntid.y instead of broadcasting it.  */
+/* { dg-final { scan-assembler-times "bar.sync" 4 } } */
-- 
2.25.1
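
For readers who want to follow the dead code quoted in the pr85381-2.c comment, here is a rough C model of that PTX sequence.  It is only a sketch under stated assumptions: the function name 'residual_loop' and the parameters 'tid' and 'ntid' (standing in for %tid.y and %ntid.y) are made up for illustration, and the 'assert' is an added precondition that does not exist in the PTX.  The loop has an exit and no side effects, which is the shape that -ffinite-loops (assume that a loop with an exit will eventually take the exit) allows the compiler to delete.  Whether that is what actually keeps the loop alive in these test cases has not been examined here.

```c
#include <assert.h>

/* Illustrative C model of the residual PTX loop; not taken from GCC or
   from the test case.  Small non-negative inputs are assumed, so the
   signed additions do not overflow.  Returns the final counter value.  */
int
residual_loop (int tid, int ntid)
{
  /* Added precondition (an assumption of this sketch): with a step of
     zero the loop below would never reach its exit.  */
  assert (tid >= 0 && ntid > 0);

  int i = tid + tid;            /* add.u32 %r30, %r29, %r29 */
  if (i > 19)                   /* setp.gt.s32 %r31, %r30, 19; bra $L2 */
    return i;

  int step = ntid + ntid;       /* add.u32 %r25, %r28, %r28 */
  do
    i += step;                  /* $L3: add.u32 %r24, %r24, %r25 */
  while (i <= 19);              /* setp.le.s32 %r33, %r24, 19; bra $L3 */

  return i;                     /* $L2 */
}
```

For example, 'residual_loop (0, 1)' steps through 2, 4, ..., 20 and returns 20, while 'residual_loop (10, 1)' skips the loop entirely because the start value is already above 19.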