On 02/18/2016 02:56 AM, Richard Biener wrote:
On Wed, Feb 17, 2016 at 5:10 PM, Jeff Law <l...@redhat.com> wrote:
On 02/17/2016 07:13 AM, Richard Biener wrote:
- /* Continue walking until we reach a kill. */
- while (!stmt_kills_ref_p (temp, ref));
+ /* Continue walking until we reach a full kill as a single statement
+ or there are no more live bytes. */
+ while (!stmt_kills_ref_p (temp, ref)
+ && !(live_bytes && bitmap_empty_p (live_bytes)));
Just a short quick comment - the above means you only handle partial
stores with no intervening uses. You don't handle, say
struct S { struct R { int x; int y; } r; int z; } s;
s = { {1, 2}, 3 };
s.r.x = 1;
s.r.y = 2;
struct R r = s.r;
s.z = 3;
where s = { {1, 2}, 3} is still dead.
Right. But handling that has never been part of DSE's design goals. Once
there's a use, DSE has always given up.
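For readers following along, the walk being discussed can be modelled outside of GCC roughly as below. This is only a sketch under stated assumptions, not the code from the patch: `struct later_stmt` and `store_is_dead` are hypothetical names, the object is limited to 64 bytes so that one `uint64_t` can stand in for the `live_bytes` bitmap, and any read of the object counts as a use.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of a statement following the candidate store:
   either a store to bytes [offset, offset + size) of the object,
   or a read (use) of the object.  */
struct later_stmt
{
  bool is_use;    /* true if the statement reads the object */
  size_t offset;  /* first byte written (stores only) */
  size_t size;    /* number of bytes written (stores only) */
};

/* Return true if a store covering bytes [0, nbytes) is dead, i.e. every
   byte is overwritten by later stores before any use is seen.
   Assumption: nbytes <= 64, one bit of LIVE per byte.  */
static bool
store_is_dead (size_t nbytes, const struct later_stmt *stmts, size_t n)
{
  assert (nbytes > 0 && nbytes <= 64);
  uint64_t live = nbytes == 64 ? UINT64_MAX : (UINT64_C (1) << nbytes) - 1;

  /* Continue walking until there are no more live bytes.  */
  for (size_t i = 0; i < n && live != 0; i++)
    {
      if (stmts[i].is_use)
        return false;  /* give up at the first use */

      size_t end = stmts[i].offset + stmts[i].size;
      for (size_t b = stmts[i].offset; b < end && b < nbytes; b++)
        live &= ~(UINT64_C (1) << b);
    }

  return live == 0;
}
```

Assuming a 4-byte int, the example above becomes stores to bytes 0-3 and 4-7, then a use, then a store to bytes 8-11. The model gives up at the use, so the aggregate store is kept even though it is in fact dead. Without the `struct R r = s.r;` statement the mask reaches zero and the store is reported dead.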
Yeah, which is why I in the end said we need a "better" DSE ...
And coming back to this -- these kinds of opportunities appear to be
rare. I found a couple in a GCC build and some in the libstdc++ testsuite.
From looking at how your test is currently handled, the combination of
SRA and early FRE tend to clean things up before DSE gets a chance.
That may account for the lack of "hits" for this improvement.
Regardless, the code is written (#4 in the recently posted series). I'm
going to add your test to the updated patch (with SRA and FRE disabled
obviously) as an additional test given there's very little coverage of
this feature outside the libstdc++ testsuite.
Yeah, I think the case we're after and that happens most is sth like
a = { aggregate init };
a.a = ...;
a.b = ...;
...
and what you add is the ability to remove the aggregate init completely.
What would be nice to have is to remove it partly as well, as for
struct { int i; int j; int k; } a = {};
a.i = 1;
a.k = 3;
we'd like to remove the whole-a zeroing but we need to keep zeroing
of a.j.
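To make the partial case concrete, here is a small standalone sketch of the byte accounting. It is an illustration only, not code from GCC: `field_mask` and `live_after_i_and_k` are hypothetical helpers, and it assumes the struct fits in 64 bytes and has no padding between its members.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct A { int i; int j; int k; };

/* Hypothetical helper: mask with one bit set for each byte in
   [offset, offset + size).  Assumes offset + size <= 64 and size < 64.  */
static uint64_t
field_mask (size_t offset, size_t size)
{
  assert (size < 64 && offset + size <= 64);
  return ((UINT64_C (1) << size) - 1) << offset;
}

/* Bytes of the zeroing `a = {}` that are still live after
   `a.i = 1; a.k = 3;`.  */
static uint64_t
live_after_i_and_k (void)
{
  uint64_t live = field_mask (0, sizeof (struct A));
  live &= ~field_mask (offsetof (struct A, i), sizeof (int));
  live &= ~field_mask (offsetof (struct A, k), sizeof (int));
  return live;
}
```

The remaining mask is non-zero and covers exactly the bytes of a.j, so the whole-a zeroing cannot be deleted outright but could in principle be narrowed to a store of zero to a.j.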
I believe that SRA already has most of the analysis part, what it is
lacking is that SRA is not flow-sensitive (it just gathers
function-wide data) and that it doesn't consider base objects that
have their address taken or are pointer-based.
So that's handled by patch #2 and it's (by far) the most effective part
of this work in terms of hits and reducing the number of stored bytes.
Patch #2 has a few tests for this case and it is well exercised by a
bootstrap as well, so I don't think your testcase provides any additional
coverage.
Jeff