This is a 4-part patchkit to address various deficiencies in our DSE
implementation.
BZ33562 was the inspiration for this work. BZ33562 is a low-priority
regression that's been around for a long time. Patch #1 addresses
BZ33562 ("aggregate DSE disabled") and also implements trimming of a
complex assignment when just one half of it is dead.
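As a rough illustration of the complex case (a hypothetical example, not taken from BZ33562 or from the patch's testsuite), consider a store of both halves of a complex value followed by an overwrite of just the real half. The function name is made up, and `__real__` is the GNU C extension for addressing one half of a complex object:

```c
#include <complex.h>

/* Hypothetical example: the first store writes both halves of *p,
   but the real half is overwritten before anything reads it.  Only
   the imaginary half of the first store is still live, so a DSE that
   trims complex assignments could, in principle, reduce the first
   store to a store of the imaginary half alone.  */
void
store_then_overwrite_real (double complex *p, double re)
{
  *p = 3.0 + 4.0 * I;   /* writes real and imaginary halves */
  __real__ *p = re;     /* kills the real half of the store above */
}
```

Whether the trim actually fires on this exact input depends on what earlier passes do with it; the sketch only shows the shape of code the description refers to.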
Discussions with Richi last year, a review of bugs in both the LLVM and
GCC databases, and code instrumentation resulted in patches 2-4.
Patch #2 implements trimming of CONSTRUCTOR initializations. This
addresses BZ61912 and BZ77485, and it gets the most static hits of all
the improvements.
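A minimal sketch of the kind of source that produces a partially dead CONSTRUCTOR store (hypothetical; the struct and function names are invented and this is not one of the BZ testcases). An aggregate is zero-initialized, which typically shows up as a CONSTRUCTOR store, and most of it is then overwritten:

```c
#include <string.h>

/* Hypothetical 64-byte aggregate used only for illustration.  */
struct packet
{
  char header[8];
  char payload[56];
};

void
fill_packet (struct packet *out, const char *payload)
{
  struct packet p = { 0 };      /* zero-initializes all 64 bytes */

  /* Overwrites the trailing 56 bytes, so only the zeroing of the
     8 header bytes in the initialization above is still needed.  */
  memcpy (p.payload, payload, sizeof p.payload);
  *out = p;
}
```

Here the tail of the initialization is dead, so in principle only the first 8 bytes need to be cleared.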
Patch #3 implements trimming of mem* calls. We trim from the front or
back of the store. This doesn't hit as much as #2, but still happens
quite often. There is no BZ for this deficiency.
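For illustration only (a hypothetical function, not from the patch), here is a memset whose head and tail are both overwritten before being read. The sizes and fill values are arbitrary assumptions:

```c
#include <string.h>

/* Hypothetical example; buf must point to at least 64 bytes.  */
void
init_buffer (unsigned char *buf)
{
  memset (buf, 0, 64);          /* writes bytes 0..63 */
  memset (buf, 0xAA, 16);       /* overwrites 0..15: head of the first store is dead */
  memset (buf + 48, 0x55, 16);  /* overwrites 48..63: tail of the first store is dead */
}
```

Only bytes 16..47 of the first memset are live, so trimming from the front and the back would leave the equivalent of `memset (buf + 16, 0, 32)`.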
Patch #4 adds the ability to look through loads which may read from the
same memory as the potentially dead store, but which can be proven to
read only from currently dead bytes within the object. This hits just
once in the compiler & runtime libraries, but it does hit often in the
libstdc++ testsuite. There is no BZ for this deficiency.
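A sketch of the situation being described, using invented names (this is an assumption about the shape of the code, not a testcase from the patch). The load in the middle touches the same object as the first store, but only bytes of that store which have already been overwritten:

```c
/* Hypothetical two-field aggregate used only for illustration.  */
struct pair
{
  int a;
  int b;
};

int
reinit_pair (struct pair *p)
{
  *p = (struct pair) { 0, 0 };  /* store S: writes both a and b */
  p->a = 1;                     /* the bytes of S covering a are now dead */
  int t = p->a;                 /* reads only bytes of S that are already dead */
  p->b = t + 1;                 /* the bytes of S covering b are now dead too */
  return t;
}
```

Walking forward from S, the load of `p->a` does not need to keep S alive, since it cannot observe any byte S wrote that is still live; once `p->b` is stored, all of S is dead. In practice earlier passes may well simplify this particular function before DSE sees it.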
There are dependencies as we walk forward through the patchkit. Each patch
has been bootstrapped & tested with its previous patch(es).
There is much more that could be done beyond the series of 4 patches in
this patchkit. Richi has pointed out that SRA and DSE could probably
share a lot of analysis and transformation code. There may even be
advantages to having the two optimizations integrated into a single
pass. I haven't investigated any of that yet (though we are using a bit
of code from SRA in this kit).
We also need to look at store sinking again. I saw a patch from Richi
back in July that looked reasonable at a high level and would likely
allow resolution of multiple BZs.
Jeff
- [PATCH 0/4] Improve DSE implementation Jeff Law