On 05/18/2014 02:59 PM, Jan Hubicka wrote:
Sandra,
This patch seems quite similar in purpose to the
remove_local_statics optimization that Mentor has proposed, although
the implementation is quite different. Here is the last version of
our patch, prepared by Bernd Schmidt last year:
https://gcc.gnu.org/ml/gcc-patches/2013-06/msg00317.html
Thanks for pointer, I did not notice this patch!
The approach is indeed very different. So the patch works on function basis
and cares only about local statics of functions that was not inlined?
Yes. I should probably mention here that we did the analysis and
initial implementation of this optimization 7+ years ago against GCC
4.2, and in some cases we were being conservative in deciding the
optimization was not valid because the information required for more
detailed analysis wasn't being collected in the right place back then,
etc.
The failing tests are remove-local-statics-{4,5,7,12,14b}.c.
+/* Verify that we don't eliminate a global static variable. */
+
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+/* { dg-final { scan-assembler "global_static" } } */
+
+static int global_static;
+
+int
+test1 (int x)
+{
+ global_static = x;
+
+ return global_static + x;
+}
here test1 optimizes into
global_static=x;
return x+x;
this makes global_static write only and thus it is correctly eliminated.
So I think this testcase is bogus.
Yes, I agree that this one was for a restriction of our implementation
approach.
+++ b/gcc/testsuite/gcc.dg/remove-local-statics-5.c
@@ -0,0 +1,24 @@
+/* Verify that we do not eliminate a static local variable whose uses
+ are dominated by a def when the function calls setjmp. */
+
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+/* { dg-final { scan-assembler "thestatic" } } */
+
+#include <setjmp.h>
+
+int
+foo (int x)
+{
+ static int thestatic;
+ int retval;
+ jmp_buf env;
+
+ thestatic = x;
+
+ retval = thestatic + x;
+
+ setjmp (env);
+
+ return retval;
+}
I belive this is similar case. I do not see setjmp changing anything here,
since
local optimizers turns retval = x+x;
What it was intended to test?
Hmmmm, I'm guessing this was some concern about invalid code motion
around a setjmp. Our original analysis document lists "F does not call
setjmp" as a requirement for the optimization, so this was probably a
case where we were being excessively conservative.
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/remove-local-statics-7.c
@@ -0,0 +1,19 @@
+/* Verify that we eliminate a static local variable where it is defined
+ along all paths leading to a use. */
+
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+/* { dg-final { scan-assembler-not "thestatic" } } */
+
+int
+test1 (int x)
+{
+ static int thestatic;
+
+ if (x < 0)
+ thestatic = x;
+ else
+ thestatic = -x;
+
+ return thestatic + x;
+}
Here we get after early optimizations:
int
test1 (int x)
{
static int thestatic;
int thestatic.0_5;
int thestatic.1_7;
int _8;
<bb 2>:
if (x_2(D) < 0)
goto <bb 3>;
else
goto <bb 4>;
<bb 3>:
thestatic = x_2(D);
goto <bb 5>;
<bb 4>:
thestatic.0_5 = -x_2(D);
thestatic = thestatic.0_5;
<bb 5>:
thestatic.1_7 = thestatic;
_8 = thestatic.1_7 + x_2(D);
return _8;
}
and thus we still have bogus read from thestatic. Because my analysis works at
IPA level,
we won't benefit from fact that dom2 eventually cleans it up as:
int
test1 (int x)
{
static int thestatic;
int thestatic.0_5;
int thestatic.1_7;
int _8;
int prephitmp_10;
<bb 2>:
if (x_2(D) < 0)
goto <bb 3>;
else
goto <bb 4>;
<bb 3>:
thestatic = x_2(D);
goto <bb 5>;
<bb 4>:
thestatic.0_5 = -x_2(D);
thestatic = thestatic.0_5;
<bb 5>:
# prephitmp_10 = PHI <x_2(D)(3), thestatic.0_5(4)>
thestatic.1_7 = prephitmp_10;
_8 = thestatic.1_7 + x_2(D);
return _8;
}
Richi, is there a way to teach early FRE to get this transformation?
I see it is a partial redundancy problem...
Hmmmm, bummer that we don't get this one for free. :-(
+/* Verify that we do not eliminate a static variable when it is declared
+ in a function that has nested functions. */
+
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+/* { dg-final { scan-assembler "thestatic" } } */
+
+int test1 (int x)
+{
+ static int thestatic;
+
+ int nested_test1 (int x)
+ {
+ return x + thestatic;
+ }
+
+ thestatic = x;
+
+ return thestatic + x + nested_test1 (x);
+}
Here we work hard enough to optimize test1 as:
int
test1 (int x)
{
static int thestatic;
int _4;
int _5;
<bb 2>:
thestatic = x_2(D);
_4 = x_2(D) + x_2(D);
_5 = _4 + _4;
return _5;
}
thus inlining nested_test1 during early optimization. This makes the removal
valid.
Yes. This is one we had to punt on due to the one-function-at-a-time
approach.
+/* Verify that we do not eliminate a static local variable if the function
+ containing it is inlined. */
+
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+/* { dg-final { scan-assembler "thestatic" } } */
+
+int
+test2 (int x)
+{
+ if (x < 0)
+ return 0;
+ else
+ return test1 (x - 1);
+}
+
+inline int
+test1 (int x)
+{
+ static int thestatic;
+ int y;
+
+ thestatic = x;
+
+ y = test2 (thestatic - 1);
+
+ return y + x;
+}
Here thestatic becomes write only during early optimization, so again we can
correctly eliminate it.
OK.
Sandra,
do you think you can drop the testcases that are not valid and commit the valid
one minus
remove-local-statics-7.c for which we can fill in enhancement request?
OK. Keep the original numbering or re-number them to fill up the holes
left by the deletions?
For cases like local-statics-7 your approach can be "saved" by adding simple
IPA analysis
to look for static vars that are used only by one function and keeping your DSE
code active
for them, so we can still get rid of this special case assignments during late
compilation.
I am however not quite convinced it is worth the effort - do you have some real
world
cases where it helps?
Um, the well-known benchmark. ;-)
I am rather thinking about cutting the passmanager queue once again after main
tree optimization and re-running IPA unreachable code removal after them. This
should help with rather common cases where we optimize out code as effect
of inlining.
This would basically mean running pass_all_optimizations from late IPA pass
and scheduling one extra fixup_cfg and perhaps DCE pass at begginig of
pass_all_optimizations.
Honza
-Sandra