On 05/18/2014 02:59 PM, Jan Hubicka wrote:
Sandra,
This patch seems quite similar in purpose to the
remove_local_statics optimization that Mentor has proposed, although
the implementation is quite different.  Here is the last version of
our patch, prepared by Bernd Schmidt last year:

https://gcc.gnu.org/ml/gcc-patches/2013-06/msg00317.html

Thanks for pointer, I did not notice this patch!
The approach is indeed very different.  So the patch works on function basis
and cares only about local statics of functions that was not inlined?

Yes. I should probably mention here that we did the analysis and initial implementation of this optimization 7+ years ago against GCC 4.2, and in some cases we were being conservative in deciding the optimization was not valid because the information required for more detailed analysis wasn't being collected in the right place back then, etc.

The failing tests are remove-local-statics-{4,5,7,12,14b}.c.

+/* Verify that we don't eliminate a global static variable.  */
+
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+/* { dg-final { scan-assembler "global_static" } } */
+
+static int global_static;
+
+int
+test1 (int x)
+{
+  global_static = x;
+
+  return global_static + x;
+}

here test1 optimizes into

   global_static=x;
   return x+x;

this makes global_static write only and thus it is correctly eliminated.
So I think this testcase is bogus.

Yes, I agree that this one was for a restriction of our implementation approach.

+++ b/gcc/testsuite/gcc.dg/remove-local-statics-5.c
@@ -0,0 +1,24 @@
+/* Verify that we do not eliminate a static local variable whose uses
+   are dominated by a def when the function calls setjmp.  */
+
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+/* { dg-final { scan-assembler "thestatic" } } */
+
+#include <setjmp.h>
+
+int
+foo (int x)
+{
+  static int thestatic;
+  int retval;
+  jmp_buf env;
+
+  thestatic = x;
+
+  retval = thestatic + x;
+
+  setjmp (env);
+
+  return retval;
+}

I belive this is similar case.  I do not see setjmp changing anything here, 
since
local optimizers turns retval = x+x;
What it was intended to test?

Hmmmm, I'm guessing this was some concern about invalid code motion around a setjmp. Our original analysis document lists "F does not call setjmp" as a requirement for the optimization, so this was probably a case where we were being excessively conservative.

--- /dev/null
+++ b/gcc/testsuite/gcc.dg/remove-local-statics-7.c
@@ -0,0 +1,19 @@
+/* Verify that we eliminate a static local variable where it is defined
+   along all paths leading to a use.  */
+
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+/* { dg-final { scan-assembler-not "thestatic" } } */
+
+int
+test1 (int x)
+{
+  static int thestatic;
+
+  if (x < 0)
+    thestatic = x;
+  else
+    thestatic = -x;
+
+  return thestatic + x;
+}

Here we get after early optimizations:

int
test1 (int x)
{
   static int thestatic;
   int thestatic.0_5;
   int thestatic.1_7;
   int _8;

   <bb 2>:
   if (x_2(D) < 0)
     goto <bb 3>;
   else
     goto <bb 4>;

   <bb 3>:
   thestatic = x_2(D);
   goto <bb 5>;

   <bb 4>:
   thestatic.0_5 = -x_2(D);
   thestatic = thestatic.0_5;

   <bb 5>:
   thestatic.1_7 = thestatic;
   _8 = thestatic.1_7 + x_2(D);
   return _8;

}

and thus we still have bogus read from thestatic.  Because my analysis works at 
IPA level,
we won't benefit from fact that dom2 eventually cleans it up as:
int
test1 (int x)
{
   static int thestatic;
   int thestatic.0_5;
   int thestatic.1_7;
   int _8;
   int prephitmp_10;

   <bb 2>:
   if (x_2(D) < 0)
     goto <bb 3>;
   else
     goto <bb 4>;

   <bb 3>:
   thestatic = x_2(D);
   goto <bb 5>;

   <bb 4>:
   thestatic.0_5 = -x_2(D);
   thestatic = thestatic.0_5;

   <bb 5>:
   # prephitmp_10 = PHI <x_2(D)(3), thestatic.0_5(4)>
   thestatic.1_7 = prephitmp_10;
   _8 = thestatic.1_7 + x_2(D);
   return _8;

}

Richi, is there a way to teach early FRE to get this transformation?
I see it is a partial redundancy problem...

Hmmmm, bummer that we don't get this one for free.  :-(

+/* Verify that we do not eliminate a static variable when it is declared
+   in a function that has nested functions.  */
+
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+/* { dg-final { scan-assembler "thestatic" } } */
+
+int test1 (int x)
+{
+  static int thestatic;
+
+  int nested_test1 (int x)
+  {
+    return x + thestatic;
+  }
+
+  thestatic = x;
+
+  return thestatic + x + nested_test1 (x);
+}

Here we work hard enough to optimize test1 as:
int
test1 (int x)
{
   static int thestatic;
   int _4;
   int _5;

   <bb 2>:
   thestatic = x_2(D);
   _4 = x_2(D) + x_2(D);
   _5 = _4 + _4;
   return _5;

}

thus inlining nested_test1 during early optimization. This makes the removal 
valid.

Yes. This is one we had to punt on due to the one-function-at-a-time approach.

+/* Verify that we do not eliminate a static local variable if the function
+   containing it is inlined.  */
+
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+/* { dg-final { scan-assembler "thestatic" } } */
+
+int
+test2 (int x)
+{
+  if (x < 0)
+    return 0;
+  else
+    return test1 (x - 1);
+}
+
+inline int
+test1 (int x)
+{
+  static int thestatic;
+  int y;
+
+  thestatic = x;
+
+  y = test2 (thestatic - 1);
+
+  return y + x;
+}

Here thestatic becomes write only during early optimization, so again we can 
correctly eliminate it.

OK.

Sandra,
do you think you can drop the testcases that are not valid and commit the valid 
one minus
remove-local-statics-7.c for which we can fill in enhancement request?

OK. Keep the original numbering or re-number them to fill up the holes left by the deletions?

For cases like local-statics-7 your approach can be "saved" by adding simple 
IPA analysis
to look for static vars that are used only by one function and keeping your DSE 
code active
for them, so we can still get rid of this special case assignments during late 
compilation.
I am however not quite convinced it is worth the effort - do you have some real 
world
cases where it helps?

Um, the well-known benchmark.  ;-)

I am rather thinking about cutting the passmanager queue once again after main
tree optimization and re-running IPA unreachable code removal after them. This
should help with rather common cases where we optimize out code as effect
of inlining.

This would basically mean running pass_all_optimizations from late IPA pass
and scheduling one extra fixup_cfg and perhaps DCE pass at begginig of
pass_all_optimizations.

Honza


-Sandra


Reply via email to