http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58721

--- Comment #20 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Unfortunately the patch regresses abi_check in libstdc++, libstdc++.so.6 now
because of the patch exports
_ZNSt12system_errorC1ESt10error_codeRKSs@@GLIBCXX_3.4.11
_ZNSt12system_errorC2ESt10error_codeRKSs@@GLIBCXX_3.4.11
symbols.
This is caused by wastly different inlining decisions (and the decisions don't
look like an improvement to me) on thread.cc.
Debugging this, I'm seeing this is caused by Honza's predict.c changes.
E.g. quite often we have tree_predict_by_opcode called on
&__gthrw___pthread_key_create != 0.  When we call expr_expected_value_1 on
this,
it returns a NE_EXPR (__gthrw___pthread_key_create is weakref), but with
TREE_CONSTANT set on it.  That is reasonable, the comparison will be either
always true, or always false, but without explicit __builtin_expect from the
user I actually don't see a reason why we should predict it as always true.

So, I wonder if we shouldn't do on top of the #c19 patch:
--- gcc/predict.c.jj    2014-03-14 13:16:15.000000000 +0100
+++ gcc/predict.c    2014-03-14 17:49:11.686411889 +0100
@@ -2066,7 +2066,7 @@ tree_predict_by_opcode (basic_block bb)
   visited = BITMAP_ALLOC (NULL);
   val = expr_expected_value_1 (boolean_type_node, op0, cmp, op1, visited,
&predictor);
   BITMAP_FREE (visited);
-  if (val && TREE_CONSTANT (val))
+  if (val && TREE_CODE (val) == INTEGER_CST)
     {
       if (predictor == PRED_BUILTIN_EXPECT)
     {
That change seems to slightly improve the inlining decisions on thread.ii, e.g.
_ZN9__gnu_cxxL27__exchange_and_add_dispatchEPii.part.1.constprop.11 seems to be
inlined again, but _ZNSt12system_errorC1ESt10error_codeRKSs and
_ZNSt12system_errorC2ESt10error_codeRKSs are still no longer inlined.
Comparing to gcc with all the patches but all the predict.c #c19 and the above
bits reverted, I see that (looking at -m32 -O2 ...) the
_ZNSt12system_errorC1ESt10error_codeRKSs function (the other symbol is an alias
to it) is 601 bytes long and
_ZNSt6thread15_M_start_threadESt10shared_ptrINS_10_Impl_baseEE caller
406 bytes long with the predict.c changes and
_ZNSt6thread15_M_start_threadESt10shared_ptrINS_10_Impl_baseEE is 889 bytes
long with the predict.c changes reverted (the
_ZNSt12system_errorC1ESt10error_codeRKSs function is inlined then).
So, size wise it is a win to inline it, but not significant, and the function
is in any case larger than the caller.

Honza, your thoughts on this?

Reply via email to