http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58721
--- Comment #20 from Jakub Jelinek <jakub at gcc dot gnu.org> --- Unfortunately the patch regresses abi_check in libstdc++, libstdc++.so.6 now because of the patch exports _ZNSt12system_errorC1ESt10error_codeRKSs@@GLIBCXX_3.4.11 _ZNSt12system_errorC2ESt10error_codeRKSs@@GLIBCXX_3.4.11 symbols. This is caused by wastly different inlining decisions (and the decisions don't look like an improvement to me) on thread.cc. Debugging this, I'm seeing this is caused by Honza's predict.c changes. E.g. quite often we have tree_predict_by_opcode called on &__gthrw___pthread_key_create != 0. When we call expr_expected_value_1 on this, it returns a NE_EXPR (__gthrw___pthread_key_create is weakref), but with TREE_CONSTANT set on it. That is reasonable, the comparison will be either always true, or always false, but without explicit __builtin_expect from the user I actually don't see a reason why we should predict it as always true. So, I wonder if we shouldn't do on top of the #c19 patch: --- gcc/predict.c.jj 2014-03-14 13:16:15.000000000 +0100 +++ gcc/predict.c 2014-03-14 17:49:11.686411889 +0100 @@ -2066,7 +2066,7 @@ tree_predict_by_opcode (basic_block bb) visited = BITMAP_ALLOC (NULL); val = expr_expected_value_1 (boolean_type_node, op0, cmp, op1, visited, &predictor); BITMAP_FREE (visited); - if (val && TREE_CONSTANT (val)) + if (val && TREE_CODE (val) == INTEGER_CST) { if (predictor == PRED_BUILTIN_EXPECT) { That change seems to slightly improve the inlining decisions on thread.ii, e.g. _ZN9__gnu_cxxL27__exchange_and_add_dispatchEPii.part.1.constprop.11 seems to be inlined again, but _ZNSt12system_errorC1ESt10error_codeRKSs and _ZNSt12system_errorC2ESt10error_codeRKSs are still no longer inlined. Comparing to gcc with all the patches but all the predict.c #c19 and the above bits reverted, I see that (looking at -m32 -O2 ...) the _ZNSt12system_errorC1ESt10error_codeRKSs function (the other symbol is an alias to it) is 601 bytes long and _ZNSt6thread15_M_start_threadESt10shared_ptrINS_10_Impl_baseEE caller 406 bytes long with the predict.c changes and _ZNSt6thread15_M_start_threadESt10shared_ptrINS_10_Impl_baseEE is 889 bytes long with the predict.c changes reverted (the _ZNSt12system_errorC1ESt10error_codeRKSs function is inlined then). So, size wise it is a win to inline it, but not significant, and the function is in any case larger than the caller. Honza, your thoughts on this?