------- Comment #2 from rguenth at gcc dot gnu dot org 2009-12-17 14:26 -------
It is DOM which threads over the check in bb 12:

<bb 11>:
  __y_90 = __y_108;
  D.2866_92 = __y_108->_M_value_field.first;
  if (D.2866_92 > 0)
    goto <bb 20>;
  else
    goto <bb 12>;

<bb 12>:
  # SR.42_94 = PHI <SR.42_88(11)>
  if (SR.42_94 != &m._M_t._M_impl._M_header)
    goto <bb 13>;
  else
    goto <bb 19>;

<bb 13>:
  # SR.42_78 = PHI <SR.42_94(12), SR.42_87(20)>
  D.2308_34 = (const struct _Rb_tree_node *) SR.42_78;
  D.2315_10 = &D.2308_34->_M_value_field.second;
  __comp_ctor (&s, D.2315_10);

changing bb 11 to

<bb 11>:
  __y_90 = __y_108;
  D.2837_92 = __y_108->_M_value_field.first;
  if (D.2837_92 > 0)
    goto <bb 19>;
  else
    goto <bb 13>;

which looks correct but one wonders why it's only done with -fipa-cp-clone.
This change is what in the end triggers the bug, so -fno-tree-dominator-opts
fixes it as well.

The DOM thing is a missed-optimization in the -flto case.

In the LTO case we thread

<bb 10>:
  # __y_108 = PHI <__y_95(9), __y_75(5)>
  SR.42_88 = (struct _Rb_tree_node_base *) __y_108;
  if (SR.42_88 == &m._M_t._M_impl._M_header)
    goto <bb 12>;
  else
    goto <bb 11>;

<bb 11>:
  __y_90 = __y_108;
  D.2837_92 = __y_90->_M_value_field.first;
  if (D.2837_92 > 0)
    goto <bb 12>;
  else
    goto <bb 13>;

<bb 12>:

<bb 13>:
  # SR.42_94 = PHI <SR.42_88(11), &m._M_t._M_impl._M_header(12)>
  if (SR.42_94 != &m._M_t._M_impl._M_header)
    goto <bb 14>;
  else
    goto <bb 20>;

<bb 14>:
  D.2308_34 = (const struct _Rb_tree_node *) SR.42_94;
  D.2315_10 = &D.2308_34->_M_value_field.second;
  __comp_ctor (&s, D.2315_10);
...
<bb 20>:
  return;

}

to

<bb 10>:
  # __y_108 = PHI <__y_95(9), __y_75(5)>
  SR.42_88 = (struct _Rb_tree_node_base *) __y_108;
  if (SR.42_88 == &m._M_t._M_impl._M_header)
    goto <bb 20>;
  else
    goto <bb 11>;

<bb 11>:
  __y_90 = __y_108;
  D.2866_92 = __y_108->_M_value_field.first;
  if (D.2866_92 > 0)
    goto <bb 20>;
  else
    goto <bb 12>;

<bb 12>:
  # SR.42_94 = PHI <SR.42_88(11)>
  if (SR.42_94 != &m._M_t._M_impl._M_header)
    goto <bb 13>;
  else
    goto <bb 19>;

<bb 13>:
  # SR.42_78 = PHI <SR.42_94(12), SR.42_87(20)>
  D.2308_34 = (const struct _Rb_tree_node *) SR.42_78;
  D.2315_10 = &D.2308_34->_M_value_field.second;
  __comp_ctor (&s, D.2315_10);

which is 1) incomplete as noted above and 2) wrong, as the target BB 20
no longer is BB 20 but BB 19 ...

...
<bb 19>:
  return;

<bb 20>:
  # SR.42_87 = PHI <&m._M_t._M_impl._M_header(11), &m._M_t._M_impl._M_header(10)>
  goto <bb 13>;

}

thus somehow jump threading is messed up.

Good threading:

  Jump threading proved probability of edge 13->20 too small (it is 3237, should be 10000).
  Threaded jump 11 --> 13 to 13
  Threaded jump 12 --> 13 to 22
  Removing basic block 12
  Merging blocks 13 and 14
  Removing basic block 21

Bad threading:

  Threaded jump 12 --> 13 to 22
  Removing basic block 12
  Removing basic block 21

I can't see how this is not a latent problem with jump-threading independent
of LTO.

-- 

rguenth at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
  BugsThisDependsOn|39604                       |
           Keywords|                            |missed-optimization

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42401
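
For readers less familiar with the pass, here is a rough sketch of the reasoning behind "Threaded jump 12 --> 13": when a PHI argument on an incoming edge is the very constant the block compares against, the branch outcome is already known on that edge, so the edge can skip the block. This is only a toy model. The `Block` struct and the `thread_jumps` and `example_bb13` functions are made up for illustration and are not GCC's data structures or its DOM implementation. It also ignores the second source of knowledge used for the 11 --> 13 edge, namely the dominating condition in bb 10.

```cpp
#include <cassert>
#include <map>
#include <optional>
#include <string>

// Hypothetical miniature of one basic block ending in "if (PHI != C)".
// This is NOT how GCC represents blocks; it only mirrors the dump's shape.
struct Block {
    // PHI argument per predecessor; nullopt means "not a known constant".
    std::map<int, std::optional<std::string>> phi_args;
    std::string compared_to;  // constant C the PHI result is tested against
    int if_not_equal = -1;    // successor taken when PHI != C
    int if_equal = -1;        // successor taken when PHI == C
};

// For each incoming edge whose PHI argument is exactly C, the "!=" test is
// known to be false, so that edge can go straight to the if_equal successor.
// Returns a map from predecessor block to its new destination.
std::map<int, int> thread_jumps(const Block& b) {
    std::map<int, int> redirected;
    for (const auto& entry : b.phi_args) {
        const std::optional<std::string>& arg = entry.second;
        if (arg && *arg == b.compared_to) {
            redirected[entry.first] = b.if_equal;
        }
    }
    return redirected;
}

// bb 13 from the LTO dump before threading:
//   # SR.42_94 = PHI <SR.42_88(11), &m._M_t._M_impl._M_header(12)>
//   if (SR.42_94 != &m._M_t._M_impl._M_header) goto <bb 14>; else goto <bb 20>;
Block example_bb13() {
    Block b;
    b.phi_args[11] = std::nullopt;  // SR.42_88 is not a constant
    b.phi_args[12] = std::string("&m._M_t._M_impl._M_header");
    b.compared_to = "&m._M_t._M_impl._M_header";
    b.if_not_equal = 14;
    b.if_equal = 20;
    return b;
}
```

In this model only the edge from bb 12 is redirected, and it goes to the return block. A real threader also has to renumber or duplicate blocks consistently afterwards, and that is the step that appears to go wrong in the bad case above.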