https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79224
--- Comment #9 from Jan Hubicka <hubicka at gcc dot gnu.org> ---
Most of the regression caused by the inlining difference is fixed now, but the
solution is not ideal. According to Czerny we still have a quite noticeable
regression
https://gcc.opensuse.org/c++bench-czerny/c-ray/
and the code size is bigger. Moreover, we get noticeable code size growth in
wave (808569->814865, 0.8%) and Botan (1780199->1789329).

The other change is

160810 0.33 30089 13360 2281
160811 0.27 30089 12656 2560

Perhaps this patch might be a suspect, but I have no idea:

2016-08-10  Yuri Rumyantsev  <ysrum...@gmail.com>

        PR tree-optimization/71734
        * tree-ssa-loop-im.c (ref_indep_loop_p): Add new argument REF_LOOP,
        invoke ref_indep_loop_p_1.
        (outermost_indep_loop): Pass LOOP argument where REF was defined
        to ref_indep_loop_p.
        (ref_indep_loop_p_1): Fix commentary, add argument REF_LOOP,
        combine it with ref_indep_loop_p_2, update SAFELEN if only REF
        is inside LOOP, do not cache dependence value for loops with
        non-zero SAFELEN.
        (ref_indep_loop_p_2): Delete function.
        (can_sm_ref_p): Pass LOOP as additional argument to
        ref_indep_loop_p.

One more important issue I noticed is that the inline metric always compares
the estimated runtime of the offline copy with the runtime of the specialized
copy after inlining (with known constants and other context). This is OK for
size metrics, but not OK for speed. The offline copy is run in the same
context; in particular, if some code is guarded by a conditional that is false,
it is not executed and should not be accounted to the offline path. This skews
the inline metric toward inlining that eliminates large conditionals. I will
fix that in the next stage1, but I am not sure how much we can still do in the
current stage4.

One observation is that the overall runtime/size estimates after early opt have
improved quite a bit in the last two releases, during which I did not re-tune
the parameters. Perhaps it is time to tune down early inlining and
inline-insns-auto a bit again...
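
To make the skew in the speed metric described above concrete, here is a toy
model. It is not GCC's actual cost code: the struct, the function names and the
numbers are all hypothetical, and it ignores profile frequencies. It only shows
how charging the guarded code to the offline copy inflates the estimated
speedup when the guard is known to be false at the call site.

```c
/* Toy model only; hypothetical names and costs, not GCC internals.  */

struct callee_estimate
{
  int guard_time;    /* Cost of evaluating the conditional.  */
  int guarded_time;  /* Cost of the code behind the conditional.  */
  int rest_time;     /* Cost of the rest of the function body.  */
};

/* Offline copy costed without call-site context: the guarded code is
   charged even though the guard is false for this call.  */
static int
offline_time_no_context (const struct callee_estimate *e)
{
  return e->guard_time + e->guarded_time + e->rest_time;
}

/* Offline copy costed in the call-site context: the guard is still
   evaluated at run time, but the guarded code never executes.  */
static int
offline_time_in_context (const struct callee_estimate *e)
{
  return e->guard_time + e->rest_time;
}

/* Specialized inline copy: the guard folds to a constant, so both the
   test and the guarded code disappear.  */
static int
inlined_time (const struct callee_estimate *e)
{
  return e->rest_time;
}

/* Speedup as seen by a metric that ignores context for the offline copy.  */
static int
estimated_speedup_no_context (const struct callee_estimate *e)
{
  return offline_time_no_context (e) - inlined_time (e);
}

/* Speedup when both copies are costed in the same context.  */
static int
estimated_speedup_in_context (const struct callee_estimate *e)
{
  return offline_time_in_context (e) - inlined_time (e);
}
```

With assumed costs of guard 1, guarded code 100 and remaining body 10, the
context-free comparison reports a speedup of 101, while the in-context
comparison reports 1: only the conditional test itself is saved by inlining.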