https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120614
--- Comment #9 from Jan Hubicka <hubicka at ucw dot cz> ---
> > as mentioned by Andrew, it is important to clone and also resolve indirect
> > calls. Those auto-FDO 0 may prevent it from happening.
> > It is easy to see in perf profile if the functions are cloned.
> >
> > My overall plan is to combine autofdo with guessed profile, when autofdo
> > samples are missing (i.e. we have 0 at input). There is no 100% correct way
> > to do so, that is why I am trying to first get benchmarking set up and kind
> > of working only then start tampering with the profile generation.
>
> Thanks for the information. I tried re-creating the same configuration and the
> results unfortunately is the same. I will look at the dumps further.

I will check whether I can reproduce that ICE with to_sreal. It means that the
counts are not compatible, but it is not clear from the backtrace why.

Note that I think the main problem is that the code producing the BB profile
assigns auto-FDO 0 counts to all BBs to which it cannot successfully propagate,
which means that they are not optimized for performance later. For example, it
tends to prevent unrolling. If a loop was unrolled in the train run, we will
not have enough info to determine its iteration count, and we may leave count 0
in the header of the loop.

We need to fill in the data from the static profile and indicate that in
count->quality () (i.e. GUESSED versus AFDO).

I looked into the propagation algorithm yesterday and made it propagate even
if the info is not complete. This is not quite a correct solution, but it
mitigates the problem and reduces the performance gap from 4% to 2% in my SPEC
runs.

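To make the "propagate even if the info is not complete" idea concrete, here is a minimal sketch on a toy CFG. It is not GCC code: ToyEdge, ToyBlock, ToyCfg, propagate_once, propagate and make_diamond are hypothetical names, the sample counts (100 and 60) are made up, and plain integers stand in for profile_count. The limit of 10 rounds is taken from the dump message in the patch. Unlike the patch, the sketch also reports a change when it zeroes several unknown edges.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Toy stand-ins for GCC's basic_block and edge. Every name here is
// hypothetical; nothing below is GCC API.
struct ToyEdge {
  int src, dest;
  std::uint64_t count;
  bool annotated;  // false = no count assigned to this edge yet
};

struct ToyBlock {
  std::uint64_t count = 0;
  bool annotated = false;
  std::vector<int> succs, preds;  // indices into ToyCfg::edges
};

struct ToyCfg {
  std::vector<ToyBlock> blocks;
  std::vector<ToyEdge> edges;

  void add_edge(int src, int dest) {
    assert(src >= 0 && dest >= 0);
    assert(src < static_cast<int>(blocks.size()));
    assert(dest < static_cast<int>(blocks.size()));
    int id = static_cast<int>(edges.size());
    edges.push_back(ToyEdge{src, dest, 0, false});
    blocks[src].succs.push_back(id);
    blocks[dest].preds.push_back(id);
  }
};

// One pass over all blocks in one direction (successor or predecessor
// edges). Returns true if any count or annotation changed.
bool propagate_once(ToyCfg &g, bool is_succ) {
  bool changed = false;
  for (ToyBlock &bb : g.blocks) {
    const std::vector<int> &list = is_succ ? bb.succs : bb.preds;
    std::uint64_t known = 0;
    int unknown = 0, last_unknown = -1;
    for (int id : list) {
      if (g.edges[id].annotated) known += g.edges[id].count;
      else { ++unknown; last_unknown = id; }
    }
    // Raise the block count to the sum of the known edges, even while
    // some edges are still unknown (the "incomplete info" step).
    if (known > bb.count) { bb.count = known; changed = true; }
    if (unknown == 0 && !list.empty() && !bb.annotated) {
      // All edges known: the block count can be trusted.
      bb.annotated = true;
      changed = true;
    } else if (unknown == 1 && bb.annotated) {
      // One edge missing: it gets the remainder (bb.count >= known here).
      g.edges[last_unknown].count = bb.count - known;
      g.edges[last_unknown].annotated = true;
      changed = true;
    } else if (unknown > 1 && bb.annotated && known >= bb.count) {
      // Known edges already account for the whole block: the rest get 0.
      for (int id : list)
        if (!g.edges[id].annotated) {
          g.edges[id].count = 0;
          g.edges[id].annotated = true;
        }
      changed = true;
    }
  }
  return changed;
}

// Alternate directions until nothing changes, at most 10 rounds.
void propagate(ToyCfg &g) {
  bool changed = true;
  for (int i = 0; changed && i < 10; ++i) {
    changed = false;
    if (propagate_once(g, true)) changed = true;
    if (propagate_once(g, false)) changed = true;
  }
}

// Diamond 0 -> {1, 2} -> 3 where only blocks 0 and 1 have samples.
ToyCfg make_diamond() {
  ToyCfg g;
  g.blocks.resize(4);
  g.add_edge(0, 1);
  g.add_edge(0, 2);
  g.add_edge(1, 3);
  g.add_edge(2, 3);
  g.blocks[0].count = 100; g.blocks[0].annotated = true;
  g.blocks[1].count = 60;  g.blocks[1].annotated = true;
  return g;
}
```

Calling propagate on the graph from make_diamond lets the join block pick up a partial count from its one known incoming edge before the other edge is resolved, instead of staying at 0 until everything around it is known.
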
diff --git a/gcc/auto-profile.cc b/gcc/auto-profile.cc
index e12b3048f20..9ebd1a203fe 100644
--- a/gcc/auto-profile.cc
+++ b/gcc/auto-profile.cc
@@ -1307,21 +1307,47 @@ afdo_propagate_edge (bool is_succ, bb_set *annotated_bb)
             total_known_count += AFDO_EINFO (e)->get_count ();
           num_edge++;
         }
+      if (dump_file)
+        {
+          fprintf (dump_file, "bb %i annotated %i dir %s edges %i, "
+                   "unknown edges %i, known count ",
+                   bb->index, is_bb_annotated (bb, *annotated_bb),
+                   is_succ ? "succesors" : "predecessors", num_edge, num_unknown_edge);
+          total_known_count.dump (dump_file);
+          fprintf (dump_file, " bb count ");
+          bb->count.dump (dump_file);
+          fprintf (dump_file, "\n");
+        }
+      if (total_known_count > bb->count)
+        {
+          if (dump_file)
+            {
+              fprintf (dump_file, " Updating count of bb %i ", bb->index);
+              bb->count.dump (dump_file);
+              fprintf (dump_file, " -> ");
+              total_known_count.dump (dump_file);
+              fprintf (dump_file, "\n");
+            }
+          bb->count = total_known_count;
+          changed = true;
+        }
       /* Be careful not to annotate block with no successor in special cases. */
-      if (num_unknown_edge == 0 && total_known_count > bb->count)
+      if (num_unknown_edge == 0 && num_edge
+          && !is_bb_annotated (bb, *annotated_bb))
         {
-          bb->count = total_known_count;
-          if (!is_bb_annotated (bb, *annotated_bb))
-            set_bb_annotated (bb, annotated_bb);
+          if (dump_file)
+            fprintf (dump_file, " Setting bb %i annotated\n", bb->index);
+          set_bb_annotated (bb, annotated_bb);
           changed = true;
         }
       else if (num_unknown_edge == 1 && is_bb_annotated (bb, *annotated_bb))
         {
           if (bb->count > total_known_count)
             {
-              profile_count new_count = bb->count - total_known_count;
-              AFDO_EINFO(unknown_edge)->set_count(new_count);
+              profile_count new_count = bb->count - total_known_count;
+              AFDO_EINFO(unknown_edge)->set_count(new_count);
+#if 0
               if (num_edge == 1)
                 {
                   basic_block succ_or_pred_bb = is_succ ?
                     unknown_edge->dest : unknown_edge->src;
@@ -1332,12 +1358,41 @@ afdo_propagate_edge (bool is_succ, bb_set *annotated_bb)
                       set_bb_annotated (succ_or_pred_bb, annotated_bb);
                     }
                 }
+#endif
             }
           else
             AFDO_EINFO (unknown_edge)->set_count (profile_count::zero().afdo ());
+          if (dump_file)
+            {
+              fprintf (dump_file, " Annotated edge %i->%i with count ",
+                       unknown_edge->src->index, unknown_edge->dest->index);
+              AFDO_EINFO (unknown_edge)->get_count ().dump (dump_file);
+              fprintf (dump_file, "\n");
+            }
           AFDO_EINFO (unknown_edge)->set_annotated ();
           changed = true;
         }
+      else if (total_known_count >= bb->count
+               && num_unknown_edge > 1
+               && is_bb_annotated (bb, *annotated_bb))
+        {
+          FOR_EACH_EDGE (e, ei, is_succ ? bb->succs : bb->preds)
+            {
+              gcc_assert (AFDO_EINFO (e) != NULL);
+              if (! AFDO_EINFO (e)->is_annotated ())
+                {
+                  AFDO_EINFO(e)->set_count (profile_count::zero().afdo ());
+                  AFDO_EINFO (e)->set_annotated ();
+                  if (dump_file)
+                    {
+                      fprintf (dump_file, " Annotated edge %i->%i with count ",
+                               e->src->index, e->dest->index);
+                      AFDO_EINFO (unknown_edge)->get_count ().dump (dump_file);
+                      fprintf (dump_file, "\n");
+                    }
+                }
+            }
+        }
     }
   return changed;
 }
@@ -1471,6 +1526,8 @@ afdo_propagate (bb_set *annotated_bb)
       changed = true;
       afdo_propagate_circuit (*annotated_bb);
     }
+  if (changed && dump_file)
+    fprintf (dump_file, "Limit of 10 iterations reached\n");
 }
 
 /* Propagate counts on control flow graph and calculate branch