On 7/9/19 11:56 AM, Jan Hubicka wrote: >> Hi. >> >> I'm suggesting to restrict LOOP_ALIGN to only loop headers. That are the >> basic blocks for which it makes the biggest sense. I quite some binary >> size reductions on SPEC2006 and SPEC2017. Speed numbers are also slightly >> positive. >> >> Patch can bootstrap on x86_64-linux-gnu and survives regression tests. >> >> Ready to be installed? > The original idea of distinction between jump alignment and loop > alignment was that they have two basic meanings: > 1) jump alignment is there to avoid jumping just to the end of decode > window (if the window is aligned) so CPU will get stuck after reaching > the jump and also to possibly reduce code cache polution by populating > by code that is not executed > 2) loop alignment aims to fit loop in as few cache windows as possible > > Now if you have loop laid in a way that header of loop is not first > basic block, 2) IMO still apply. I.e. > > jump loop > :loopback > loop body > :loop > if cond jump to loopback > > So dropping loop alignment for those does not seem to make much sense > from high level. We may want to have differnt alignment for loops > starting by header and loops starting in the middle,
That's quite complicated condition, I would not introduce a new alignment. > but I still liked > more your patch which did bundles for loops. The patch caused regression for quite some benchmarks and has it's own problems (need of a recent GAS, not doing a bundle for bundles that can't fit in a single bundle window). For that reasons, I decided to not work on it any longer. Martin > > modern x86 chips are not very good testing targets on it. I guess > generic changes to alignment needs to be tested on other chips too. > > Honza >> Thanks, >> Martin >> >> gcc/ChangeLog: >> >> 2019-07-09 Martin Liska <mli...@suse.cz> >> >> * final.c (compute_alignments): Apply the LOOP_ALIGN only >> to basic blocks that all loop headers. >> --- >> gcc/final.c | 1 + >> 1 file changed, 1 insertion(+) >> >> > >> diff --git a/gcc/final.c b/gcc/final.c >> index fefc4874b24..ce2678da988 100644 >> --- a/gcc/final.c >> +++ b/gcc/final.c >> @@ -739,6 +739,7 @@ compute_alignments (void) >> if (has_fallthru >> && !(single_succ_p (bb) >> && single_succ (bb) == EXIT_BLOCK_PTR_FOR_FN (cfun)) >> + && bb->loop_father->header == bb >> && optimize_bb_for_speed_p (bb) >> && branch_count + fallthru_count > count_threshold >> && (branch_count >> > > >