On Fri, Apr 20, 2012 at 2:07 PM, Razya Ladelsky <ra...@il.ibm.com> wrote: > Hi, > > This patch handles duplicating of the last iteration correctly. > The current code always duplicates the complete "static" iteration from > the entry to the latch, > and then sets the number of iterations according to the pattern of the > loop (according to whether the cond before the body, or the body before > the cond). > > The correct way to go about is to not assume anything about the control > flow of the loop, and duplicate the last iteration only from > entry to the block consisting of the cond, that is the real last dynamic > iteration that would be executed. > The number of iterations then needs no special care for each loop pattern. > This was actually Zdenek's original intent by duplicating the last > iteration. > > This code allows us to remove the do_while cond, and gets PR46886 resolved > (instead of the solution suggested in > http://gcc.gnu.org/ml/gcc-patches/2012-03/msg01642.html, > which handled the number of iterations according to the specific shape of > the loop) > > Bootstrap and testsuite pass successfully. > Testsuite with -ftree-parallelize-loops=4 shows only one regression which > is uninit-17.c test for warnings, which seems > unrelated to how the loop gets parallelized. > I also ran spec-2006, and it showed no regressions.
That looks definitely better. A few comments: gsi = gsi_after_labels (ex_bb); + exit = single_dom_exit (loop); exit gets initialized in three places now - are all of them needed? @@ -2187,10 +2155,9 @@ parallelize_loops (void) || loop_has_blocks_with_irreducible_flag (loop) || (loop_preheader_edge (loop)->src->flags & BB_IRREDUCIBLE_LOOP) /* FIXME: the check for vector phi nodes could be removed. */ - || loop_has_vector_phi_nodes (loop) + || loop_has_vector_phi_nodes (loop)) /* FIXME: transform_to_exit_first_loop does not handle not header-copied loops correctly - see PR46886. */ - || !do_while_loop_p (loop)) the comment should be removed, too. +bool bb_in_region_p (basic_block bb, basic_block* bbs, unsigned n_region) canonical formatting is bool bb_in_region_p (basic_block bb, basic_block* bbs, unsigned n_region) +{ + unsigned int n; + + for (n = 0; n < n_region; n++) + { + if (bb == bbs[n]) + return true; + } + return false; +} needs a function comment. Seems to be only used in + target= loop; + for (aloop = orig_loop->inner; aloop; aloop = aloop->next) + { + if (bb_in_region_p (aloop->header, region, n_region)) + { + cloop = duplicate_loop (aloop, target); + duplicate_subloops (aloop, cloop); + } + } still it's not static - but it has the same name as an inline function in sese.h with a different prototype. I suggest renaming it and making it static. Ok with these changes if you give Zdenek 24h to comment, too. Thanks, Richard. > 2012-04-20 Razya Ladelsky <ra...@il.ibm.com> > > PR tree-optimization/46886 > * tree-parloops.c (transform_to_exit_first_loop): Remove > setting of number of iterations according to the loop pattern. > Duplicate from entry to exit->src instead of loop->latch. > (pallelize_loops): Remove the condition preventing > do-while loops. > * tree-cfg.c (bool bb_in_region_p): New. > (gimple_duplicate_sese_tail): Adjust duplication of the > the subloops. > Adjust redirection of the duplicated iteration. > > > > O.K to commit? > Thanks, > Razya