On Fri, Jun 23, 2017 at 12:26 PM, Bin.Cheng <amker.ch...@gmail.com> wrote:
> On Tue, Jun 20, 2017 at 12:36 PM, Richard Biener
> <richard.guent...@gmail.com> wrote:
>> On Tue, Jun 20, 2017 at 11:20 AM, Bin.Cheng <amker.ch...@gmail.com> wrote:
>>> On Fri, Jun 16, 2017 at 6:15 PM, Bin.Cheng <amker.ch...@gmail.com> wrote:
>>>> On Fri, Jun 16, 2017 at 11:21 AM, Richard Biener
>>>> <richard.guent...@gmail.com> wrote:
>>>>> On Mon, Jun 12, 2017 at 7:03 PM, Bin Cheng <bin.ch...@arm.com> wrote:
>>>>>> Hi,
>>>>>> For now, loop distribution handles variables used outside of the loop
>>>>>> as reductions.  This is inaccurate because all partitions contain the
>>>>>> statements defining induction vars.
>>>>>
>>>>> But final induction values are usually not used outside of the loop...
>>>> This is actually for an induction variable which is used outside of the
>>>> loop.
>>>>>
>>>>> What is missing is loop distribution trying to change partition order.
>>>>> In fact we somehow assume we can move a reduction across a detected
>>>>> builtin (I don't remember if we ever check for validity of that...).
>>>> Hmm, I am not sure when we can't.  If there is any dependence between
>>>> builtin/reduction partitions, it should be captured by RDG or PG,
>>>> otherwise the partitions are independent and can be freely ordered as
>>>> long as the reduction partition is scheduled last?
>>>>>
>>>>>> Ideally we should factor out scev-propagation as a standalone interface
>>>>>> which can be called when necessary.  Before that, this patch simply
>>>>>> works around the reduction issue by checking if the statement belongs
>>>>>> to all partitions.  If yes, the reduction must be computed in the last
>>>>>> partition no matter how the loop is distributed.
>>>>>> Bootstrap and test on x86_64 and AArch64.  Is it OK?
>>>>>
>>>>> stmt_in_all_partitions is not kept up-to-date during partition merging
>>>>> and if merging makes the reduction partition(s) pass the
>>>>> stmt_in_all_partitions test your simple workaround doesn't work ...
>>>> I think it doesn't matter because:
>>>>   A) it's really a workaround for induction variables.  In general,
>>>> induction variables are included by all partitions.
>>>>   B) After classifying partitions, we immediately fuse all reduction
>>>> partitions.  More stmt_in_all_partitions means we are fusing a
>>>> non-reduction partition with a reduction partition, so the newly
>>>> generated (stmt_in_all_partitions) are actually not reduction
>>>> statements.  The workaround won't work anyway even if the bitmap is
>>>> maintained.
>>>>>
>>>>> As written it's a valid optimization but can you please note its
>>>>> limitation in some comment please?
>>>> Yeah, I will add a comment explaining it.
>>> Comment added in the new version of the patch.  It also computes the
>>> bitmap outside now, is it OK?
>>
>> Ok.  Can you add a testcase for this as well please?  I think the
>> series up to this is now fully reviewed, I deferred 1/n (the new IFN)
>> to the last one containing the runtime versioning.  Can you re-post
>> that (you can merge with the IFN patch) to apply after the series has
>> been applied up to this?
> Test case added.

Ok.

> Thanks,
> bin
> 2017-06-20 Bin Cheng <bin.ch...@arm.com>
>
>     * tree-loop-distribution.c (classify_partition): New parameter and
>     better handle reduction statement.
>     (rdg_build_partitions): Revise comment.
>     (distribute_loop): Compute statements in all partitions and pass it
>     to classify_partition.
>
> gcc/testsuite/ChangeLog
> 2017-06-20 Bin Cheng <bin.ch...@arm.com>
>
>     * gcc.dg/tree-ssa/ldist-26.c: New test.
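
For readers following the thread, the case under discussion can be pictured with a loop like the one below. This is only a minimal sketch under stated assumptions: it is not the contents of ldist-26.c, and the names fill_and_add, a, b, c and N are hypothetical. The loop has two independent stores (the zeroing store is the kind of statement that could form a builtin partition), and the induction variable i is used after the loop, which is the use the thread describes as being treated as a reduction.

```c
#include <stddef.h>

#define N 64

/* Hypothetical arrays for illustration only.  */
int a[N], b[N], c[N];

/* Hypothetical example: two independent stores in one loop, with the
   induction variable used after the loop exits.  The caller must pass
   n <= N.  */
size_t
fill_and_add (size_t n)
{
  size_t i;
  for (i = 0; i < n; i++)
    {
      a[i] = 0;          /* Zeroing store: candidate builtin partition.  */
      c[i] = b[i] + 1;   /* Independent store: a second partition.  */
    }
  /* The statement defining i belongs to every partition, and its final
     value is used outside of the loop.  */
  return i;
}
```

Whichever way such a loop is split, the returned value must still equal n, so the final value of i has to come from whichever partition is scheduled last.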