On Wed, Jul 6, 2016 at 5:00 AM, Prathamesh Kulkarni
<prathamesh.kulka...@linaro.org> wrote:
> On 4 July 2016 at 13:51, Andrew Pinski <pins...@gmail.com> wrote:
>> On Mon, Jul 4, 2016 at 12:58 AM, Prathamesh Kulkarni
>> <prathamesh.kulka...@linaro.org> wrote:
>>> Hi,
>>> I have attached a "quick and dirty" prototype patch (var-partition-1.diff),
>>> that attempts to partition variables to reduce number of
>>> external references and to increase usage of section-anchors
>>> to CSE address computation of global variables.
>>>
>>> We could put a variable in a partition that has max references for it,
>>> however it doesn't lend itself directly to section anchor optimization.
>>> For instance if a partition has max references for variables 'a' and 'b',
>>> but no function in that partition references both 'a', and 'b' then AFAIU
>>> it doesn't make any difference from section anchors perspective to have them
>>> in same partition.
>>>
>>> The patch tries to assign a set of variables (>= 2)
>>> to a partition whose functions have maximum references for that set.
>>> Functions within the partition that reference the variables
>>> in the set can take advantage of section-anchors. Functions
>>> referencing the variables in the set outside the partition
>>> would need to load them as external references (using movw/movt),
>>> however since we are placing the set in partition that has maximal
>>> references for it, number of external references should be overall
>>> reduced.
>>>
>>> Partitioning is gated by -flto-var-partition and enabled
>>> only for arm and aarch64.
>>
>> Why only for arm and aarch64? Shouldn't it be enabled for all section
>> anchor targets?
> AFAIK the only targets supporting section anchors are arm, aarch64 and
> powerpc.
> I didn't enable it for ppc64 because I am not sure how much profitable
> it is for that target.
> Honza mentioned to me some time back that effect of partitioning on
> powerpc was nearly zero.

No, MIPS has section anchors enabled too. Plus MIPS will benefit the
same way as AARCH64 and ARM. PowerPC32 would too. I don't think it is
correct to enable it only for arm and aarch64.

Thanks,
Andrew Pinski

>
> Thanks,
> Prathamesh
>>
>> Thanks,
>> Andrew
>>
>>> As per previous discussion [1], I haven't
>>> touched function partitioning. Does this approach look ok
>>> especially regarding correctness ?
>>> So far, I have cross-tested patch on arm*-*-*, aarch64*-*-*.
>>>
>>> I haven't yet managed to benchmark the patch.
>>> As a cheap measurement, I tried to measure number of external
>>> references with and without patch by writing a small ipa pass
>>> which is run during ltrans and simply walks over varpool nodes
>>> and counts number of varpool_nodes for which
>>> DECL_EXTERNAL (vnode->decl) is true
>>> and vnode->definition is 0. Is that sufficient condition to determine
>>> if variable is externally defined ? I have attached the pass
>>> (count-external-refs.diff)
>>> and the comparison done with it for SPEC2000 [2]. The entries
>>> in "before" and "after" column contain summation of number of
>>> external refs (total_count) across all partitions before and after applying
>>> the patch. Does the comparison hold any merit ?
>>> I was wondering if we could use a better way for
>>> measuring statically the effects of variable partitioning ?
>>> I hope also to get done with benchmarking soon.
>>>
>>> I have not yet figured out how to integrate it with existing cost metrics
>>> for balanced partitioning, I am looking into that.
>>> I would be grateful for suggestions on the patch.
>>>
>>> [1] https://gcc.gnu.org/ml/gcc/2016-04/msg00090.html
>>>
>>> [2] SPEC2000 comparison:
>>> https://docs.google.com/spreadsheets/d/1xnszyw04ksoyBspmCVYesq6KARLw-PA2n3T4aoaKdYw/edit?usp=sharing
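
For readers following the thread, here is a rough toy model of the heuristic
described in the quoted message. It is not the attached patch and uses no GCC
internals. It assumes a partition is just a map from function names to the
variables each function references, scores a partition by how many of its
functions reference every variable in the set, and counts an "external
reference" as a variable used in a partition but placed in another one. All
names (best_partition_for, count_external_refs, p0, p1) are hypothetical.

```python
from typing import Dict, FrozenSet, Set

# Hypothetical toy model: partition name -> {function name -> variables it references}.
Partitions = Dict[str, Dict[str, Set[str]]]


def best_partition_for(var_set: FrozenSet[str], partitions: Partitions) -> str:
    """Return the partition with the most functions referencing every variable in var_set."""
    def score(name: str) -> int:
        return sum(1 for refs in partitions[name].values() if var_set <= refs)
    # Iterating in sorted order makes ties resolve to the first name alphabetically.
    return max(sorted(partitions), key=score)


def count_external_refs(placement: Dict[str, str], partitions: Partitions) -> int:
    """Sum, over partitions, the distinct variables referenced there but placed elsewhere."""
    total = 0
    for name, funcs in partitions.items():
        referenced = set().union(*funcs.values())
        total += sum(1 for var in referenced if placement[var] != name)
    return total


# Two functions in p0 use 'a' and 'b' together; p1 only uses them separately.
partitions: Partitions = {
    "p0": {"f": {"a", "b"}, "g": {"a", "b"}},
    "p1": {"h": {"a"}, "k": {"b"}},
}
home = best_partition_for(frozenset({"a", "b"}), partitions)
placement = {"a": home, "b": home}
print(home, count_external_refs(placement, partitions))
```

In this model the pair {a, b} lands in p0, where f and g can share one anchor
for both variables, while h and k in p1 each see one external variable. The
per-partition count only mirrors the static measurement described above in
spirit. It says nothing about run-time cost.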