Re: [Patch, i386] Limit unroll factor for certain loops on Corei7

2012-03-30 Thread Teresa Johnson
Pulling this one back as I have a better solution, patch coming shortly. Thanks, Teresa On Fri, Mar 16, 2012 at 3:33 PM, Teresa Johnson wrote: > > Ping - now that stage 1 is open, could someone review? > > Thanks, > Teresa > > On Sun, Dec 4, 2011 at 10:26 PM, Teresa Johnson wrote: > > Latest pa

Re: [Patch, i386] Limit unroll factor for certain loops on Corei7

2012-03-16 Thread Teresa Johnson
Ping - now that stage 1 is open, could someone review? Thanks, Teresa On Sun, Dec 4, 2011 at 10:26 PM, Teresa Johnson wrote: > Latest patch which improves the efficiency as described below is > included here. Bootstrapped and checked again with > x86_64-unknown-linux-gnu. Could someone review? >

Re: [Patch, i386] Limit unroll factor for certain loops on Corei7

2011-12-09 Thread Xinliang David Li
The patch is good for Google branches for now while waiting for upstream review. David On Sun, Dec 4, 2011 at 10:26 PM, Teresa Johnson wrote: > Latest patch which improves the efficiency as described below is > included here. Bootstrapped and checked again with > x86_64-unknown-linux-gnu. Could s

Re: [Patch, i386] Limit unroll factor for certain loops on Corei7

2011-12-04 Thread Teresa Johnson
Latest patch which improves the efficiency as described below is included here. Bootstrapped and checked again with x86_64-unknown-linux-gnu. Could someone review? Thanks, Teresa 2011-12-04 Teresa Johnson * loop-unroll.c (decide_unroll_constant_iterations): Call loop unroll tar

Re: [Patch, i386] Limit unroll factor for certain loops on Corei7

2011-12-04 Thread Teresa Johnson
On Fri, Dec 2, 2011 at 11:59 AM, Xinliang David Li wrote: > ; >> >> +/* Determine whether LOOP contains floating-point computation. */ >> +bool >> +loop_has_FP_comp(struct loop *loop) >> +{ >> +  rtx set, dest; > > This probably should be extended to detect other long latency > operations in the f

Re: [Patch, i386] Limit unroll factor for certain loops on Corei7

2011-12-02 Thread Teresa Johnson
On Fri, Dec 2, 2011 at 11:36 AM, Andi Kleen wrote: > Teresa Johnson writes: > > Interesting optimization. I would be concerned a little bit > about compile time, does it make a measurable difference? I haven't measured compile time explicitly, but I don't think it should, especially after I address yo

Re: [Patch, i386] Limit unroll factor for certain loops on Corei7

2011-12-02 Thread Xinliang David Li
On Fri, Dec 2, 2011 at 11:36 AM, Andi Kleen wrote: > Teresa Johnson writes: > > Interesting optimization. I would be concerned a little bit > about compile time, does it make a measurable difference? > >> The attached patch detects loops containing instructions that tend to >> incur high LCP (loo

Re: [Patch, i386] Limit unroll factor for certain loops on Corei7

2011-12-02 Thread Xinliang David Li
; > > +/* Determine whether LOOP contains floating-point computation. */ > +bool > +loop_has_FP_comp(struct loop *loop) > +{ > +  rtx set, dest; This probably should be extended to detect other long latency operations in the future. > + > +  if (ix86_tune != PROCESSOR_COREI7_64 && > +      ix86_

Re: [Patch, i386] Limit unroll factor for certain loops on Corei7

2011-12-02 Thread Andi Kleen
Teresa Johnson writes: Interesting optimization. I would be concerned a little bit about compile time, does it make a measurable difference? > The attached patch detects loops containing instructions that tend to > incur high LCP (length changing prefix) stalls on Core i7, and limits > their unrol

Re: [Patch, i386] Limit unroll factor for certain loops on Corei7

2011-12-02 Thread Teresa Johnson
Thanks, Andreas. You are right in that fully peeling a loop is done by a different code path (peel_loops_completely() and earlier in the tree unroller). Teresa On Fri, Dec 2, 2011 at 12:54 AM, Andreas Krebbel wrote: > On Thu, Dec 01, 2011 at 11:39:36PM -0800, Teresa Johnson wrote: >> To do this

Re: [Patch, i386] Limit unroll factor for certain loops on Corei7

2011-12-02 Thread Andreas Krebbel
On Thu, Dec 01, 2011 at 11:39:36PM -0800, Teresa Johnson wrote: > To do this I leveraged the existing TARGET_LOOP_UNROLL_ADJUST target > hook, which was previously only defined for s390. I added one > additional call to this target hook, when unrolling for constant trip > count loops. Previously it

[Patch, i386] Limit unroll factor for certain loops on Corei7

2011-12-01 Thread Teresa Johnson
The attached patch detects loops containing instructions that tend to incur high LCP (length changing prefix) stalls on Core i7, and limits their unroll factor to try to keep the unrolled loop body small enough to fit in the Core i7's loop stream detector, which can hide LCP stalls in loops. To do thi
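
The idea described above (cap the unroll factor so the unrolled body still fits in the loop stream detector) can be sketched roughly as follows. This is only an illustration, not the code from the patch: struct toy_loop, toy_unroll_adjust and TOY_LSD_CAPACITY are hypothetical names, the capacity of 28 and its unit (instructions) are assumptions chosen for the example, and "unroll factor" is taken here to mean the total number of body copies.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified loop descriptor (not GCC's struct loop). */
struct toy_loop {
    unsigned ninsns;    /* instructions in one copy of the loop body */
    int has_lcp_insn;   /* nonzero if the body holds an LCP-stalling insn */
};

/* Assumed loop stream detector capacity, counted in instructions purely
   for illustration; the real limit and its unit are not given here. */
#define TOY_LSD_CAPACITY 28u

/* Hook-shaped helper: takes the unroll factor the unroller proposes
   (total copies of the body) and returns a possibly smaller one. */
unsigned
toy_unroll_adjust(unsigned nunroll, const struct toy_loop *loop)
{
    unsigned max_copies;

    assert(loop != NULL);

    /* Loops without LCP-stalling instructions are left alone. */
    if (!loop->has_lcp_insn || loop->ninsns == 0)
        return nunroll;

    /* Largest number of body copies that still fits in the detector. */
    max_copies = TOY_LSD_CAPACITY / loop->ninsns;
    if (max_copies < 1)
        max_copies = 1;   /* body alone is too big: do not unroll */

    return nunroll < max_copies ? nunroll : max_copies;
}
```

Under these assumptions, a 7-instruction body containing an LCP-stalling instruction has a proposed factor of 8 reduced to 4, while the same loop without such an instruction keeps its factor of 8. Treating a body that already exceeds the capacity as "do not unroll" is a choice made for this sketch; the patch may handle that case differently.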