------- Comment #5 from sfilippone at uniroma2 dot it 2009-11-19 19:42 -------
(In reply to comment #4)
> Subject: Re: [4.4/4.5 Regression] Vectorizer
> cannot deal with PAREN_EXPR gracefully, 50% performance regression
>
> Heh, with -fwhole-program GCC optimizes the test away and I get 0.0s
> runtime.
>
Not too surprising: after all, this was extracted to make the test case
manageable; the original code is not pointless. :-)

> Well, within eval there's nothing really obvious to me. The
> innermost loop is exactly the same:
>
> .L39:
>         movsd   (%r15), %xmm0
>         addq    %rsi, %r15
>         subsd   (%rdx), %xmm0
>         addq    %rsi, %rdx
>         subl    $1, %eax
>         mulsd   %xmm0, %xmm0
>         addsd   %xmm0, %xmm1
>         jne     .L39
>
> the next outer loop has some less loads in 4.5 but also different
> induction variables. So - nothing obvious to me.
>
Exactly, it's quite surprising to see a difference with such a simple loop.
However, the size of the generated assembly is different, so there must be
something...

> Richard.
>

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42108
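
For readers following the thread: the quoted .L39 loop appears to accumulate a sum of squared differences over two arrays of doubles, both walked with the same byte stride. The sketch below is a rough C reading of that assembly, not the original source of the test case. The names `sum_sq_diff`, `a`, `b`, `n` and `stride_bytes` are hypothetical, and starting the accumulator at zero is an assumption, since the quoted fragment does not show how %xmm1 is initialised.

```c
#include <stddef.h>

/* Rough C reading of the quoted .L39 loop. All names are hypothetical;
 * none of them appear in the bug report. */
double sum_sq_diff(const double *a, const double *b, int n, size_t stride_bytes)
{
    const char *pa = (const char *)a;  /* plays the role of %r15 */
    const char *pb = (const char *)b;  /* plays the role of %rdx */
    double acc = 0.0;                  /* %xmm1; initial value is an assumption */

    for (int i = 0; i < n; i++) {      /* %eax counts down in the assembly */
        /* movsd (%r15), %xmm0 ; subsd (%rdx), %xmm0 */
        double d = *(const double *)pa - *(const double *)pb;
        pa += stride_bytes;            /* addq %rsi, %r15 */
        pb += stride_bytes;            /* addq %rsi, %rdx */
        acc += d * d;                  /* mulsd %xmm0, %xmm0 ; addsd %xmm0, %xmm1 */
    }
    return acc;
}
```

The assembly tests the counter only at the bottom of the loop, so the compiled code presumably runs the body at least once; the `for` loop here also accepts `n == 0`, which is a small deviation made for safety.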