------- Comment #5 from sfilippone at uniroma2 dot it  2009-11-19 19:42 -------
(In reply to comment #4)
> Subject: Re:  [4.4/4.5 Regression] Vectorizer
>  cannot deal with PAREN_EXPR gracefully, 50% performance regression
> 
> 
> Heh, with -fwhole-program GCC optimizes the test away and I get 0.0s
> runtime.
> 
Not too surprising: after all, this was extracted to make the test case
manageable; the original code is not pointless. :-)

> Well, within eval there's nothing really obvious to me.  The
> innermost loop is exactly the same:
> 
> .L39:
>         movsd   (%r15), %xmm0
>         addq    %rsi, %r15
>         subsd   (%rdx), %xmm0
>         addq    %rsi, %rdx
>         subl    $1, %eax
>         mulsd   %xmm0, %xmm0
>         addsd   %xmm0, %xmm1
>         jne     .L39
> 
> the next outer loop has some less loads in 4.5 but also different
> induction variables.  So - nothing obvious to me.
> 
Exactly, it's quite surprising to see a difference with such a simple loop.
However, the size of the generated assembly is different, so there must be
something...
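
For reference, the quoted inner loop (.L39) reads as a strided sum of squared
differences. Below is a rough C sketch of that reading; it is not the original
test case. The function name, the parameter names, and the element-based
stride are assumptions made for illustration (the assembly advances both
pointers by the same byte stride held in %rsi, and the initial value of the
accumulator in %xmm1 is not visible in the quoted fragment).

```c
#include <stddef.h>

/* Hypothetical C rendering of the quoted inner loop .L39.
 * a, b    : the two arrays walked via %r15 and %rdx (names assumed)
 * stride  : distance between consecutive elements, in elements (assumed;
 *           the assembly uses a byte stride in %rsi)
 * n       : trip count, the counter decremented in %eax
 * Returns the accumulated sum of (a[k] - b[k])^2, starting from 0.0
 * (the starting value is an assumption). */
double sum_sq_diff(const double *a, const double *b, size_t stride, int n)
{
    double acc = 0.0;
    for (int i = 0; i < n; ++i) {
        double d = a[(size_t)i * stride] - b[(size_t)i * stride]; /* movsd + subsd */
        acc += d * d;                                             /* mulsd + addsd */
    }
    return acc;
}
```

Since both compilers emit the same code for this part, any difference would
have to come from the surrounding loops rather than from this kernel.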

> Richard.
> 


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42108
