------- Comment #8 from nbenoit at tuxfamily dot org  2009-12-16 10:34 -------
I am confused: a performance regression is still noticeable (trunk is roughly
29% and 23% slower in the two measurements below):

* Intel Xeon E5320 (x86_64 arch but gcc machine is i686-pc-linux-gnu), with -O1
flag
GCC-4.4.2          7364 ms
GCC-trunk-r155286  9515 ms

* Intel Xeon 5160 (x86_64 arch and gcc machine is x86_64-linux-gnu), with -O1
flag
GCC-4.4.1          5960 ms
GCC-trunk-r155286  7355 ms


Here is a diff on the assembly generated for the Intel E5320:

$ diff 442/convol.s r155286/convol.s
11c11
<       subl    $8, %esp
---
>       subl    $12, %esp
13d12
<       movl    $H, %esi
17c16
<       imull   (%esi,%eax,4), %ebx
---
>       imull   H(,%eax,4), %ebx
22c21
<       jg      .L10
---
>       setle   %bl
24,25c23,25
<       jle     .L3
< .L10:
---
>       setle   -21(%ebp)
>       testb   %bl, -21(%ebp)
>       jne     .L3
28c28
< .L6:
---
> .L5:
31,32c31,32
<       je      .L5
< .L8:
---
>       je      .L4
> .L7:
34c34
<       js      .L6
---
>       js      .L5
40c40
< .L5:
---
> .L4:
43c43
<       je      .L7
---
>       je      .L6
46,47c46,47
<       jmp     .L8
< .L7:
---
>       jmp     .L7
> .L6:
50c50
<       addl    $8, %esp
---
>       addl    $12, %esp
60c60
<       .ident  "GCC: (GNU) 4.4.2"
---
>       .ident  "GCC: (GNU) 4.5.0 20091216 (experimental)"
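
My reading of the hunks at lines 22-25 is that 4.4.2 tests the two conditions
with short-circuit jumps (jg over the second test, then jle), while trunk
materializes both results with setle, spills one of them to the stack
(-21(%ebp), which would also explain the larger frame), and ANDs them with
testb before a single jne. The C below is only a sketch of those two shapes.
The function names and parameters are hypothetical and are not taken from
convol.c, whose source is not shown in this comment.

```c
#include <assert.h>

/* Hypothetical stand-ins for the guard in the hot loop; not the real convol.c. */

/* Short-circuit shape, as in the 4.4.2 output: if the first compare
 * fails, the second compare is never evaluated. */
int both_le_branchy(int a, int a_max, int b, int b_max)
{
    if (a > a_max)
        return 0;              /* corresponds to "jg" skipping the second test */
    return b <= b_max;         /* corresponds to "jle .L3" */
}

/* Flag shape, as in the trunk output: both compares are always
 * evaluated into byte flags, which are then ANDed together. */
int both_le_flags(int a, int a_max, int b, int b_max)
{
    unsigned char f1 = (a <= a_max);   /* first setle */
    unsigned char f2 = (b <= b_max);   /* second setle */
    return (f1 & f2) != 0;             /* testb followed by jne */
}

/* Both shapes compute the same predicate; only the code shape differs. */
void check_equivalent(void)
{
    for (int a = -2; a <= 2; ++a)
        for (int b = -2; b <= 2; ++b)
            assert(both_le_branchy(a, 0, b, 0) == both_le_flags(a, 0, b, 0));
}
```

The two functions return the same value for every input, so this does not
show a correctness problem. It only illustrates the extra byte store and
memory operand that the trunk code appears to execute on each evaluation.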


-- 

nbenoit at tuxfamily dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|RESOLVED                    |REOPENED
         Resolution|FIXED                       |


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42027
