http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53346
Uros Bizjak <ubizjak at gmail dot com> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2012-05-17
     Ever Confirmed|0                           |1
--- Comment #3 from Uros Bizjak <ubizjak at gmail dot com> 2012-05-17 18:29:12 UTC ---
Confirmed, -O2 vs. -O2 -ftree-vectorize on x86_64:
-O2 -ftree-vectorize:
Each sample counts as 0.01 seconds.
  %   cumulative   self              self     total
 time   seconds   seconds    calls   s/call   s/call  name
 43.83      9.73     9.73       64     0.15     0.15  cptrf2_
 40.68     18.76     9.03     6685     0.00     0.00  trs2a2.2054
  7.70     20.47     1.71       64     0.03     0.03  gentrs_
  1.49     20.80     0.33       64     0.01     0.01  cptrf1_
  1.40     21.11     0.31        1     0.31    12.33  matsim_
  1.40     21.42     0.31     6685     0.00     0.00  invima.2045
  1.13     21.67     0.25       64     0.00     0.00  cmpcpt_
-O2:
Each sample counts as 0.01 seconds.
  %   cumulative   self              self     total
 time   seconds   seconds    calls   s/call   s/call  name
 55.20      9.20     9.20     6685     0.00     0.00  trs2a2.2054
 23.40     13.10     3.90       64     0.06     0.06  cptrf2_
 10.38     14.83     1.73       64     0.03     0.03  gentrs_
  2.58     15.26     0.43       64     0.01     0.01  cptrf1_
  2.34     15.65     0.39     6685     0.00     0.00  invima.2045
  1.98     15.98     0.33        1     0.33     6.58  matsim_
  1.14     16.17     0.19       64     0.00     0.00  cmpcpt_
cptrf2_ runtime increased by almost 6 seconds (self time 3.90 s -> 9.73 s)!
The only vectorization is in:
3530: LOOP VECTORIZED.
rnflow.f90:3510: note: vectorized 1 loops in function.
Which corresponds to:
! ______________________________________________________________________
real, dimension (1:nxtr), intent (in) :: xxtrt ! extrema
integer, intent (in) :: nxtr ! leur nombre
integer, dimension (1:nxtr), intent (out) :: ixtrt ! indices
integer, intent (out) :: kerr ! code d'erreur
! ______________________________________________________________________
!
kerr = 0
ixtrt = 0 <<<<<<<<<<<<<< HERE
This vectorization results in a loop that zeroes a memory area:
        pxor    %xmm0, %xmm0
        leaq    (%rdx,%r8,4), %r8
        xorl    %esi, %esi
        .p2align 4,,10
        .p2align 3
.L183:
        addq    $1, %rsi
        movdqa  %xmm0, (%r8)
        addq    $16, %r8
        cmpq    %rsi, %r11
        ja      .L183
And this causes a 6-second difference?!
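
For readers who prefer C to AT&T assembly, here is a rough sketch of what
the loop at .L183 does, written with SSE2 intrinsics. The function name
zero_vectors and its parameter names are hypothetical and only mirror the
registers in the listing. It assumes the target address is already 16-byte
aligned (movdqa requires that) and that at least one vector store is
needed; how the compiler guarantees either is not visible in the excerpt.

```c
#include <emmintrin.h>  /* SSE2: __m128i, _mm_setzero_si128, _mm_store_si128 */
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper mirroring the loop at .L183 (not GCC's own code).
 *   base  ~ %rdx  start of the integer array
 *   first ~ %r8   index of the first element to clear; base + first is
 *                 assumed to be 16-byte aligned
 *   nvec  ~ %r11  number of 16-byte stores, assumed to be >= 1
 */
static void zero_vectors(int32_t *base, size_t first, size_t nvec)
{
    const __m128i zero = _mm_setzero_si128();   /* pxor %xmm0, %xmm0      */
    __m128i *p = (__m128i *)(base + first);     /* leaq (%rdx,%r8,4), %r8 */
    size_t i = 0;                               /* xorl %esi, %esi        */

    do {
        i += 1;                                 /* addq $1, %rsi          */
        _mm_store_si128(p, zero);               /* movdqa %xmm0, (%r8)    */
        p += 1;                                 /* addq $16, %r8          */
    } while (nvec > i);                         /* cmpq %rsi, %r11; ja    */
}
```

Each iteration clears four 4-byte integers with a single aligned store, so
nothing in the loop body itself looks expensive.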