http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53346
Uros Bizjak <ubizjak at gmail dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2012-05-17
     Ever Confirmed|0                           |1

--- Comment #3 from Uros Bizjak <ubizjak at gmail dot com> 2012-05-17 18:29:12 UTC ---
Confirmed, -O2 vs. -O2 -ftree-vectorize on x86_64:

-O2 -ftree-vectorize:

Each sample counts as 0.01 seconds.
  %   cumulative   self              self     total
 time   seconds   seconds    calls   s/call   s/call  name
 43.83      9.73     9.73       64     0.15     0.15  cptrf2_
 40.68     18.76     9.03     6685     0.00     0.00  trs2a2.2054
  7.70     20.47     1.71       64     0.03     0.03  gentrs_
  1.49     20.80     0.33       64     0.01     0.01  cptrf1_
  1.40     21.11     0.31        1     0.31    12.33  matsim_
  1.40     21.42     0.31     6685     0.00     0.00  invima.2045
  1.13     21.67     0.25       64     0.00     0.00  cmpcpt_

-O2:

Each sample counts as 0.01 seconds.
  %   cumulative   self              self     total
 time   seconds   seconds    calls   s/call   s/call  name
 55.20      9.20     9.20     6685     0.00     0.00  trs2a2.2054
 23.40     13.10     3.90       64     0.06     0.06  cptrf2_
 10.38     14.83     1.73       64     0.03     0.03  gentrs_
  2.58     15.26     0.43       64     0.01     0.01  cptrf1_
  2.34     15.65     0.39     6685     0.00     0.00  invima.2045
  1.98     15.98     0.33        1     0.33     6.58  matsim_
  1.14     16.17     0.19       64     0.00     0.00  cmpcpt_

cptrf2_ runtime increased by almost 6 seconds (3.90 s -> 9.73 s self time)!

The only vectorization is in:

3530: LOOP VECTORIZED.
rnflow.f90:3510: note: vectorized 1 loops in function.

Which corresponds to:

!  ______________________________________________________________________
      real, dimension (1:nxtr), intent (in)     :: xxtrt  ! extrema
      integer, intent (in)                      :: nxtr   ! their count
      integer, dimension (1:nxtr), intent (out) :: ixtrt  ! indices
      integer, intent (out)                     :: kerr   ! error code
!  ______________________________________________________________________
!
      kerr = 0
      ixtrt = 0                 <<<<<<<<<<<<<< HERE

This vectorization results in the zeroing of a memory area, 16 bytes per
iteration:

        pxor    %xmm0, %xmm0
        leaq    (%rdx,%r8,4), %r8
        xorl    %esi, %esi
        .p2align 4,,10
        .p2align 3
.L183:
        addq    $1, %rsi
        movdqa  %xmm0, (%r8)
        addq    $16, %r8
        cmpq    %rsi, %r11
        ja      .L183

And this causes a 6 second difference?!
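For readers less used to x86 assembly, the two forms of "ixtrt = 0" can be sketched roughly in C as below. This is only an illustration, not GCC output: the function names zero_scalar and zero_vec4 are made up, the element type is assumed to be a 4-byte integer, and the destination is assumed to be 16-byte aligned, as the aligned movdqa store requires. The sketch shows what the loop does; it does not explain the slowdown.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#if defined(__SSE2__)
#include <emmintrin.h>
#endif

/* Scalar form of "ixtrt = 0": one 4-byte store per element. */
void zero_scalar(int32_t *a, size_t n)
{
    for (size_t i = 0; i < n; i++)
        a[i] = 0;
}

/* Hypothetical vector form: one aligned 16-byte store per 4 elements,
 * mirroring the pxor/movdqa loop above. Assumes 'a' is 16-byte aligned.
 * Leftover elements (n not a multiple of 4) are cleared one at a time. */
void zero_vec4(int32_t *a, size_t n)
{
    size_t i = 0;

    assert(((uintptr_t)a % 16) == 0);
#if defined(__SSE2__)
    const __m128i zero = _mm_setzero_si128();        /* pxor %xmm0, %xmm0 */
    for (; i + 4 <= n; i += 4)
        _mm_store_si128((__m128i *)(a + i), zero);   /* movdqa %xmm0, (mem) */
#endif
    for (; i < n; i++)                               /* scalar tail / fallback */
        a[i] = 0;
}
```

Both functions leave the array in the same state; they differ only in the width of each store.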