------- Comment #1 from changpeng dot fang at amd dot com 2010-07-15 17:20 -------
This is a piece of code that shows the two prefetches for b:
        mulss   %xmm4, %xmm5
        addq    $8, %rdx
        prefetcht0      96(%r11)
        prefetcht0      100(%r11)
        subss   %xmm2, %xmm1
        addss   %xmm5, %xmm0

When collecting memory references for the loop, the imaginary part of the
array is put into a different group from the real part, and thus two
prefetches are generated:

Reference 0x2d61e70: group 0x2d63630 (base REALPART_EXPR <*b_64(D)...
Reference 0x2d615e0: group 0x2d40f40 (base IMAGPART_EXPR <*b_64(D)...

I think the two references should be reduced to the same base, with an
offset of 4, so that they can be in the same group.

-- 
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44955