On Aug 25, 2005, at 2:09 PM, Fariborz Jahanian wrote:
Compiled with -O1 -mdynamic-no-pic -march=pentium4 produces:

        pxor    %xmm0, %xmm0
        movsd   %xmm0, 16(%eax)
        movsd   %xmm0, 8(%eax)
        movsd   %xmm0, (%eax)

But the following code results in a 7% performance gain in eon, as reported by one of Apple's performance people:

        movl    $0, 16(%eax)
        movl    $0, 20(%eax)
        movl    $0, 8(%eax)
        movl    $0, 12(%eax)
        movl    $0, (%eax)
        movl    $0, 4(%eax)

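The source that was compiled is not shown in this message. As a hypothetical illustration only, the stores at offsets 0, 8 and 16 would fit code that zeroes three adjacent doubles. The sketch below assumes such a 24-byte struct (the names Vec3, zero_as_doubles and zero_as_words are made up here) and writes the same zeroing two ways: as three 8-byte stores and as six 4-byte stores. Which instructions a compiler actually emits for either function depends on the target and flags.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical 24-byte object: three doubles at offsets 0, 8 and 16.
   This is an assumption; the original source is not in the message. */
typedef struct { double x, y, z; } Vec3;

/* Three 8-byte stores, the shape of the pxor/movsd sequence. */
void zero_as_doubles(Vec3 *v) {
    assert(v != NULL);
    v->x = 0.0;
    v->y = 0.0;
    v->z = 0.0;
}

/* Six 4-byte stores, the shape of the movl $0 sequence.
   memcpy is used so the word-sized writes stay well defined in C. */
void zero_as_words(Vec3 *v) {
    assert(v != NULL);
    uint32_t zero = 0;
    unsigned char *p = (unsigned char *)v;
    for (size_t i = 0; i < sizeof *v; i += sizeof zero)
        memcpy(p + i, &zero, sizeof zero);
}
```

On an IEEE 754 target, where an all-zero bit pattern is +0.0, both functions leave the object in the same state, so the choice between them is purely a code-generation question.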
Actually, that does not make sense for a real processor, but then again this is x86 we are talking about. The first sequence has less complex instructions, and fewer instructions for that matter.

-- Pinski
