At 03:28 PM 5/6/2005 +0200, Eric Auer wrote:

> The optimization is the one that tries to align EDI to an
> eight-byte boundary before the main REP MOVSD.  And that optimization only
> makes sense because once you align EDI, you commonly align ESI along with
> it, at least in the three areas to be optimized.

Very interesting. I think you cannot optimize for that (even though it
would allow fast access bursts) because that would require the move
distance to be a multiple of 8 bytes. However, if the distance is FOUND
to already be a multiple of 8 bytes, extra code (in the EMM386 int1587
handler and in the HIMEM memory copy function) could take care to do up
to 7 MOVSB before doing the main REP MOVSD.

In case you wondered why people might not pay as much attention as you want when you talk about optimization, this is a good example.


Let's do the basic arithmetic. Assume EDI and ESI have the same alignment, as frequently occurs. If there is a 334-byte memory move to perform with EDI at alignment 7 modulo 8, what do I do?

Answer: I move one byte to bring EDI to an eight-byte boundary, with 333 bytes left to move. I then move 82 DWORDs (328 bytes, the largest multiple of 8 that fits in 333) that are 8-byte aligned via REP MOVSD for the cache line optimization. Done, 5 bytes to go. I move 5 bytes to clear up the remainder. Result: 328/334 or 98% of all bytes are moved in an optimal pattern at 3x speed. Overhead? A few instructions. Design decisions? Movement over three transfers.
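The three-transfer split above can be checked with a short sketch. This is plain arithmetic in Python, not the EMM386 or HIMEM code; the function name split_copy and its parameters are made up for illustration, and it assumes the source shares the destination's alignment, as in the example.

```python
def split_copy(dst_addr, length, align=8):
    """Split a copy of `length` bytes into (head, body, tail) byte counts.

    head: bytes moved singly until the destination reaches an `align` boundary
    body: bytes moved in whole `align`-byte units (the aligned REP MOVSD part)
    tail: leftover bytes moved singly at the end
    Hypothetical helper, for illustration only.
    """
    head = min((-dst_addr) % align, length)  # distance to the next boundary
    remaining = length - head
    body = remaining - remaining % align     # largest multiple of `align`
    tail = remaining - body
    return head, body, tail


head, body, tail = split_copy(dst_addr=7, length=334)
print(head, body // 4, tail)   # 1 82 5  (one byte, 82 DWORDs, five bytes)
print(f"{body / 334:.1%}")     # 98.2%
```

Only the low three bits of the destination address matter here, so passing 7 stands in for any address that is 7 modulo 8.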

Worst case with large moves: assume ESI and EDI are random relative to each other (which is often untrue) and can be odd (which is even less likely). Aligning EDI is a smaller performance optimization even without cache line moves, but I won't count that. Even so, 12.5% of the time (the one case in eight where ESI happens to share EDI's alignment modulo 8) a worst-case transfer will enjoy a huge performance gain of almost three times normal, for all typical transfers.
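The 12.5% figure is just one in eight, which a brute-force count confirms. The sketch below assumes source and destination offsets modulo 8 are independent and uniformly distributed, which is the pessimistic assumption stated above and not a measurement; the function name is made up for illustration.

```python
from fractions import Fraction


def shared_alignment_fraction(align=8):
    """Fraction of (source, destination) offset pairs with equal alignment.

    Assumes every offset modulo `align` is equally likely and the two
    offsets are independent. Hypothetical helper, for illustration only.
    """
    pairs = [(s, d) for s in range(align) for d in range(align)]
    matching = sum(1 for s, d in pairs if s == d)
    return Fraction(matching, len(pairs))


print(shared_alignment_fraction())         # 1/8
print(float(shared_alignment_fraction()))  # 0.125
```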

Normal case, the gain happens 100% of the time. Everybody with a Pentium Pro, II, or III dances a jig of joy. Pentium 4? Maybe, optimization docs are unclear.

All right, that's enough for me. I'm not spending more time and attention to talk about additional optimizations with you.



