On Mon, Mar 21, 2005 at 01:45:19PM +0100, Richard Guenther wrote:
> I also cannot
> see why we zero the mm registers before loading and why we
> load them high/low separated:

We load the high and low halves separately because movlps+movhps is
faster than movups.

We zero first to break the insn dependency chain before doing the two
half-register modifies; each of those merges into the register and so
otherwise depends on its previous contents.  IIRC such chain breaking
is only relevant to the P4.
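
For illustration, a rough sketch of that sequence in C with SSE
intrinsics.  The function names are made up for this example, and the
compiler is free to emit different instructions than the intrinsics
suggest, so treat the instruction names in the comments as the intent
rather than a guarantee.

```c
#include <xmmintrin.h>  /* SSE intrinsics: __m128, _mm_loadl_pi, ... */

/* Hypothetical helper: load four floats from a possibly unaligned
   address using the zero + movlps + movhps sequence. */
static __m128 load4_split(const float *p)
{
    /* xorps: zero the register so the partial loads below do not
       depend on whatever value it held before. */
    __m128 v = _mm_setzero_ps();
    /* movlps: fill the low 64 bits with p[0], p[1]. */
    v = _mm_loadl_pi(v, (const __m64 *)p);
    /* movhps: fill the high 64 bits with p[2], p[3]. */
    v = _mm_loadh_pi(v, (const __m64 *)(p + 2));
    return v;
}

/* The single-instruction alternative (movups), for comparison. */
static __m128 load4_movups(const float *p)
{
    return _mm_loadu_ps(p);
}
```

Both helpers leave the same four floats in the register; they differ
only in the instruction sequence used to get them there.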


r~
