Garrett Wollman wrote:
The problem is that the P4 is not very wide to begin with, and it's very
hard to optimize well for that 23-stage pipeline.

I'll say. I spent months tuning some assembly code for P3 and P4 and was quite disappointed that the P4 consistently required more CPU cycles for the same code.

Only the P4's faster clock kept it from actually being slower
than the P3.  I attribute a lot of that to the P4's long pipeline.

Tim

_______________________________________________
[EMAIL PROTECTED] mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "[EMAIL PROTECTED]"
