On Thursday, 9 January 2003 16:46, Dieter Nützel wrote:
> On Thursday, 9 January 2003 14:05, Petr Sebor wrote:
> > Hello,
> >
> > somehow something got omitted, probably on my side.
> > [as I can tell from the end of the patch, since there are references
> > to removed code...]
> >
> > I'd like to send more cleanups if you are interested ;-)
>
> Go ahead, send them along.
>
> > Btw: Unfortunately, I have noticed that the current SSE code tends to
> > be a little slower than the x86 path, at least on my XP2000+... not very
> > much, but still.
>
> I second that.
>
> > Maybe this is given by the AMD SSE implementation
>
> Could be. As I understand the AMD SSE implementation, it isn't an
> extra CPU core but uses the 3DNow! infrastructure.
>
> > or more likely, because we don't smartly precache data while we can.
>
> Precaching is NOT necessarily the fastest path.
> See AMD's developer notes/optimization guides.
>
> There was a discussion on LKML about a new memcpy routine.
> http://marc.theaimsgroup.com/?l=linux-kernel&m=103548024914815&w=2
> [-]
> AMD recommends performing memory copies with backward read operations
> instead of prefetch.
>
> http://208.15.46.63/events/gdc2002.htm
> [-]
>
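
To make the "backward read" idea concrete, here is a minimal sketch of one possible reading of it: touch one byte per cache line of each source block, last line first, and only then copy the block forward. The helper name copy_backward_read, the 64-byte line size and the 4096-byte block size are my assumptions for illustration; this is not the routine from the LKML thread or from AMD's notes.

```c
#include <stddef.h>
#include <string.h>

#define CACHE_LINE 64    /* assumed cache line size */
#define BLOCK_SIZE 4096  /* assumed block size (one page) */

/* Hypothetical helper: copy n bytes, reading each source block
 * backward (one byte per cache line) before copying it forward. */
static void copy_backward_read(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    /* volatile keeps the compiler from dropping the touch loads */
    const volatile unsigned char *s = src;
    size_t done = 0;

    while (done < n) {
        size_t len = n - done;
        if (len > BLOCK_SIZE)
            len = BLOCK_SIZE;

        /* Pass 1: touch the block, last cache line first. */
        for (size_t off = ((len - 1) / CACHE_LINE) * CACHE_LINE; ; off -= CACHE_LINE) {
            (void)s[done + off];
            if (off == 0)
                break;
        }

        /* Pass 2: plain forward copy of the block that was just touched. */
        memcpy(d + done, (const unsigned char *)src + done, len);
        done += len;
    }
}
```

Whether this beats an explicit prefetch depends on the CPU, so it would have to be measured the same way as the numbers below.
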
> Some numbers:
> nuetzel/Entwicklung> ./athlon-DN
> 1600.156 MHz
> clear_page by 'normal_clear_page'        took 12400 cycles (504.1 MB/s)
> clear_page by 'slow_zero_page'           took 12412 cycles (503.6 MB/s)
> clear_page by 'fast_clear_page'          took 9811 cycles (637.1 MB/s)
> clear_page by 'faster_clear_page'        took 4239 cycles (1474.3 MB/s)
>
> copy_page by 'normal_copy_page'  took 9231 cycles (677.1 MB/s)
> copy_page by 'slow_copy_page'    took 9243 cycles (676.3 MB/s)
> copy_page by 'fast_copy_page'    took 8318 cycles (751.4 MB/s)
> copy_page by 'faster_copy'       took 5481 cycles (1140.3 MB/s)
> copy_page by 'even_faster'       took 5545 cycles (1127.2 MB/s)
> copy_page by 'no_prefetch'       took 4388 cycles (1424.3 MB/s)
>
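
For reading the table: the MB/s column appears to follow from the cycle count if each call handles one 4096-byte page and 1 MB means 2^20 bytes. Both are my assumptions, not something the benchmark output states. A small sketch of that arithmetic, with a hypothetical helper name:

```c
/* Hypothetical helper: throughput of one page operation.
 * Assumes a 4096-byte page and 1 MB = 2^20 bytes. */
static double page_mb_per_s(double cycles, double cpu_mhz)
{
    const double page_bytes = 4096.0;
    double seconds = cycles / (cpu_mhz * 1e6);   /* cycles -> seconds */
    return page_bytes / seconds / (1024.0 * 1024.0);
}
```

With 12400 cycles at 1600.156 MHz this gives about 504 MB/s, in line with the first row; the other rows come out within a fraction of a percent of the listed values.
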
> But maybe you should stop the CPU scanning on AMD after the 3DNow!
> (normal/ext/prof) detection, so that AMD CPUs run the 3DNow! path and not SSE.
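
A rough sketch of the selection order I have in mind. The flag names, the enum and choose_path are made up for illustration and are not the existing detection code; only the "AuthenticAMD" vendor string is the real CPUID value.

```c
#include <string.h>

/* Hypothetical feature bits and code paths, for illustration only. */
enum { CPU_HAS_3DNOW = 1 << 0, CPU_HAS_SSE = 1 << 1 };
enum codepath { PATH_X86, PATH_3DNOW, PATH_SSE };

/* Pick a code path; on AMD, stop at 3DNow! even if SSE is reported. */
static enum codepath choose_path(const char *vendor, unsigned features)
{
    int is_amd = strcmp(vendor, "AuthenticAMD") == 0;

    if (is_amd && (features & CPU_HAS_3DNOW))
        return PATH_3DNOW;          /* do not look further on AMD */
    if (features & CPU_HAS_SSE)
        return PATH_SSE;
    if (features & CPU_HAS_3DNOW)
        return PATH_3DNOW;
    return PATH_X86;                /* generic fallback */
}
```

On an Athlon XP that reports both 3DNow! and SSE this would return PATH_3DNOW, while other vendors with SSE still get the SSE path.
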

Any progress?

Thanks,
        Dieter


_______________________________________________
Dri-devel mailing list
[EMAIL PROTECTED]
https://lists.sourceforge.net/lists/listinfo/dri-devel
