ralf willenbacher wrote:
>
> i replaced the transform_v16, matmul4 and matmul34 with simd instruction
> functions and there was no performance increase in quake2 or q3test.
> any good reason to have simd support anyway ? :(
>
Q3test is a big application. The matmul routines don't even register in the
profiling results, and transform_v16 is probably only about 8% of the time spent
in Mesa, which in turn seems to take about half of the CPU (i.e. Quake itself
uses half and Mesa accounts for the other half). No Mesa function takes more
than about 8%. You have to improve the performance of several of them to make a
noticeable difference, or come up with a change that removes a step altogether,
or find some other system-wide improvement.
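
To put rough numbers on that: if Mesa is about half the run time and
transform_v16 is about 8% of Mesa, the routine is around 4% of the total. The
sketch below just applies Amdahl's law to those estimates. overall_speedup is a
made-up helper for illustration, not anything in Mesa.

```c
#include <assert.h>

/* Overall speedup of a program when a fraction `frac` of its total run
 * time is made `local_speedup` times faster (Amdahl's law).
 * Hypothetical helper for illustration, not part of Mesa. */
static double overall_speedup(double frac, double local_speedup)
{
    assert(frac >= 0.0 && frac <= 1.0);
    assert(local_speedup > 0.0);
    return 1.0 / ((1.0 - frac) + frac / local_speedup);
}
```

With frac = 0.5 * 0.08 = 0.04, overall_speedup(0.04, 2.0) comes out at about
1.02, so doubling the speed of the transform buys roughly 2% overall. Even an
infinitely fast transform could not give more than about 4%.
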
Now that you've written a SIMD transform_v16, benchmark it on its own against
the standard x86 version and the C version. It will be easier to see whether
you're making a difference that way. If you are, look at the project routines
next (e.g. in src/FX/X86 and fxfasttmp.h). Holger's 3DNow! versions of these
were a big improvement on the C.
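
A standalone benchmark along those lines could look something like the sketch
below. The transform_fn signature and all the names are assumptions for
illustration, not Mesa's real transform_v16 prototype; you would wrap each
real version in something of this shape. Vertices are packed xyzw floats and
the matrix is column-major, as in OpenGL.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical common signature for the versions being compared. */
typedef void (*transform_fn)(float *out, const float m[16],
                             const float *in, size_t n);

/* Plain C reference: out[i] = M * in[i] for n packed xyzw vertices. */
static void transform_c(float *out, const float m[16],
                        const float *in, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        const float x = in[4 * i],     y = in[4 * i + 1];
        const float z = in[4 * i + 2], w = in[4 * i + 3];
        out[4 * i]     = m[0] * x + m[4] * y + m[8]  * z + m[12] * w;
        out[4 * i + 1] = m[1] * x + m[5] * y + m[9]  * z + m[13] * w;
        out[4 * i + 2] = m[2] * x + m[6] * y + m[10] * z + m[14] * w;
        out[4 * i + 3] = m[3] * x + m[7] * y + m[11] * z + m[15] * w;
    }
}

/* Largest absolute difference between two result arrays of `count`
 * floats, to check a new version against the reference before timing. */
static float max_abs_diff(const float *a, const float *b, size_t count)
{
    float worst = 0.0f;
    for (size_t i = 0; i < count; i++) {
        float d = a[i] - b[i];
        if (d < 0.0f)
            d = -d;
        if (d > worst)
            worst = d;
    }
    return worst;
}

/* CPU seconds spent in `reps` calls of fn over the same n vertices. */
static double time_transform(transform_fn fn, float *out, const float m[16],
                             const float *in, size_t n, int reps)
{
    assert(fn != NULL && reps > 0);
    clock_t start = clock();
    for (int r = 0; r < reps; r++)
        fn(out, m, in, n);
    return (double)(clock() - start) / CLOCKS_PER_SEC;
}
```

Run time_transform once per version over the same input, with enough reps
that each run lasts a second or so, since clock() is fairly coarse. Check
max_abs_diff against the C output first so you know you are timing a version
that gives the right answers.
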
One nice thing about SIMD, even if x86 SIMD *isn't* much faster than normal x86
floating point, is the prefetch instructions (I assume SSE has one). Use of these
should make a real difference.
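
SSE does provide prefetch instructions (PREFETCHT0/T1/T2 and PREFETCHNTA). As
a rough sketch of how one might be used in a transform loop, the version below
uses the GCC/Clang builtin __builtin_prefetch rather than hand-written
assembly. The routine name, its signature and the look-ahead distance are all
assumptions for illustration, not Mesa code, and the distance in particular is
a guess that would need tuning by measurement.

```c
#include <assert.h>
#include <stddef.h>

/* How many vertices ahead to prefetch. A guess for illustration only;
 * the right distance depends on the CPU and must be found by measuring. */
#define PREFETCH_AHEAD 16

/* out[i] = M * in[i] for n packed xyzw vertices, M column-major (OpenGL
 * layout). Hypothetical routine, not Mesa's transform_v16. */
static void transform_prefetch(float *out, const float m[16],
                               const float *in, size_t n)
{
    assert(out != NULL && in != NULL);
    for (size_t i = 0; i < n; i++) {
        /* GCC/Clang builtin: a hint only, it never changes results.
         * Arguments: address, 0 = read access, 0 = low temporal
         * locality (the data is used once and then streamed past). */
        if (i + PREFETCH_AHEAD < n)
            __builtin_prefetch(in + 4 * (i + PREFETCH_AHEAD), 0, 0);

        const float x = in[4 * i],     y = in[4 * i + 1];
        const float z = in[4 * i + 2], w = in[4 * i + 3];
        out[4 * i]     = m[0] * x + m[4] * y + m[8]  * z + m[12] * w;
        out[4 * i + 1] = m[1] * x + m[5] * y + m[9]  * z + m[13] * w;
        out[4 * i + 2] = m[2] * x + m[6] * y + m[10] * z + m[14] * w;
        out[4 * i + 3] = m[3] * x + m[7] * y + m[11] * z + m[15] * w;
    }
}
```

Issuing a prefetch for every 16-byte vertex is more than necessary, since
several vertices share a cache line. One prefetch per cache line would do, but
that complicates the loop and is left out of the sketch.
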
Keith
_______________________________________________
Mesa-dev maillist - [EMAIL PROTECTED]
http://lists.mesa3d.org/mailman/listinfo/mesa-dev