2008/11/27 Stefan Dösinger <[EMAIL PROTECTED]>:
> Because we have to verify the color fixup on 16+3 samplers each draw, and
> with d3d10 there will be many more (of course, in d3d10 we may be lucky and
> have all the formats natively in opengl without the need for shader
> conversion).
>
> My automated performance tests show a small, but reproducible performance
> drop since my multiple pixel shader patches in some games. I have not yet
> checked why exactly, but I suspect it's that the collection of all the
> stateblock data and the memcmp over the currently 72 byte ps_compile_args
> structure simply takes longer than the old CompileShader check. If we keep
> the color conversion information compact we can bring this structure down to
> 10 bytes.
>
In that case it will likely be more efficient to store and compare only the
samplers that actually need a fixup. For most shaders you'll then end up
only checking a bitmask against zero. Note that even with a 16-bit encoding
you'd still cut the 64 bytes the array currently uses in half, in which case
the structure would easily fit into a typical cache line, if that's even the
issue.
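
To make the suggestion concrete, here is a rough sketch of what I mean. It
is not the real ps_compile_args layout: the struct, the helper names and the
16-sampler limit are all made up for illustration, and it assumes a fixup
can be encoded in 16 bits with 0 meaning "no fixup".

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define MAX_SAMPLERS 16 /* assumption: 16 pixel shader samplers */

/* Hypothetical compact layout, not Wine's actual ps_compile_args. */
struct compact_args
{
    uint16_t fixup_mask;          /* bit i set: sampler i needs a fixup */
    uint16_t fixup[MAX_SAMPLERS]; /* 16-bit encoded fixup, 0 if unused  */
};

/* Record the fixup for one sampler and keep the bitmask in sync. */
static void set_fixup(struct compact_args *args, unsigned int sampler,
        uint16_t fixup)
{
    assert(sampler < MAX_SAMPLERS);
    args->fixup[sampler] = fixup;
    if (fixup)
        args->fixup_mask |= (uint16_t)(1u << sampler);
    else
        args->fixup_mask &= (uint16_t)~(1u << sampler);
}

/* Compare only the samplers flagged in the mask. When no sampler needs a
 * fixup, the loop body never runs and this is a single mask comparison. */
static int args_equal(const struct compact_args *a,
        const struct compact_args *b)
{
    uint16_t mask = a->fixup_mask;
    unsigned int i;

    if (mask != b->fixup_mask)
        return 0;
    for (i = 0; mask; ++i, mask >>= 1)
    {
        if ((mask & 1) && a->fixup[i] != b->fixup[i])
            return 0;
    }
    return 1;
}
```

The array here is 32 bytes instead of 64. Whether this actually beats a
plain memcmp would have to be measured; it is only meant to show the shape
of the idea.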

> Quicktime uses YUV formats in d3d9. It doesn't do any D3D geometry drawing
> with it (no windows driver supports that it seems), just blitting. I don't
> want to be able to do D3D drawing with YUV textures, but I'd like to be able
> to use the D3D shader conversion code with blitting should that become
> necessary.
>
Fair enough.