2008/11/27 Stefan Dösinger <[EMAIL PROTECTED]>:
> Because we have to verify the color fixup on 16+3 samplers each draw, and
> with d3d10 there will be many more. (Of course, in d3d10 we may be lucky and
> have all the formats natively in OpenGL without the need for shader
> conversion.)
> My automated performance tests show a small but reproducible performance
> drop in some games since my multiple pixel shader patches. I have not yet
> checked why exactly, but I suspect it's that the collection of all the
> stateblock data and the memcmp over the currently 72-byte ps_compile_args
> structure simply take longer than the old CompileShader check. If we keep
> the color conversion information compact, we can bring this structure down to
> 10 bytes.
>
It will likely be more efficient to store and compare only the
samplers that need a fixup in the first place, then. For most shaders
you'll end up only checking a bitmask against zero in that case. Note
that with a 16-bit encoding you'd still cut the 64 bytes the array
currently uses in half, in which case the structure would easily fit
into a typical cache line, if that's even the issue.
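
A rough sketch of the bitmask idea, in case it helps. The struct layout,
the names (compact_ps_args, fixup_mask, compact_ps_args_equal) and the
limit of 16 samplers are all made up for illustration; this is not the
actual ps_compile_args structure or any existing wined3d code.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical limit: pixel shader samplers only. */
#define MAX_SAMPLERS 16

/* Hypothetical compact layout, not the real ps_compile_args. */
struct compact_ps_args
{
    uint16_t fixup_mask;           /* bit i set: sampler i needs a fixup */
    uint16_t fixup[MAX_SAMPLERS];  /* 16-bit encoding, 32 bytes in total;
                                    * only valid where the mask bit is set */
};

/* Returns 1 if both argument sets describe the same fixups, 0 otherwise. */
static int compact_ps_args_equal(const struct compact_ps_args *a,
        const struct compact_ps_args *b)
{
    uint16_t mask;
    unsigned int i;

    assert(a && b);

    mask = a->fixup_mask;
    if (mask != b->fixup_mask) return 0;

    /* Common case: no sampler needs a fixup, nothing else to compare. */
    if (!mask) return 1;

    /* Only look at the entries whose mask bit is set. */
    for (i = 0; mask; ++i, mask >>= 1)
    {
        if ((mask & 1) && a->fixup[i] != b->fixup[i]) return 0;
    }
    return 1;
}
```

With something like this, a shader that needs no fixups is handled by the
two mask checks alone, and entries whose mask bit is clear never have to
be filled in or compared.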

> Quicktime uses YUV formats in d3d9. It doesn't do any D3D geometry drawing
> with them (no Windows driver supports that, it seems), just blitting. I don't
> want to be able to do D3D drawing with YUV textures, but I'd like to be able
> to use the D3D shader conversion code with blitting should that become
> necessary.
>
Fair enough.
