On Fri, Apr 04, 2003 at 08:48:35AM -0700, Brian Paul wrote:
> In general, this sounds reasonable but you also have to consider
> performance.
> The glVertex, Color, TexCoord, etc commands have to be simple and fast. As
> it is now, glColor4f (for example) (when implemented in X86 assembly) is
> just a jump into _tnl_Color4f() which stuffs the color into the immediate
> struct and returns. Something similar is done in the R200 driver.
>
> If the implementation of _tnl_Color4f() involves a call to
> producer->Color4f() we'd lose some performance.
I know, but my objective is to design a good object interface that
all drivers can fit into and reuse code through. When a driver gets to
the point where the producer->Color4f() calls are the main performance
bottleneck (!?), the developer is free to write a tailored version of
TnLProducer that eliminates that extra call:
class TnLProducerFast {
    Vertex current;
    TnLConsumer *consumer;

    TnLProducerFast(TnLConsumer *_consumer) {
        consumer = _consumer;
    }

    void activate() {
        _glapi_setapi(GL_COLOR3f, _Color3f);
        ...
    }

    // Static entry point installed in the dispatch table; it writes
    // straight into the current vertex with no further call.
    static void _Color3f(GLfloat r, GLfloat g, GLfloat b) {
        TnLProducerFast *self = GET_THIS_PTR_FROM_CURRENT_CTX();
        self->current.r = r; self->current.g = g; self->current.b = b;
    }
};
We can even generate this TnLProducerFast automatically from the
original TnLProducer with a template, i.e.,
template < class T >
class TnLProducerTmpl {
    T tnl;

    void activate() {
        _glapi_setapi(GL_COLOR3f, _Color3f);
        ...
    }

    static void _Color3f(GLfloat r, GLfloat g, GLfloat b) {
        TnLProducerTmpl *self = GET_THIS_PTR_FROM_CURRENT_CTX();
        self->tnl.Color3f(r, g, b);  // This call is eliminated if T::Color3f
                                     // is inlined
    }
};

typedef TnLProducerTmpl< TnLProducer > TnLProducerFast;
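To make the template idea concrete, here is a minimal compilable sketch. It is an illustration under stated assumptions, not Mesa code: DispatchTable, g_dispatch, gl_color3f and the static "active" pointer are hypothetical stand-ins for the real dispatch table, glColor3f and GET_THIS_PTR_FROM_CURRENT_CTX(). Because the compiler knows the concrete type T, the inner Color3f() call can be inlined into the static entry point.

```cpp
#include <cassert>

// Hypothetical stand-ins; none of these names are Mesa's real API.
struct Vertex { float r = 0.0f, g = 0.0f, b = 0.0f; };

// Plain producer. Color3f is defined in-class, so it is implicitly inline.
struct TnLProducer {
    Vertex current;
    void Color3f(float r, float g, float b) {
        current.r = r; current.g = g; current.b = b;
    }
};

// Stand-in for the dispatch table that _glapi_setapi would fill in.
typedef void (*Color3fFn)(float, float, float);
struct DispatchTable { Color3fFn Color3f = nullptr; };
static DispatchTable g_dispatch;

// Stand-in for the public glColor3f entry: just jumps through the table.
static void gl_color3f(float r, float g, float b) {
    assert(g_dispatch.Color3f != nullptr);
    g_dispatch.Color3f(r, g, b);
}

template <class T>
class TnLProducerTmpl {
public:
    T tnl;

    // Remember the active object and install the static entry point.
    void activate() {
        active = this;
        g_dispatch.Color3f = &TnLProducerTmpl::color3f_entry;
    }

private:
    // Assumption: replaces GET_THIS_PTR_FROM_CURRENT_CTX() in this sketch.
    static TnLProducerTmpl *active;

    // T is known at compile time, so this call is a direct, inlinable
    // call rather than a virtual or function-pointer call.
    static void color3f_entry(float r, float g, float b) {
        active->tnl.Color3f(r, g, b);
    }
};

template <class T>
TnLProducerTmpl<T> *TnLProducerTmpl<T>::active = nullptr;

typedef TnLProducerTmpl<TnLProducer> TnLProducerFast;
```

After calling activate() on a TnLProducerFast, each gl_color3f(r, g, b) call costs one indirect jump through the table; whether the inner call really disappears depends on the compiler and optimization level.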
But this is all of _very_ _little_ importance compared with the
ability to _write_ a full driver quickly, which is what a well-designed
OOP interface gives you. As I have said here several times, this kind of
low-level optimization consumes so much development time that
higher-level optimizations (which usually have much more impact on
performance) are never attempted.
> Nowadays, vertex arrays are the path to use if you really care about
> performance, of course, but a lot of apps still use the regular
> per-vertex GL functions.
Now that you mention vertex arrays: for those, the producer would be
different, but the consumer would be the same.
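As a rough sketch of that split, assuming a hypothetical consumer interface with a single vertex() method (EmittedVertex, RecordingConsumer, ImmediateProducer, ArrayProducer and the DrawArrays signature are all made up for illustration), an immediate-mode producer and a vertex-array producer could feed the same consumer like this:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical vertex layout used only in this sketch.
struct EmittedVertex { float x, y, z; float r, g, b; };

// Hypothetical consumer interface: receives finished vertices.
class TnLConsumer {
public:
    virtual ~TnLConsumer() = default;
    virtual void vertex(const EmittedVertex &v) = 0;
};

// Trivial consumer that records whatever it is given.
class RecordingConsumer : public TnLConsumer {
public:
    std::vector<EmittedVertex> received;
    void vertex(const EmittedVertex &v) override { received.push_back(v); }
};

// Immediate-mode producer: keeps the current color and emits one vertex
// per Vertex3f call.
class ImmediateProducer {
    EmittedVertex current{};
    TnLConsumer *consumer;
public:
    explicit ImmediateProducer(TnLConsumer *c) : consumer(c) {
        assert(c != nullptr);
    }
    void Color3f(float r, float g, float b) {
        current.r = r; current.g = g; current.b = b;
    }
    void Vertex3f(float x, float y, float z) {
        current.x = x; current.y = y; current.z = z;
        consumer->vertex(current);
    }
};

// Vertex-array producer: walks tightly packed xyz and rgb arrays and
// emits into the same consumer interface.
class ArrayProducer {
    TnLConsumer *consumer;
public:
    explicit ArrayProducer(TnLConsumer *c) : consumer(c) {
        assert(c != nullptr);
    }
    void DrawArrays(const float *positions, const float *colors,
                    std::size_t count) {
        for (std::size_t i = 0; i < count; ++i) {
            EmittedVertex v{positions[3 * i], positions[3 * i + 1],
                            positions[3 * i + 2], colors[3 * i],
                            colors[3 * i + 1], colors[3 * i + 2]};
            consumer->vertex(v);
        }
    }
};
```

The consumer never learns which producer supplied a vertex, so driver-specific consumer code would be shared between both paths.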
José Fonseca