On Fri, Apr 04, 2003 at 08:48:35AM -0700, Brian Paul wrote:
> In general, this sounds reasonable but you also have to consider 
> performance.
> The glVertex, Color, TexCoord, etc commands have to be simple and fast.  As 
> it is now, glColor4f (for example) (when implemented in X86 assembly) is 
> just a jump into _tnl_Color4f() which stuffs the color into the immediate 
> struct and returns.  Something similar is done in the R200 driver.
> 
> If the implementation of _tnl_Color4f() involves a call to 
> producer->Color4f() we'd lose some performance.

I know, but my objective is to design a good object interface that
all drivers can fit into and reuse code from. When a driver gets to the
point where the producer->Color4f() calls are the main performance
bottleneck (!?), the developer is free to write a tailored version of
TnLProducer that eliminates that extra call:

class TnLProducerFast {

  Vertex current;
  TnLConsumer *consumer;

public:
  TnLProducerFast(TnLConsumer *_consumer) {
    consumer = _consumer;
  }

  void activate() {
     _glapi_setapi(GL_COLOR3f, _Color3f);
     ...
  }

  // Installed directly in the dispatch table, so there is no extra
  // producer->Color3f() call in between.
  static void _Color3f(GLfloat r, GLfloat g, GLfloat b) {
    TnLProducerFast *self = GET_THIS_PTR_FROM_CURRENT_CTX();
    self->current.r = r; self->current.g = g; self->current.b = b;
  }

};

We can even generate this TnLProducerFast automatically from the
original TnLProducer with a template, i.e.,

template < class T >
class TnLProducerTmpl {

  T tnl;

public:
  void activate() {
     _glapi_setapi(GL_COLOR3f, _Color3f);
     ...
  }

  static void _Color3f(GLfloat r, GLfloat g, GLfloat b) {
    TnLProducerTmpl *self = GET_THIS_PTR_FROM_CURRENT_CTX();
    self->tnl.Color3f(r, g, b); // This call is eliminated if T::Color3f
                                // is inlined
  }
};

typedef TnLProducerTmpl< TnLProducer > TnLProducerFast;
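
To make the template idea concrete, here is a minimal self-contained
sketch that compiles on its own. Everything in it is a stand-in chosen
for illustration, not Mesa code: DispatchTable replaces _glapi_setapi(),
the current() accessor replaces GET_THIS_PTR_FROM_CURRENT_CTX(), and
the Vertex struct carries only a color.

```cpp
#include <cassert>

// Hypothetical minimal vertex: only the color part matters here.
struct Vertex { float r = 0.0f, g = 0.0f, b = 0.0f; };

// The ordinary producer. Color3f is defined in-class, so it is
// implicitly inline and the compiler may fold it into its caller.
struct TnLProducer {
    Vertex current;
    void Color3f(float r, float g, float b) {
        current.r = r; current.g = g; current.b = b;
    }
};

// Stand-in for the GL dispatch table: a single Color3f slot.
using Color3fEntry = void (*)(float, float, float);
struct DispatchTable { Color3fEntry Color3f = nullptr; };

template <class T>
class TnLProducerTmpl {
public:
    T tnl;

    // Make this object the active producer and install the static
    // entry point in the (hypothetical) dispatch table.
    void activate(DispatchTable &table) {
        current() = this;
        table.Color3f = &TnLProducerTmpl::entryColor3f;
    }

private:
    // Stand-in for the "this pointer from current context" lookup.
    static TnLProducerTmpl *&current() {
        static TnLProducerTmpl *active = nullptr;
        return active;
    }

    // Static entry point with a plain function signature. The call to
    // T::Color3f is resolved at compile time, so it can be inlined.
    static void entryColor3f(float r, float g, float b) {
        assert(current() != nullptr);
        current()->tnl.Color3f(r, g, b);
    }
};

typedef TnLProducerTmpl<TnLProducer> TnLProducerFast;
```

After fast.activate(table), a call through table.Color3f(r, g, b)
lands in fast.tnl.current. Whether the inner call really disappears
depends on the compiler and optimization level, so it is worth
checking the generated code before relying on it.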

But this is all of _very_ _little_ importance compared with the
ability to _write_ a full driver fast, which a well-designed OOP
interface gives us. As I have said here several times, this kind of
low-level optimization consumes so much development time that
higher-level optimizations (usually with much more impact on
performance) are never attempted.

> Nowadays, vertex arrays are the path to use if you really care about
> performance, of course, but a lot of apps still use the regular
> per-vertex GL functions.

Now that you mention vertex arrays: for those, the producer would be
different, but the consumer would be the same.
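
A rough sketch of that split, assuming a hypothetical consumer
interface with a single vertex() method (the real TnLConsumer interface
is not spelled out in this mail, and ImmediateProducer, ArrayProducer
and CollectingConsumer are names made up for the example):

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Vertex { float x, y, z, r, g, b; };

// Hypothetical consumer interface: receives fully assembled vertices.
struct TnLConsumer {
    virtual ~TnLConsumer() {}
    virtual void vertex(const Vertex &v) = 0;
};

// A trivial consumer that just records what it is given.
struct CollectingConsumer : TnLConsumer {
    std::vector<Vertex> received;
    void vertex(const Vertex &v) override { received.push_back(v); }
};

// Producer for the per-vertex GL calls: keeps a current vertex and
// emits a copy of it each time a position arrives.
class ImmediateProducer {
public:
    explicit ImmediateProducer(TnLConsumer *consumer)
        : consumer_(consumer),
          current_{0.0f, 0.0f, 0.0f, 1.0f, 1.0f, 1.0f} {}

    void Color3f(float r, float g, float b) {
        current_.r = r; current_.g = g; current_.b = b;
    }
    void Vertex3f(float x, float y, float z) {
        current_.x = x; current_.y = y; current_.z = z;
        consumer_->vertex(current_);
    }

private:
    TnLConsumer *consumer_;
    Vertex current_;
};

// Producer for vertex arrays: walks tightly packed position and color
// arrays (3 floats per vertex each) and feeds the same consumer.
class ArrayProducer {
public:
    explicit ArrayProducer(TnLConsumer *consumer) : consumer_(consumer) {}

    void DrawArrays(const float *positions, const float *colors,
                    std::size_t count) {
        for (std::size_t i = 0; i < count; ++i) {
            const float *p = positions + 3 * i;
            const float *c = colors + 3 * i;
            Vertex v = {p[0], p[1], p[2], c[0], c[1], c[2]};
            consumer_->vertex(v);
        }
    }

private:
    TnLConsumer *consumer_;
};
```

Both producers are built around the same TnLConsumer pointer, so a
driver would only have to implement the consumer side once.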

José Fonseca

