On Fri, Dec 06, 2002 at 09:26:49AM -0800, Ian Romanick wrote:
> On Thu, Dec 05, 2002 at 04:40:00PM -0800, magenta wrote:
> > On Thu, Dec 05, 2002 at 03:56:09PM -0800, Ian Romanick wrote:
> > > 
> > > It is one way.  It's the way that the OpenGL ARB has sanctified with an
> > > extension.  It's not the only way.  On Windows with a Radeon, for example,
> > > if you click the 'FSAA 2x' box, it will tell the driver to render to a
> > > buffer twice as wide as requested and scale down when it does a blit for the
> > > swap-buffer call.  'FSAA 4x' does 2x width & 2x height.  This is called
> > > super-sampling.
> > 
> > My understanding was that the ARB_multisample extension could be
> > implemented using supersampling (even if it's not actually done using
> > multisampling), and that enabling ARB_multisample was functionally
> > equivalent to clicking the FSAA checkbox in the driver.  If that's not the
> > case, then that's been the source of my confusion all along.
> 
> It may be true for cards that support ARB_multisample.  However, according
> to ATI, only the Radeon 9500 / 9700 support that extension.  All of the
> other cards use supersampling, for which there is no extension.  If
> ARB_multisample could be implemented somehow using supersampling, I would
> think that at least the 8500 / 9000 would support it.

IMO, drivers for hardware that supports supersampling but not multisampling
should just treat the multisample extension as meaning "let's supersample,"
since supersampling is really just a special case of multisampling anyway
(where the multisample buffer always covers the entire screen).  Either
that, or a separate supersample extension should be drafted.  Supporting
FSAA the Windows way (having the user click a checkbox in the driver
configuration) is very kludgy and un-OpenGL-esque.
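
The OpenGL-esque way, roughly, would be for the app to ask for it itself.
A minimal sketch using GLX_ARB_multisample (the tokens are the ones from
the ARB multisample specs; error handling and the usual X setup are
omitted, and whether the driver satisfies the request with real
multisampling or with supersampling behind the scenes would be entirely
its own business):

/* The app requests FSAA itself instead of the user flipping a checkbox. */
#include <GL/gl.h>
#include <GL/glx.h>

#ifndef GLX_SAMPLE_BUFFERS_ARB
#define GLX_SAMPLE_BUFFERS_ARB 100000
#endif
#ifndef GLX_SAMPLES_ARB
#define GLX_SAMPLES_ARB        100001
#endif
#ifndef GL_MULTISAMPLE_ARB
#define GL_MULTISAMPLE_ARB     0x809D
#endif

static XVisualInfo *choose_fsaa_visual(Display *dpy, int screen)
{
    int attribs[] = {
        GLX_RGBA,
        GLX_DOUBLEBUFFER,
        GLX_RED_SIZE, 5, GLX_GREEN_SIZE, 5, GLX_BLUE_SIZE, 5,
        GLX_DEPTH_SIZE, 16,
        GLX_SAMPLE_BUFFERS_ARB, 1,   /* want a multisample buffer...  */
        GLX_SAMPLES_ARB, 4,          /* ...with (at least) 4 samples  */
        None
    };
    return glXChooseVisual(dpy, screen, attribs);
}

/* later, with a context made current on that visual: */
static void enable_fsaa(void)
{
    glEnable(GL_MULTISAMPLE_ARB);
}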

> > But how could it be wrong in such a way that some other choice could be
> > right?  I mean, if the application sends the vertex array to it in a
> > certain format, either the card can support that format or it can't and the
> > driver has to convert it, right?  So is it just a matter of which
> > conversion is least-sucky?
> 
> I guess "wrong" was a poor word choice.  I didn't mean wrong as in
> producing incorrect results.  I meant wrong as in having sub-optimal
> performance.  My problem with coming up with examples here is that nobody
> has done enough detailed performance testing of DRI drivers with enough apps
> to determine what additional fast paths might be needed.  Right now, for the
> most part, there is one path.

Right, and that's how I was using the term "wrong" as well.  The point I'm
making is that I don't see why the driver would need to be configured to
favor anything other than whatever is already a fast path in the hardware.
Okay, I can imagine really funky edge cases due to quirky pipeline stalls
and so on, but I'd think those would only become a problem when sending a
large number of very small vertex lists to the card, and in that case
performance is going to suck no matter what.
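
Just to pin down the "lots of very small vertex lists" case I mean, here's
a purely illustrative sketch (it assumes the vertex arrays are already set
up with glVertexPointer() etc., and the exact crossover point is obviously
hardware- and driver-dependent):

#include <GL/gl.h>

/* Worst case: per-call overhead dominates, since each glDrawElements()
 * only hands the driver four vertices to chew on. */
static void draw_quads_one_by_one(const GLushort *indices, int num_quads)
{
    int i;
    for (i = 0; i < num_quads; i++)
        glDrawElements(GL_QUADS, 4, GL_UNSIGNED_SHORT, indices + i * 4);
}

/* Same geometry, one call: the driver gets a real batch to work with,
 * so whichever internal path it picks matters far less. */
static void draw_quads_batched(const GLushort *indices, int num_quads)
{
    glDrawElements(GL_QUADS, num_quads * 4, GL_UNSIGNED_SHORT, indices);
}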

> > Wouldn't vertex/fragment programs already be using the card's native
> > format(s) though?  Once the client state is all configured and that
> > glDrawElements() call happens, wouldn't the driver have to either decide
> > that the format is something the hardware supports, or convert it into
> > something which it does?
> 
> The case I was thinking of for vertex / fragment programs was in the
> compilation of the programs to native code.  It would be akin to selecting
> specific optimization flags to pass to gcc.  Just saying '-O2 -march=i686'
> doesn't always produce the best results.  Quite often you want to go in and
> set very specific optimization flags.  There's no way to do this (nor
> should there be!!!) in any of the vertex / fragment program specs.
> 
> A good example is (will be?) NV30 (and perhaps NV20, but I'm not 100%
> positive).  That hardware supports a reduced precision floating point
> format.  This is a register format (like using 16-bit x86 registers vs.
> 32-bit x86 registers).  For some calculations (i.e., those involving
> color), using fp16 doesn't make any difference in the output and
> improves performance.  Their compiler makes some fairly conservative
> assumptions about when it can use this format.  For some apps, it might be a
> performance win to tell it to relax those assumptions a little bit.

Okay, I see what you're saying.  I don't see why this couldn't be hinted at
by the application, though.  Something like glHint(GL_VERTEX_PROGRAM,
GL_FASTEST) could mean "always use the lower-precision format" (yes, I
realize this could cause "incorrect" results, but in a game that would set
that hint, it's unlikely that a one- or two-pixel difference in a projected
vertex coordinate will matter :)
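
Something along these lines, say.  To be clear, glHint() and
GL_FASTEST/GL_NICEST exist today, but a vertex-program hint target is
purely my invention here:

#include <GL/gl.h>

/* Hypothetical: no current spec defines a program-precision hint target;
 * this is just the shape of the API I'm suggesting. */
#ifndef GL_VERTEX_PROGRAM_ARB
#define GL_VERTEX_PROGRAM_ARB 0x8620
#endif

static void set_program_precision_hint(int i_am_a_game)
{
    if (i_am_a_game)
        /* "use fp16 (or whatever) wherever the compiler thinks it can
         * get away with it; I don't care about the last bit of precision" */
        glHint(GL_VERTEX_PROGRAM_ARB, GL_FASTEST);
    else
        /* CAD app, offline renderer, etc.: keep full precision */
        glHint(GL_VERTEX_PROGRAM_ARB, GL_NICEST);
}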

> If an app uses ARB_{vertex,fragment}_program, there is no way for it to tell
> the GL about that.  Even if it were possible, it would be in the same
> situation as with anisotropic filtering:  how can the app support the "next"
> thing that comes along?

The next version of the software.  Current versions would be supported with
a tweak wrapper library. :)
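
By "tweak wrapper library" I'm thinking of something along the lines of an
LD_PRELOAD shim built as a .so and loaded ahead of libGL.  A hypothetical
sketch, reusing the made-up hint target from above:

#define _GNU_SOURCE            /* for RTLD_NEXT */
#include <dlfcn.h>
#include <GL/gl.h>

#ifndef GL_VERTEX_PROGRAM_ARB
#define GL_VERTEX_PROGRAM_ARB 0x8620
#endif

/* Wrap a call the unmodified app already makes and piggyback the new
 * hint onto it, so old binaries get the tweak without a recompile. */
void glEnable(GLenum cap)
{
    static void (*real_glEnable)(GLenum) = 0;

    if (!real_glEnable)
        real_glEnable = (void (*)(GLenum)) dlsym(RTLD_NEXT, "glEnable");
    real_glEnable(cap);

    if (cap == GL_VERTEX_PROGRAM_ARB)
        glHint(GL_VERTEX_PROGRAM_ARB, GL_FASTEST);   /* the tweak */
}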

-- 
http://trikuare.cx

