Since new driver interfaces have been brought up, here are some thoughts
about improving texturing:
With the current architecture, it isn't possible to accelerate
glCopyTexSubimage, even though most non-voodoo hardware is capable of doing
it completely asynchronously. The requirement of having an up-to-date copy
of the texture in main memory forces a hardware sync and software reading
of the frame buffer. There are many interesting rendering algorithms that
are only feasible with a full-speed copytexsubimage.
Texsubimage (and
teximage, but that isn't really an issue) still operates at less than half
the potential speed. A max-performance implementation would read from the
client space and write (potentially 16 bit textures) directly to the card's
command buffers. The current implementation, even on the fast path, first
reads from the client and writes to the main memory texture, then reads
from there and writes to the command buffer. 100+ MB/sec texture downloads
should be possible through the standard API calls.
It is not uncommon to
have systems now with 16 MB/64 MB or 32 MB/128 MB of video memory vs. system
memory. Especially when the card is storing 16 bit textures and 32 bit
mirrors of the textures are stored in main memory, it is obvious that there
is significant inefficiency. Avoiding the main memory copies of resident
textures would make a significant difference with Q3 on 64 MB systems.
All of these could be addressed by allowing Mesa to manage a texture object
without having the texels in main memory.
The memory savings and copytexsubimage features could be enabled with just
two additions to the current architecture: the driver would need a call
into mesa to have it free the texels for a given image (or it might just
free it itself and zero the pointer), and mesa would need another driver
call to have the driver fill in an image with a texture it is managing when
it is needed for a software path.
Getting the max speed texsubimage would need an additional driver entry
point, called after all the user parameters have been validated, but before
mesa tries to copy the texels into a main memory buffer. If the driver
claims it, mesa would skip the local update (the local image would have
been freed by the driver).
One possible objection to this type of arrangement is that a card with 16
bit textures would have permanently lost the low order bits of the texels
after upload, and any glGet on the texels would return the lower precision
values. I don't think this is non-conformant, but I would like to hear
other views on it.
Implementing these features would be fairly easy on the matrox driver (and
soon the rage pro driver). Interestingly, this type of architecture is not
possible on win9x, because parts of win9x can step on video memory and only
tell you about it after the fact, requiring you to always be able to
regenerate any needed textures from main memory.
John Carmack
_______________________________________________
Mesa-dev maillist - [EMAIL PROTECTED]
http://lists.mesa3d.org/mailman/listinfo/mesa-dev