Keith Whitwell wrote:

Thomas Hellström wrote:

Keith Whitwell wrote:

Thomas Hellström wrote:

Hi!

Keith Whitwell wrote:


get lock
while (timestamp mismatch) {
   release lock
   request new cliprects and timestamp
   get lock
}

Note that this is the contended case only. What's the worst that could happen? Somebody's whizzing windows around and our 3d client sits in this loop for the duration. Note that the loop includes X server communication, so it's not going to suck up the CPU or anything drastic.
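The contended-case loop above can be sketched as a toy simulation in C. All names here (server_stamp, request_cliprects, and so on) are illustrative stand-ins, not part of any real DRI interface; the "server round trip" is modeled as a simple assignment:

```c
#include <assert.h>

/* Toy model of the retry loop: the client's cached cliprect timestamp
 * is compared against the server's; on a mismatch we drop the lock,
 * fetch fresh cliprects, and retry. */
static int server_stamp = 5;   /* authoritative timestamp */
static int our_stamp = 2;      /* stale client-side copy */
static int lock_held = 0;

static void get_lock(void)     { lock_held = 1; }
static void release_lock(void) { lock_held = 0; }

/* Stand-in for the X server round trip that refreshes cliprects. */
static void request_cliprects(void) { our_stamp = server_stamp; }

/* Returns the number of retries taken; the lock is held on return. */
static int acquire_with_valid_cliprects(void)
{
    int retries = 0;
    get_lock();
    while (our_stamp != server_stamp) {
        release_lock();       /* never hold the lock across the */
        request_cliprects();  /* server round trip              */
        get_lock();
        retries++;
    }
    return retries;
}
```

The key property is that the lock is never held across the server communication, so the loop can spin without starving other lock users.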





This is basically what I'm doing right now. The problem is as the code continues:



get lock
while (timestamp mismatch) {
   release lock
   request new cliprects and timestamp
   get lock
}
wait_for_device()
render_to_scale_buffer()
wait_for_device()
render_to_back_buffer()
wait_for_device()
blit_to_screen()
release_lock()

And, to avoid holding the lock while waiting for the device, since that blocks use of the decoder while I'm doing scaling operations, I'd like to

mark_scaling_device_busy()
get_drawable_lock()
get_lock()
while (timestamp mismatch) {
   release_lock()
   release_drawable_lock()
   request new cliprects and timestamp
   get_drawable_lock()
   get_lock()
}
release_lock()
wait_for_device()
get_lock()
render_to_scale_buffer()
release_lock()
wait_for_device()
get_lock()
render_to_back_buffer()
release_lock()
wait_for_device()
get_lock()
blit_to_screen()
release_lock()
mark_scaling_device_free()




And then release_drawable_lock()?

What semantics are you hoping for from the drawable lock in your scenario above? Just that the cliprects won't change while it is held?



Exactly on both points, except the drawable_lock would have to be released before mark_scaling_device_free() to avoid deadlocks.



So a few more questions:

1) Why (exactly) is keeping the cliprects from changing a concern? What happens if they change between steps above?

Not much, really. It all boils down to what to do if the per-drawable back buffer mismatches the drawable. In the simplest case one would simply skip the blit, which might even be better than attempting to match the old back buffer to a new drawable size. Problems really only occur if and when the drawable is resized. I've considered this, and since it's a simple solution that still works, though not perfectly, I'm still considering it.
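The "skip the blit on mismatch" check could look something like the sketch below. The struct and function names are hypothetical, purely to illustrate the size comparison:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical drawable / back-buffer extents -- not from any real
 * DRI data structure. */
struct extent {
    int width;
    int height;
};

/* The per-drawable back buffer is only blitted to the screen when its
 * size still matches the (possibly resized) drawable; otherwise the
 * blit is simply skipped until fresh contents are rendered. */
static bool back_buffer_matches(const struct extent *buf,
                                const struct extent *drawable)
{
    return buf->width == drawable->width &&
           buf->height == drawable->height;
}
```

A caller would test this just before blit_to_screen() and bail out early on a mismatch, accepting one dropped frame rather than a scaled or clipped stale one.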



2) Could the DDX driver blit the contents of these additional buffers (scale, back) at the same time it blits the frontbuffer so that the window change "just works"?


You mean the front blitting during window moves? In this case it doesn't really apply, since the per-drawable back buffer would still be valid. Resizing would be the only operation causing problems.

3) I don't think the drawable lock is a pretty thing; is it worth keeping it around for this? Would some black areas or incomplete video frames during window moves be so bad? Note that the next version of this hardware might have a proper command stream that just allows you to submit all those operations to hardware in a single go, and not have to do the waiting in the driver...

I can see your point, but on the other hand, if the command stream were that smart, it would in effect only implement a continuously held heavyweight lock, blocking all DMA submissions to the mpeg decoder and the 2D / 3D engine while the scaling engine is working, which is exactly what I'm trying to avoid.

The problem is really what to do when there are a lot of independent engines on a video chip, with a common command stream, numerous IRQ sources and one global hardware lock. I assume this will be more of a problem in the future. The solution using the drawable lock is not very clean. On the other hand, not being able to use the engines in parallel is not very efficient and is bad for interactivity.

I'm not sure what's the best design to solve this, but one idea would be having a futex-like lock and a "breadcrumb pool" for each engine, optionally also with an IRQ. This would be sufficient to

   * be able to independently submit DMA commands.
   * wait for engine idle independently on each engine, without ever
     needing to wait for DMA quiescence.
   * hold the global hardware lock only during operations that render
     directly to the front buffer, or to a common back-buffer. The
     global lock would then effectively be a "drawable" lock.
   * Keep backwards compatibility, as simple architectures may choose
     to retain only the global lock.
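The per-engine breadcrumb idea could be sketched roughly as below. This is only an illustration under assumed semantics: each engine writes back a monotonically increasing sequence number ("breadcrumb") after each batch, so idle can be checked per engine without draining the whole DMA stream. All names and the hw_retire() simulation are hypothetical:

```c
#include <assert.h>
#include <stdint.h>

#define NUM_ENGINES 3
enum engine { ENGINE_2D, ENGINE_3D, ENGINE_MPEG };

/* One breadcrumb pair per engine: what we've emitted vs. what the
 * hardware has reported as completed. */
struct engine_state {
    uint32_t emitted;   /* last breadcrumb emitted with a DMA batch */
    uint32_t completed; /* last breadcrumb the engine wrote back    */
};

static struct engine_state engines[NUM_ENGINES];

/* Submit a batch: bump and return the next breadcrumb for that
 * engine only; other engines are untouched. */
static uint32_t emit_breadcrumb(enum engine e)
{
    return ++engines[e].emitted;
}

/* Simulate the hardware retiring work up to a given breadcrumb. */
static void hw_retire(enum engine e, uint32_t seq)
{
    if (seq > engines[e].completed)
        engines[e].completed = seq;
}

/* Idle check is per engine: a busy scaling engine never forces a
 * wait on the mpeg decoder or the 2D / 3D engine. */
static int engine_idle(enum engine e)
{
    return engines[e].completed >= engines[e].emitted;
}
```

With a futex-like lock guarding only the shared command-stream submission path, a client would spin or sleep on its own engine's breadcrumb, which is exactly the "wait for engine idle without DMA quiescence" property from the list above.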

Hmm, maybe for now I'll stick to the simple solution. :)

But I think a design that works around the single-lock-and-command-stream, multiple-engines problem will be needed in the not too distant future.

/Thomas


Keith





_______________________________________________
Dri-devel mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/dri-devel
