Kristian Høgsberg wrote:
> Thomas,
>
> What you describe sounds reasonable and I'll look into updating the
> patch. I'm not too keen on the driver specific flag you suggest,
> since it makes it hard to use the ioctl in generic code. Maybe
> drivers that can do pageflip from one of several fifos can expose a
> driver specific ioctl to configure which one should be used. I don't
> imagine it's something that will change much?
>
Kristian,

Yes, that sounds OK with me. That last driver-specific part was not very
well thought through, I admit. There are probably other ways to do that,
as you suggest.
/Thomas

> cheers,
> Kristian
>
> 2009/8/30 Thomas Hellström <[email protected]>:
>> Thomas Hellström skrev:
>>>> I described this in more detail and hopefully more coherently in my
>>>> email to Michel. If that's still not clear, follow up there.
>>>>
>>> I've read the mail and understand the proposal, thanks.
>>>
>>> /Thomas
>>>
>> So, I've been doing some thinking over the weekend, and here's a
>> constructive proposal that will hopefully be the basis of an agreement:
>>
>> 1) We require the semantics of the pageflip ioctl to be such that it is
>> safe to schedule render commands that reference the buffers involved
>> immediately after the ioctl returns. In other words, the pageflip has
>> entered the graphics pipeline, and any render operations to the
>> referenced buffers are guaranteed to be executed after the pageflip. How
>> this is implemented is up to the driver, and thus the code will need
>> driver-specific hooks. There is a multitude of ways to implement this,
>> ranging from full hardware support to a really naive and stupid software
>> implementation that blocks all command submission while there are
>> pending flips. A simple, sufficient implementation is to scan the
>> buffers to be validated at cs time to see if there are pending pageflips
>> on any of them. If there are, release all held cs locks and block; when
>> the pageflip happens, continue. But again, that will be up to the driver.
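To make that last part a bit more concrete, the cs-time check I have in
mind is roughly the following. This is only a sketch; every name in it is
made up and not taken from any existing driver:

#include <linux/types.h>        /* bool */
#include <linux/wait.h>         /* wait_queue_head_t, wait_event() */

/* Stand-ins for whatever buffer and device structures the driver uses. */
struct fake_bo {
        bool flip_pending;      /* set by the flip ioctl, cleared by the flip irq */
};

struct fake_device {
        wait_queue_head_t flip_wq;  /* woken from the flip/vblank irq handler */
};

/* Stand-ins for whatever locking the driver's cs path already has. */
static void cs_lock_all(struct fake_device *dev);
static void cs_unlock_all(struct fake_device *dev);

/*
 * Called for each buffer object about to be validated as part of a
 * command submission, with the cs lock(s) held.
 */
static void cs_wait_for_pending_flip(struct fake_device *dev, struct fake_bo *bo)
{
        while (bo->flip_pending) {
                cs_unlock_all(dev);                          /* release every held cs lock */
                wait_event(dev->flip_wq, !bo->flip_pending); /* block until the flip has happened */
                cs_lock_all(dev);                            /* retake the locks and revalidate */
        }
}

The point is only that the blocking, if any, happens in the kernel and per
buffer, so neither the clients nor the X server ever need to know about it.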
>>
>> 2) We rip the blocking code out of the DRI2 protocol, since there is no
>> longer any need for it.
>>
>> 3) The DRM event mechanism stays as proposed. The ioctl caller needs to
>> explicitly request an event to get one. Events will initially be used by
>> clients to wake _themselves_ from voluntary select() blocking.
>>
>> The motivation for 1-2 is as follows:
>>
>> a) The solution fits all kinds of smart hardware and cs mechanisms,
>> whereas the DRI2 blocking solution assumes simple hardware and a naive
>> kernel cs mechanism. One can argue that smart kernel schedulers or
>> advanced hardware can work around the DRI2 blocking solution by sending
>> out the event immediately, but then we are again working around the DRI2
>> blocking solution. We shouldn't need to do that.
>>
>> b) There's no requirement on masters to do scheduling with this proposal.
>> Otherwise we'd have to live with that forever and implement it in all
>> masters utilizing the pageflip ioctl.
>>
>> c) Latency and performance. Consider the sequence of events between the
>> vsync and the first set of rendering commands for the next frame being
>> submitted to the hardware:
>>
>> c1) DRI2 blocking (please correct me if I misunderstood something here):
>> * vsync irq
>> * schedule a wq thread that adds an event and wakes the X server
>> * X server issues a syscall to read the drm event
>> * X server returns an event to the client with the new buffers (write to
>>   socket?)
>> * client reads the event (read from socket?)
>> * client prepares the first command buffer (this is usually quite
>>   time-consuming and in effect not only synchronizes GPU and command
>>   buffer building but serializes them)
>> * client builds and issues a cs ioctl
>> * kernel submits commands to hardware
>>
>> c2) Kernel scheduling (this proposal):
>> * vsync irq
>> * schedule the client thread, which immediately submits commands to
>>   hardware
>>
>> IMHO, c1 is far from optimal and should not be considered. One can argue,
>> once again, that this doesn't add much latency in practice, but we can't
>> keep arguing like that for every such item we add per frame, and in this
>> case the serialized command buffer building *does* add too much latency.
>> We should seek and implement the optimal solution if it doesn't imply too
>> much work or have side effects.
>>
>> Some added functionality we should perhaps also consider adding to the
>> ioctl interface:
>>
>> 1) A flag for whether to vsync or not. Ideally this is a driconf option,
>> so it should perhaps be communicated as part of dri2 swapbuffers as well.
>> I guess on Intel hardware you can only flip on vsync(?), but on some
>> other hardware you can just send a new scanout start address down the
>> FIFO. You'll definitely see tearing, though.
>>
>> 2) Driver private data. An example: drivers with multiple hardware FIFOs
>> that can do pageflipping and barriers on each FIFO might want to indicate
>> to the kernel on which FIFO or HW context to schedule the pageflip. I
>> guess this private data might also need to be passed along with dri2
>> swapbuffers.
>>
>> Thanks,
>> /Thomas
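
PS: To make the flag discussion in the quoted proposal a bit more concrete,
the kind of ioctl argument I picture is roughly the following. Again, only
a sketch; the struct, field and flag names are made up and nothing here is
final:

#include <stdint.h>

/* Sketch of a possible pageflip ioctl argument. */
struct fake_page_flip_arg {
        uint32_t crtc_id;       /* crtc to flip */
        uint32_t fb_id;         /* framebuffer to start scanning out */
        uint32_t flags;         /* FAKE_FLIP_* below */
        uint32_t pad;
        uint64_t user_data;     /* handed back in the completion event, if one was requested */
};

#define FAKE_FLIP_VSYNC (1 << 0)  /* flip on vblank; without it, flip asap and accept tearing */
#define FAKE_FLIP_EVENT (1 << 1)  /* caller explicitly requests a completion event, as in 3) above */

Whether the no-vsync case is possible at all is of course up to the driver
and the hardware.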
