Dave Airlie wrote:
>> At least for TTM this is part of a larger problem where you can hit
>> problems both when the pinned page quota is hit and when
>> you can't fit an object in the aperture.
>>
>> The other problem is the one you mention. Since we're dealing with
>> multiple clients, only evict one buffer at a time on aperture
>> space shortage, and may even have pinned buffers scattered in the
>> aperture, there is a probability that the execbuf call will fail with
>> -ENOMEM. I guess before doing that, the kernel could retry and evict all
>> evictable buffers before starting validation. That would eliminate all
>> fragmentation issues except those arising from pinned buffers.
>>
>
> IMHO with a complete kernel driver we can avoid fragmentation issues at
> a cost, i.e. if only the kernel can pin buffers (scanout/cursor etc.) we
> should be able to fence and move them by having special pinned-move
> handlers that would only be used in extreme situations. These handlers
> would know how to turn cursors off and even move the display base address;
> it may have to flicker the screen, but really anything is better than
> failing due to fragged memory.
>

I agree. It's probably possible to come up with a clever scheme for this, and even to update the display base address during vblank. User space doesn't need to care, since the virtual address stays the same, but in the end we need to do something about locking in the fb layer: we need to be able to modify the kernel virtual address and the GPU offset while fb is running.
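
Just to make the idea concrete, here is a very rough sketch of what such a pinned-move handler could look like. Every name below (move_pinned_bo, disable_cursor, write_display_base and friends) is made up for illustration and is not an existing TTM or DRM interface:

/*
 * Rough sketch only -- all names here are placeholders, not existing
 * TTM/DRM interfaces.
 */

struct pinned_bo {
	unsigned long gpu_offset;   /* current offset in the aperture */
	void *kmap;                 /* kernel virtual mapping */
	int is_scanout;
	int is_cursor;
};

/* Hypothetical driver hooks the handler would be wired up to. */
extern void disable_cursor(void);
extern void enable_cursor(unsigned long gpu_offset);
extern void wait_for_vblank(void);
extern void write_display_base(unsigned long gpu_offset);
extern void copy_bo_contents(struct pinned_bo *bo, unsigned long new_offset);
extern void *remap_bo(struct pinned_bo *bo, unsigned long new_offset);

/*
 * Last-resort move of a pinned buffer when validation fails because the
 * aperture is too fragmented: quiesce the display consumers, copy the
 * contents, then repoint the scanout / cursor registers, preferably
 * inside the vblank interval so the user sees at most a one-frame
 * glitch.  User space is unaffected since its virtual address of the
 * buffer doesn't change.
 */
static int move_pinned_bo(struct pinned_bo *bo, unsigned long new_offset)
{
	if (bo->is_cursor)
		disable_cursor();

	copy_bo_contents(bo, new_offset);

	if (bo->is_scanout) {
		wait_for_vblank();
		write_display_base(new_offset);
	}

	/*
	 * The kernel virtual mapping must follow the buffer, which is
	 * exactly where the fb-layer locking problem comes in.
	 */
	bo->kmap = remap_bo(bo, new_offset);
	bo->gpu_offset = new_offset;

	if (bo->is_cursor)
		enable_cursor(new_offset);

	return 0;
}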

>> The problem remains how to avoid this situation completely. I guess the
>> drm driver can reserve a global "safe" aperture size and communicate
>> that to the 3D client, but the current TTM drivers don't deal with this
>> situation.
>> My first idea would probably be your first alternative: flush and re-do
>> the state emit if the combined buffer size is larger than the "safe"
>> aperture size.
>>
>
> I think a dynamically sized safe aperture size that can be used per batch
> submission is probably the best plan. This might also allow throttling in
> multi-app situations to help avoid thrashing, by reducing the per-app
> limits. For cards with per-process apertures we could make it the size of
> the per-process aperture.
>

Actually, thrashing TT memory shouldn't be that horribly bad, as there is generally no caching-attribute flipping going on, but it will temporarily stall the driver from working ahead with a new batch and thus drain the pipeline. Thrashing will go on anyway in the multi-app situation, since the driver needs to throttle due to aperture space shortage, but it will be more driver-induced and perhaps a bit more efficient.

> The case where an app manages to submit a working set for a single
> operation that is larger than the GPU can deal with should be considered
> a bug in the driver, I suppose.
>

Yes, I agree, but we must make sure the kernel can _really_ honor the advertised working-set size, because otherwise it's an OOM situation we can't recover from other than by perhaps skipping a frame. This is increasingly important with binning hardware that likes to submit a whole scene in a single batch, but OTOH such hardware usually has a very large aperture / GPU virtual space.

> Dave.

/Thomas
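
PS. Regarding the "flush and re-do the state emit if the combined buffer size is larger than the safe aperture size" alternative, this is roughly the check I picture in the user-space driver. Again only a sketch; safe_aperture_size, flush_batch() and re_emit_state() are invented names standing in for whatever the driver actually does:

/*
 * Sketch only -- safe_aperture_size would be advertised by the kernel;
 * none of these names exist today.
 */

struct batch {
	unsigned long referenced_size;  /* total size of buffers in the reloc list */
};

extern unsigned long safe_aperture_size;   /* queried from the kernel */
extern void flush_batch(struct batch *batch);
extern void re_emit_state(struct batch *batch);

/*
 * Called before adding a buffer of 'size' bytes to the batch.  If the
 * accumulated working set would no longer fit in the space the kernel
 * guarantees it can validate, submit what we have and start over.
 */
static void check_aperture_space(struct batch *batch, unsigned long size)
{
	if (batch->referenced_size + size > safe_aperture_size) {
		flush_batch(batch);
		re_emit_state(batch);
		batch->referenced_size = 0;
	}
	batch->referenced_size += size;
}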
