On Thu, 21 Aug 2025 16:02:09 +0100 Steven Price <[email protected]> wrote:
> On 21/08/2025 12:51, Boris Brezillon wrote:
> > On Wed, 16 Jul 2025 16:43:24 +0100
> > Steven Price <[email protected]> wrote:
> [...]
> >> Although in general I'm a bit wary of relying on the whole lock region
> >> feature - previous GPUs have an errata. But maybe I'm being over
> >> cautious there.
> >
> > We're heavily relying on it already to allow updates of the VM while
> > the GPU is executing stuff. If that's problematic on v10+, I'd rather
> > know early :D.
>
> I think I'm just scarred by my experiences over a decade ago... ;)
>
> I'm not aware of any issues with the modern[1] GPUs. The issue used to
> be that the lock region could get accidentally unlocked by a cache flush
> from another source - specifically the cache flush on job start flag.
>
> It's also not a major issue if you keep the page tables consistent, the
> lock region in theory allows a region to be in an inconsistent state -
> but generally there's no need for that. AFAIK we mostly keep the tables
> consistent anyway.

Right, it's not a problem until we introduce sparse binding support, at
which point atomicity becomes important, and given remapping is not a
thing the io-pagetable layer provides (remap has to be unmap+map), I
need to rely on region locking to make it work, or we'll have to eat
the fault-but-not-really-because-its-being-remapped
overhead/complexity. Honestly, I'd rather rely on region locking if
it's working, because it's far simpler ;-).
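
To make the window explicit, here is a toy userspace model of the
unmap+map sequence. This is only a sketch and not driver code: every
name in it (toy_vm, toy_remap, ...) is made up for illustration, and it
assumes that an access hitting a locked region stalls until the unlock
instead of faulting.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define NPAGES     16
#define INVALID_PA UINT64_MAX

enum access_result { ACCESS_OK, ACCESS_FAULT, ACCESS_STALLED };

/* Hypothetical single-level "page table" plus one lock region. */
struct toy_vm {
	uint64_t pte[NPAGES];	/* VA page index -> PA page, or INVALID_PA */
	bool locked;
	size_t lock_first, lock_count;
};

void toy_vm_init(struct toy_vm *vm)
{
	for (size_t i = 0; i < NPAGES; i++)
		vm->pte[i] = INVALID_PA;
	vm->locked = false;
	vm->lock_first = vm->lock_count = 0;
}

void toy_map(struct toy_vm *vm, size_t first, size_t count, uint64_t pa)
{
	assert(first + count <= NPAGES);
	for (size_t i = 0; i < count; i++)
		vm->pte[first + i] = pa + i;
}

void toy_unmap(struct toy_vm *vm, size_t first, size_t count)
{
	assert(first + count <= NPAGES);
	for (size_t i = 0; i < count; i++)
		vm->pte[first + i] = INVALID_PA;
}

void toy_lock_region(struct toy_vm *vm, size_t first, size_t count)
{
	assert(!vm->locked && first + count <= NPAGES);
	vm->locked = true;
	vm->lock_first = first;
	vm->lock_count = count;
}

void toy_unlock_region(struct toy_vm *vm)
{
	vm->locked = false;
}

/* What a GPU access to 'page' would observe right now (assumed model). */
enum access_result toy_gpu_access(const struct toy_vm *vm, size_t page,
				  uint64_t *pa)
{
	assert(page < NPAGES);
	if (vm->locked && page >= vm->lock_first &&
	    page < vm->lock_first + vm->lock_count)
		return ACCESS_STALLED;	/* waits for the unlock */
	if (vm->pte[page] == INVALID_PA)
		return ACCESS_FAULT;
	*pa = vm->pte[page];
	return ACCESS_OK;
}

/*
 * Remap implemented as unmap+map. Returns what an access to the first
 * page observes in the window between the two steps.
 */
enum access_result toy_remap(struct toy_vm *vm, size_t first, size_t count,
			     uint64_t new_pa, bool use_lock)
{
	enum access_result mid;
	uint64_t pa;

	if (use_lock)
		toy_lock_region(vm, first, count);
	toy_unmap(vm, first, count);
	mid = toy_gpu_access(vm, first, &pa);	/* concurrent GPU access */
	toy_map(vm, first, count, new_pa);
	if (use_lock)
		toy_unlock_region(vm);
	return mid;
}
```

In this model, toy_remap() with use_lock=false reports ACCESS_FAULT for
the in-between access, which is the spurious fault we'd otherwise have
to recognise and retry. With use_lock=true the same access reports
ACCESS_STALLED and only ever sees the new mapping once the region is
unlocked.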
