On 11/7/19 11:47 AM, Jan Beulich wrote: > On 07.11.2019 12:35, George Dunlap wrote: >> On 11/6/19 3:19 PM, Jan Beulich wrote: >>> In order for individual IOMMU drivers (and from an abstract pov also >>> architectures) to be able to adjust their data structures ahead of time >>> when they might cover only a sub-range of all possible GFNs, introduce >>> a notification call used by various code paths potentially installing a >>> fresh mapping of a never used GFN (for a particular domain). >> >> So trying to reverse engineer what's going on here, you mean to say >> something like this: >> >> --- >> Individual IOMMU drivers contain adjuct data structures for gfn ranges >> contained in the main p2m. For efficiency, these adjuct data structures >> often cover only a subset of the gfn range. Installing a fresh mapping >> of a never-used gfn may require these ranges to be expanded. Doing this >> when the p2m entry is first updated may be problematic because <reasons>. >> >> To fix this, implement notify_gfn(), to be called when Xen first becomes >> aware that a potentially new gfn may be about to be used. This will >> notify the IOMMU driver about the new gfn, allowing it to expand the >> data structures. It may return -ERESTART (?) for long-running >> operations, in which case the operation should be restarted or a >> different error if the expansion of the data structure is not possible. >> In the latter case, the entire operation should fail. >> --- >> >> Is that about right? > > With the exception of the -ERESTART / long running operations aspect, > yes. Plus assuming you mean "adjunct" (not "adjuct", which my > dictionary doesn't know about). > >> Note I've had to make a lot of guesses here about >> the functionality and intent. > > Well, even after seeing your longer description, I don't see what mine > doesn't say
* "Ahead of time" -- ahead of what? * Why do things need to be done ahead of time, rather than at the time (for whatever it is)? (I couldn't even really guess at this, which is why I put "<reasons>".) * To me "notify" doesn't in any way imply that the operation can fail. Most modern notifications are FYI only, with no opportunity to prevent the thing from happening. (That's not to say that notify is an inappropriate name -- just that by itself it doesn't imply the ability to cancel, which seems like a major factor to understanding the intent of the patch.) >>> Note that in gnttab_transfer() the notification and lock re-acquire >>> handling is best effort only (the guest may not be able to make use of >>> the new page in case of failure, but that's in line with the lack of a >>> return value check of guest_physmap_add_page() itself). >> >> Is there a reason we can't just return an error to the caller? > > Rolling back what has been done by that point would seem rather > difficult, which I guess is the reason why the code was written the > way it is (prior to my change). The phrasing made me think that you were changing it to be best-effort, rather than following suit with existing functionality. Maybe: "Note that before this patch, in gnttab_transfer(), once <condition> happens, further errors modifying the physmap are ignored (presumably because it would be too complicated to try to roll back at that point). This patch follows suit by ignoring failed notify_gfn()s, simply printing out a warning that the gfn may not be accessible due to the failure." -George _______________________________________________ Xen-devel mailing list [email protected] https://lists.xenproject.org/mailman/listinfo/xen-devel
