On Fri, May 22, 2020 at 11:14:24AM +0100, Igor Druzhinin wrote:
> On 22/05/2020 11:08, Roger Pau Monné wrote:
> > On Thu, May 21, 2020 at 10:43:58PM +0100, Igor Druzhinin wrote:
> >> If a recalculation NPT fault hasn't been handled explicitly in
> >> hvm_hap_nested_page_fault() then it's potentially safe to retry -
> >> US bit has been re-instated in PTE and any real fault would be correctly
> >> re-raised next time.
> >>
> >> This covers a specific case of migration with vGPU assigned on AMD:
> >> global log-dirty is enabled and causes immediate recalculation NPT
> >> fault in MMIO area upon access. This type of fault isn't described
> >> explicitly in hvm_hap_nested_page_fault (this isn't called on
> >> EPT misconfig exit on Intel) which results in domain crash.
> > 
> > Couldn't direct MMIO regions be handled like other types of memory for
> > the purposes of logdiry mode?
> > 
> > I assume there's already a path here used for other memory types when
> > logdirty is turned on, and hence would seem better to just make direct
> > MMIO regions also use that path?
> 
> The proble of handling only MMIO case is that the issue still stays.
> It will be hit with some other memory type since it's not MMIO specific.
> The issue is that if global recalculation is called, the next hit to
> this type will cause a transient fault which will not be handled
> correctly after a due fixup by neither of our handlers.

I admit I should go look at the code, but for example RAM p2m types
don't require this fix, so I assume there's some different path taken
in that case that avoids all this?

Ie: when global logdirty is enabled you will start to get nested page
faults for every access, yet only direct MMIO types require this fix?

Thanks, Roger

Reply via email to