On Fri, May 22, 2020 at 11:27:38AM +0100, Igor Druzhinin wrote:
> On 22/05/2020 11:23, Roger Pau Monné wrote:
> > On Fri, May 22, 2020 at 11:14:24AM +0100, Igor Druzhinin wrote:
> >> On 22/05/2020 11:08, Roger Pau Monné wrote:
> >>> On Thu, May 21, 2020 at 10:43:58PM +0100, Igor Druzhinin wrote:
> >>>> If a recalculation NPT fault hasn't been handled explicitly in
> >>>> hvm_hap_nested_page_fault() then it's potentially safe to retry -
> >>>> US bit has been re-instated in PTE and any real fault would be correctly
> >>>> re-raised next time.
> >>>>
> >>>> This covers a specific case of migration with vGPU assigned on AMD:
> >>>> global log-dirty is enabled and causes immediate recalculation NPT
> >>>> fault in MMIO area upon access. This type of fault isn't described
> >>>> explicitly in hvm_hap_nested_page_fault (this isn't called on
> >>>> EPT misconfig exit on Intel) which results in domain crash.
> >>>
> >>> Couldn't direct MMIO regions be handled like other types of memory for
> >>> the purposes of logdirty mode?
> >>>
> >>> I assume there's already a path here used for other memory types when
> >>> logdirty is turned on, and hence would seem better to just make direct
> >>> MMIO regions also use that path?
> >>
> >> The problem with handling only the MMIO case is that the issue still stays.
> >> It will be hit with some other memory type since it's not MMIO specific.
> >> The issue is that if global recalculation is called, the next hit to
> >> this type will cause a transient fault which, after the due fixup, will
> >> not be handled correctly by either of our handlers.
> > 
> > I admit I should go look at the code, but for example RAM p2m types
> > don't require this fix, so I assume there's some different path taken
> > in that case that avoids all this?
> > 
> > Ie: when global logdirty is enabled you will start to get nested page
> > faults for every access, yet only direct MMIO types require this fix?
> 
> It's not "only MMIO" - it's just that the MMIO area is hit in my particular case.
> I'd prefer this fix to address the general issue otherwise for SVM
> we would have to write handlers in hvm_hap_nested_page_fault() for
> every case as soon as we hit it.

Hm, I'm not sure I agree. p2m memory types are limited, and IMO we
want to have strict control over how they are handled.
hvm_hap_nested_page_fault is already full of special casing for each
memory type for that reason.
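
To make the two positions concrete, here is a rough sketch. It is not
the Xen code: the sk_* types and functions are hypothetical stand-ins,
and the real hvm_hap_nested_page_fault() is far more involved. The
first function models strict per-type handling, the second the proposed
"retry if a recalculation was just fixed up" fallback.

```c
#include <stdbool.h>

/* Hypothetical, simplified stand-ins; these are NOT Xen's p2m types. */
enum sk_type { SK_RAM_LOGDIRTY, SK_MMIO_DIRECT, SK_OTHER };

enum sk_outcome { SK_HANDLED, SK_RETRY, SK_CRASH };

/*
 * Strict policy: only faults on explicitly listed types are resolved;
 * everything else is treated as fatal for the domain.
 */
enum sk_outcome sk_fault_strict(enum sk_type type)
{
    switch (type) {
    case SK_RAM_LOGDIRTY:
        return SK_HANDLED;   /* e.g. mark the page dirty and continue */
    default:
        return SK_CRASH;     /* no explicit handler for this type */
    }
}

/*
 * Fallback policy: as above, but if a pending recalculation was just
 * fixed up (recalc_fixed_up), retry the access instead of crashing.
 * A genuine fault would then be raised again on the retry.
 */
enum sk_outcome sk_fault_with_retry(enum sk_type type, bool recalc_fixed_up)
{
    enum sk_outcome res = sk_fault_strict(type);

    if (res == SK_CRASH && recalc_fixed_up)
        return SK_RETRY;

    return res;
}
```

In this model a direct MMIO fault crashes under the strict policy unless
a case is added for it, while the fallback turns it into a retry for any
type, listed or not. That trade-off is what is being discussed here.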

That being said, I also don't like the fact that logdirty is handled
differently between EPT and NPT, as on EPT it's handled as a
misconfig while on NPT it's handled as a violation.

Thanks, Roger.
