On 28/11/2018 00:05, Andrew Cooper wrote:
> On 27/11/2018 19:40, Julien Grall wrote:
>> (+ Stefano)
>>
>> On 11/27/18 5:12 PM, Volodymyr Babchuk wrote:
>>> Hello community,
>> Hi Volodymyr,
>>
>>> After creating domU, I'm seeing lots of these messages from the hypervisor:
>>>
>>> (XEN) p2m.c:1442: d1v0: gvirt_to_maddr failed va=0xffff80000efc7f0f
>>> flags=0x1 par=0x809
>>> (XEN) p2m.c:1442: d1v0: gvirt_to_maddr failed va=0xffff80000efc7f00
>>> flags=0x1 par=0x809
>>> (XEN) p2m.c:1442: d1v0: gvirt_to_maddr failed va=0xffff80000efc7f0f
>>> flags=0x1 par=0x809
>>>
>>> Interestingly, I'm getting them from both Dom0 and DomU:
>>>
>>> (XEN) p2m.c:1442: d0v0: gvirt_to_maddr failed va=0xffff80003efd7f0f
>>> flags=0x1 par=0x809
>>> (XEN) p2m.c:1442: d1v0: gvirt_to_maddr failed va=0xffff80000efc7f0f
>>> flags=0x1 par=0x809
>>>
>>> But only after DomU is created.
>>>
>>> I attached GDB and found that this is caused by update_runstate_area:
>>>
>>> (gdb) bt
>>> #0  get_page_from_gva (v=0x80005dbe2000, v@entry=0x22f2c8
>>> <schedule+1236>,
>>>      va=va@entry=18446603337277996815, flags=flags@entry=1) at
>>> p2m.c:1440
>>> #1  0x000000000024e320 in translate_get_page (write=true, linear=true,
>>> addr=18446603337277996815,
>>>      info=...) at guestcopy.c:37
>>> #2  copy_guest (buf=buf@entry=0x80005dbe20d7,
>>> addr=addr@entry=18446603337277996815, len=len@entry=1,
>>>      info=..., flags=flags@entry=6) at guestcopy.c:69
>>> #3  0x000000000024e45c in raw_copy_to_guest
>>> (to=to@entry=0xffff80003efd7f0f,
>>>      from=from@entry=0x80005dbe20d7, len=len@entry=1) at guestcopy.c:110
>>> #4  0x00000000002497b4 in update_runstate_area
>>> (v=v@entry=0x80005dbe2000) at domain.c:287
>>> #5  0x0000000000249eb8 in context_switch
>>> (prev=prev@entry=0x80005dbe2000,
>>>      next=next@entry=0x80005bf3c000) at domain.c:344
>>> #6  0x000000000022f2c8 in schedule () at schedule.c:1583
>>> #7  0x0000000000232c10 in __do_softirq
>>> (ignore_mask=ignore_mask@entry=0) at softirq.c:50
>>> #8  0x0000000000232ca4 in do_softirq () at softirq.c:64
>>> #9  0x0000000000258254 in leave_hypervisor_tail () at traps.c:2302
>>>
>>> This issue is encountered on QEMU-ARMv8. Dom0 kernel is Linux 4.19.0
>>> My XEN master is at d8ffac1f7 "xen/arm: gic: Remove duplicated comment
>>> in do_sgi"
>>>
>>> The same setup worked perfectly with Xen 4.10.2
>> The message is only printed in debug build. Do you have CONFIG_DEBUG
>> enabled?
>>
>>> Julien, I saw on mailing list, that you paid attention to issues with
>>> gvirt_to_maddr,
>>> so maybe you would be interested in this.
>> Which thread are you speaking about? The problem is not because of
>> gvirt_to_maddr but of how update_runstate_area is working at the moment.
>>
>> update_runstate_area is using a guest virtual address to update the
>> vCPU runstate. It blindly assumes the vCPU runstate will always be
>> mapped in stage-1 page-tables. However, if KPTI (Kernel Page Table
>> Isolation) is enabled the kernel address space (and therefore the vCPU
>> runstate) will not be mapped when running at EL0.
>>
>> So if you are restoring a vCPU that was executing code at EL0 then
>> update_runstate_area will fail as the address is not mapped. There are
>> a few solutions suggested on the ML (see [1]). However I haven't had
>> time to look properly at how to implement them.
>>
>> KPTI is getting used more widely (e.g. Meltdown and KASLR). So it would
>> be good if we try to solve this problem sooner. I would be happy to
>> review patches and/or provide advice if you want to tackle the problem.
>>
>> Cheers,
>>
>> [1] https://lists.xen.org/archives/html/xen-devel/2018-03/msg00223.html
>>
> update_runstate_area() using a virtual address is a complete misfeature,
> and the sooner we can replace it, the better.  Its history is with x86
> PV guests, where the early ABIs were designed in terms of Linux's
> copy_{to,from}_user().
>
> It is similarly broken in x86 with meltdown mitigations, as well as SMAP
> considerations (PAN in ARM, iirc).
>
> We've got two options.  Invent a new API which takes a gfn/gaddr, or
> retrofit the API to be "you pass a virtual address, we translate to
> gfn/gaddr, then update that".  Perhaps both.
>
> When this was last discussed, I think the "one-time translation to
> gfn/gaddr" was good enough compatibility to cope with existing guests,
> but that we should have a cleaner way for modern guests.

Or alternatively, see if we can actually get away without it.  A lot of
the early Xen paravirtual functionality can probably be done without, or
designed in a better way entirely.
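
To make the "translate once to gfn/gaddr, then update that" option quoted
above concrete, here is a minimal sketch in plain C. It is a toy model, not
Xen code: every name in it (toy_stage1, toy_vcpu, toy_register_runstate,
toy_update_runstate, toy_demo) is hypothetical, guest memory is a flat array,
and the stage-1 "page table" is a single entry whose mapped flag stands in
for KPTI dropping the kernel mappings at EL0. It also assumes the runstate
area does not cross a page boundary.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define PAGE_SHIFT 12
#define PAGE_SIZE  (1u << PAGE_SHIFT)
#define NR_PAGES   4

/* Toy guest RAM, indexed by guest physical address. */
static uint8_t guest_ram[NR_PAGES * PAGE_SIZE];

/* Toy stage-1 mapping: one virtual page -> one guest frame.
 * 'mapped' models whether the kernel mapping is currently present. */
struct toy_stage1 {
    uint64_t va_page;
    uint64_t gfn;
    bool mapped;
};

/* Hypothetical per-vCPU state: the address is stored as a gaddr, not a VA. */
struct toy_vcpu {
    bool registered;
    uint64_t runstate_gaddr;
};

/* Walk the toy stage-1 table; fails if the VA is not currently mapped. */
static bool toy_va_to_gaddr(const struct toy_stage1 *s1, uint64_t va,
                            uint64_t *gaddr)
{
    if (!s1->mapped || (va >> PAGE_SHIFT) != s1->va_page)
        return false;
    *gaddr = (s1->gfn << PAGE_SHIFT) | (va & (PAGE_SIZE - 1));
    return true;
}

/* Registration: translate the guest VA exactly once and remember the gaddr. */
static bool toy_register_runstate(struct toy_vcpu *v,
                                  const struct toy_stage1 *s1, uint64_t va)
{
    uint64_t gaddr;

    if (!toy_va_to_gaddr(s1, va, &gaddr))
        return false;
    if (gaddr + sizeof(uint64_t) > sizeof(guest_ram))
        return false;
    v->runstate_gaddr = gaddr;
    v->registered = true;
    return true;
}

/* Update on context switch: no stage-1 walk, so guest mappings don't matter. */
static bool toy_update_runstate(const struct toy_vcpu *v, uint64_t state)
{
    if (!v->registered)
        return false;
    memcpy(&guest_ram[v->runstate_gaddr], &state, sizeof(state));
    return true;
}

/* Exercise the flow; returns 0 if the update survives the unmapping. */
int toy_demo(void)
{
    const uint64_t va = 0xffff80000efc7f00ULL; /* VA taken from the log above */
    struct toy_stage1 s1 = { .va_page = va >> PAGE_SHIFT, .gfn = 2,
                             .mapped = true };
    struct toy_vcpu v = { 0 };
    uint64_t tmp, readback = 0;

    if (!toy_register_runstate(&v, &s1, va))
        return 1;
    s1.mapped = false;                   /* kernel mappings dropped at EL0 */
    if (toy_va_to_gaddr(&s1, va, &tmp))  /* a per-update walk now fails */
        return 2;
    if (!toy_update_runstate(&v, 42))    /* the cached gaddr still works */
        return 3;
    memcpy(&readback, &guest_ram[v.runstate_gaddr], sizeof(readback));
    return readback == 42 ? 0 : 4;
}
```

Calling toy_demo() from a main() shows the point of the design: the only
stage-1 walk happens at registration time, when the guest kernel is running
and its mappings are present, so later updates do not depend on what is
mapped when the vCPU is switched in or out.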

~Andrew

_______________________________________________
Xen-devel mailing list
[email protected]
https://lists.xenproject.org/mailman/listinfo/xen-devel
