Hi All,
PFA the notes (credit - Samuel) as below.
Kind regards,
Ayan
Problem :- Bugs in HW speculative execution on modern processors can leak
info from one guest to another, or from the hyp to a malicious guest. Some
mitigations have been taken in Xen, but they have a performance penalty.
There is also core scheduling in Xen. We run Xen on VM exit. Xen tries to
map minimum information in page tables. If at some point Xen needs to access
sensitive info, it switches to a new set of page tables which maps all
memory. It should use aggressive techniques for security.
The approach is inspired by the Linux kernel.
Paths used from guests are a limited set of what the hypervisor can
access. Minimal page tables for guests, with fewer mitigations needed
thanks to this (why?).
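
A minimal sketch of the idea above, as a toy Python model and not Xen code: the hypervisor normally runs on a restricted map and switches to the full map only when it touches a page outside it. The names ToyHypervisor, read and return_to_guest are hypothetical, and pages are modelled as plain integers.

```python
# Toy model only: real page tables are hardware-walked structures, not Python sets.
class ToyHypervisor:
    def __init__(self, all_pages, restricted_pages):
        self.full_map = frozenset(all_pages)
        # The restricted map can only contain pages that exist in the full map.
        self.restricted_map = frozenset(restricted_pages) & self.full_map
        self.active = self.restricted_map
        self.switches = 0  # counts switches to the full map (the costly step)

    def read(self, page):
        """Access a page, switching to the full map only when it is needed."""
        if page not in self.full_map:
            raise KeyError(f"page {page} does not exist")
        if page not in self.active:
            self.active = self.full_map
            self.switches += 1
        return page

    def return_to_guest(self):
        """Go back to the restricted map before resuming the guest."""
        self.active = self.restricted_map
```

In this model the switch counter stands in for the performance cost that the notes say still has to be measured.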
Bertrand - Introduced kernel page table isolation (KPTI), which is already in Xen.
Jurgen - AWS has submitted some patches to Xen in this regard. They
apply to x86 and may apply to Arm. Not sure of the current state of the
series.
Andrei - Is it related to KVM ? They published a set of patches.
Jurgen - Kernel page table isolation is different to Xen.
Andrei - Split the info into globally confidential and not confidential.
There are some hooks triggered by this transition. KVM patches may be
adapted by Xen. Don’t know the Amazon patches.
Jurgen - Who was driving this at Amazon ?
Bertrand - Mapping the whole memory in Xen was a concern at Arm. The KVM
security patches should be interesting to look at, but the attack surface
differs between Xen and KVM. We should not repeat what is done by KVM.
Jurgen - Need to show the mitigation is better than the current approach.
Mapping all the memory might be interesting. But sometimes we do not
have enough VA space.
Roger - Posted a link from Microsoft Research explaining the work that has
been done: “Rethinking isolation in the age of speculation”. With KVM
and Microsoft’s work, it may be interesting.
Bertrand - Should find a way not to map everything in Xen. No one
opposes that in the Xen community.
Andrei - First check the ongoing work in the Xen community.
Bertrand - Should do the patches with x86 and Arm in mind.
Jurgen - Agreed. We should do it for both architectures. Hypervisor
resources needed for a guest should be kept in mind. Per guest mapping
should help here.
Bertrand - If we know what memory is mapped for each guest, it should help.
Andrei - Is it an Arm feature ?
Bertrand - Most of the data is stored in the guest. A guest should not
starve Xen so other guests are not affected. All the data pertaining to
a guest should reside in the guest (not in Xen).
Andrei - Permissive mode should help ?
Bertrand - Per guest heap should make security easier. If we have guest
heap mapped and Xen heap mapped, that should be preliminary work.
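
One way to picture the per guest heap point is a fixed budget per guest, so that one guest running out of memory does not take memory away from Xen or from other guests. This is only a toy Python sketch under that assumption; GuestHeap, its page limit and alloc are hypothetical names and do not correspond to existing Xen interfaces.

```python
# Toy model only: each guest gets its own page budget, separate from any shared heap.
class GuestHeap:
    def __init__(self, limit_pages):
        self.limit = limit_pages  # hypothetical per guest budget, in pages
        self.used = 0

    def alloc(self, pages):
        """Take pages from this guest's budget; refuse instead of starving others."""
        if pages <= 0:
            raise ValueError("pages must be positive")
        if self.used + pages > self.limit:
            return False
        self.used += pages
        return True
```

A refused allocation only affects the guest that owns the heap, which is the property Bertrand describes.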
Jurgen - For syscalls in the Linux kernel, map little at the beginning
and map things in the page table on demand. It is still in the idea phase,
don’t know the performance impact. This was discussed in 2018 at an Intel summit.
Andrei - We should gather these techniques in a series of patches.
Bertrand - On Arm, mapping on demand will generate an exception
and pass control to Xen.
Jurgen - This should be common across all architectures. The exception
should be handled in virtual mode.
The idea is to map all the code initially, which will reduce
complexity. Need syscall handler spec and page table.
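
The map on demand idea discussed above can be sketched as a toy Python model, assuming code pages are mapped up front and any other page is mapped by a fault handler on first access. OnDemandSpace, PageFault and the backing dictionary are hypothetical names for illustration, not Xen or Linux code.

```python
# Toy model only: a set stands in for the page table, a dict for physical memory.
class PageFault(Exception):
    def __init__(self, page):
        super().__init__(f"page {page} is not mapped")
        self.page = page


class OnDemandSpace:
    def __init__(self, code_pages, backing):
        self.backing = dict(backing)      # page number -> contents
        self.code = frozenset(code_pages)
        self.mapped = set(self.code)      # code is mapped initially
        self.faults = 0                   # each fault models one exception

    def _access(self, page):
        if page not in self.mapped:
            raise PageFault(page)
        return self.backing[page]

    def read(self, page):
        """Read a page; on a fault, map it and retry the access once."""
        try:
            return self._access(page)
        except PageFault as fault:
            self.faults += 1
            self.mapped.add(fault.page)
            return self._access(page)

    def exit_to_guest(self):
        """Drop every on-demand mapping, keeping only the code pages."""
        self.mapped = set(self.code)
```

The fault counter makes the cost visible: every first access after exit_to_guest pays for one exception, which is where the flushing concerns raised below would apply.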
Bertrand - You will generate an exception, which can then lead to
Spectre / Meltdown issues.
Jurgen - Need to consider flushing buffers on the way in and out of the
exception. Mapping buffers will add a penalty.
Bertrand - Spectre / Meltdown mitigation made the system slower.
Jurgen - On x86, a simple solution is to run without cache.
Bertrand - There are some hooks in ATF. The hyp may call the firmware.
Some CPUs have specific ways to flush the cache, like turning the MMU off
and on, and it impacts performance. In some CPUs, the mitigation is done in
hardware. On Arm, the mitigation techniques differ from CPU to CPU.
Mitigation (mapping + page exception) will make Xen complex. For example, we
need to flush the TLB when going to the guest.
Jurgen - On x86, when one core enters hyp mode, the other core should also
enter. Needs IPIs.
Bertrand - Need to understand how the mitigation applies to Xen. Our attack
surface is smaller. I don’t know if the mitigation techniques solve anything.
Jurgen - Mitigation is a nice conceptual project, but not practical.
Bertrand - This should be a research project at a university. No immediate
requirement.
Andrei - What happens on KVM ? Does mitigation lead to better performance ?
Bertrand - Security issues are not discussed widely in Arm
Jurgen - Should be able to find a KVM engineer to give us performance data.
Bertrand - Sometimes the performance impact may be case specific.
Jurgen - We do thorough performance measurement. In one patch series,
our performance engineers have analysed it thoroughly.
Olivier - We have to continue to investigate in the community. We should
create an epic for this.
Bertrand - Epic 1 - Reduce the system memory mapped in Xen. All agreed
(Jurgen - the patch series by Amazon is floating). We need to prioritise
this and do it across Arm and x86.
Epic 2 - Per guest resource mapping. Understand performance impact.
This concept applies when core scheduling is applicable. Isolate a guest
on a specific core, then some issues may decrease.
Jurgen - Core scheduling depends on cache hierarchy. It is configurable
in theory. Has a performance impact.
Bertrand - On Arm, the impact may be CPU pipeline dependent. There is no
easy answer. Sometimes some cores are not affected.
There is a NUMA series on Arm sent by Wei. Question to Wei - Is the core
topology (which cores are on the same socket/platform) available in Xen now,
or only after the NUMA series ?
Wei - With NUMA support, we have a way to determine which core belongs to
which resource. We cannot distinguish hyperthreads, i.e. we can’t distinguish
between logical / physical cores.
Bertrand - There is only one core with hyperthreading available (A65). We
should check this.
Jurgen - Let’s see how the ongoing project gets done. We need to invest
some work to measure the upsides and downsides. The approach looks
promising but needs to be tested. It's a very nice project but this
*might* be a waste of time - and we don't know in advance.
Bertrand - It is quite some work to be done cleanly.