16:14, 16 September 2021, "Roger Pau Monné" <[email protected]>:

On Thu, Sep 16, 2021 at 02:30:39PM +0200, Jan Beulich wrote:

 On 16.09.2021 13:10, Dmitry Isaikin wrote:
 > From: Dmitry Isaykin <[email protected]>
 >
 > This significantly speeds up concurrent destruction of multiple domains on x86.
 
 This effectively is a simplistic revert of 228ab9992ffb ("domctl:
 improve locking during domain destruction"). There it was found to
 actually improve things; now you're claiming the opposite. It'll
 take more justification, clearly identifying that you actually
 revert an earlier change, and an explanation why then you don't
 revert that change altogether. You will want to specifically also
 consider the cleaning up of huge VMs, where use of the (global)
 domctl lock may hamper progress of other (parallel) operations on
 the system.
 
 > I identified the place taking the most time:
 >
 > do_domctl(case XEN_DOMCTL_destroydomain)
 > -> domain_kill()
 > -> domain_relinquish_resources()
 > -> relinquish_memory(d, &d->page_list, PGT_l4_page_table)
 >
 > My reference setup: Intel(R) Xeon(R) CPU E5-2680 v4 @ 2.40GHz, Xen 4.14.
 >
 > I use this command for testing:
 >
 > for i in $(seq 1 5) ; do xl destroy test-vm-${i} & done
 >
 > Without holding the lock, all calls to `relinquish_memory(d, &d->page_list, PGT_l4_page_table)`
 > took about 3 seconds per domain being destroyed on my setup (HVM with 2GB of memory).
 >
 > With the lock held, it took only 100 ms.
 
 I'm further afraid I can't make the connection. Do you have an
 explanation for why there would be such a massive difference?
 What would prevent progress of relinquish_memory() with the
 domctl lock not held?


I would recommend that Dmitry use lock profiling with and without
this change applied and try to spot which lock is causing the
contention as a starting point. That should be fairly easy and could
shed some light.

Regards, Roger.

Thanks. I will try.

Dmitry.
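
For anyone repeating the measurement discussed above, a minimal sketch of a
timing wrapper around the concurrent destroy loop might look like the
following. The helper name time_parallel is hypothetical (it is not part of
xl), and GNU date is assumed for the %N format. The lock-profiling steps in
the trailing comments assume a hypervisor built with CONFIG_DEBUG_LOCK_PROFILE
and the 'l' / 'L' debug keys; please verify both against your Xen version
before relying on them.

```shell
#!/bin/sh
# Sketch only: time_parallel is a hypothetical helper, not part of xl.
# It starts "$cmd $arg" for every arg at once and prints one line per
# command: "<arg> <elapsed milliseconds>".
# Assumes GNU date (for the %N nanoseconds format).
time_parallel() (
    cmd=$1; shift
    for arg in "$@"; do
        (
            start=$(date +%s%N)
            # $cmd is deliberately unquoted so "xl destroy" splits into words
            $cmd "$arg" >/dev/null 2>&1
            end=$(date +%s%N)
            echo "$arg $(( (end - start) / 1000000 ))"
        ) &
    done
    # Block until every background command has finished
    wait
)

# Possible use on a test host (not executed here; assumptions as noted above):
#   xl debug-keys L        # reset lock profile counters
#   time_parallel "xl destroy" test-vm-1 test-vm-2 test-vm-3 test-vm-4 test-vm-5
#   xl debug-keys l        # dump lock profile info to the console ring
#   xl dmesg               # read the dump
```

Running this once with and once without the patch gives per-domain destroy
times alongside the lock statistics, which should make the two cases easier
to compare.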
