On Wed, Dec 11, 2024 at 12:06:04AM +0100, Maciej S. Szmigiero wrote:
> On 6.12.2024 23:20, Peter Xu wrote:
> > On Fri, Dec 06, 2024 at 07:03:36PM +0100, Maciej S. Szmigiero wrote:
> > > On 4.12.2024 20:10, Peter Xu wrote:
> > > > On Sun, Nov 17, 2024 at 08:19:55PM +0100, Maciej S. Szmigiero wrote:
> > > > > Important note:
> > > > > 4 VF benchmarks were done with commit 5504a8126115
> > > > > ("KVM: Dynamic sized kvm memslots array") and its revert-dependencies
> > > > > reverted since this seems to improve performance in this VM config if
> > > > > the multifd transfer is enabled: the downtime performance with this
> > > > > commit present is 1141 ms enabled / 1730 ms disabled.
> > > > >
> > > > > Smaller VF counts actually do seem to benefit from this commit, so
> > > > > it's likely that in the future adding some kind of a memslot
> > > > > pre-allocation bit stream message might make sense to avoid this
> > > > > downtime regression for 4 VF configs (and likely higher VF counts too).
> > > >
> > > > I'm confused why reverting 5504a8126115 could be faster, and why it
> > > > affects as much as 600 ms.  Also, how that effect can be relevant to
> > > > the number of VFs.
> > > >
> > > > Could you share more on this regression?  Because if that's problematic
> > > > we need to fix it, or upstream QEMU (after this series is merged) will
> > > > still not work.
> > >
> > > The number of memslots that the VM uses seems to differ depending on its
> > > VF count, each VF using 2 memslots:
> > > 2 VFs, used slots: 13
> > > 4 VFs, used slots: 17
> > > 5 VFs, used slots: 19
> >
> > That's still pretty few.
> >
> > > So I suspect this performance difference is due to these higher counts
> > > of memslots possibly benefiting from being preallocated in the previous
> > > QEMU code (before commit 5504a8126115).
> > >
> > > I can see that with this commit:
> > > > #define KVM_MEMSLOTS_NR_ALLOC_DEFAULT 16
> > >
> > > So it would explain why the difference is visible with 4 VFs only (and
> > > possibly higher VF counts, I just don't have the ability to test
> > > migrating those) since with 4 VF configs we exceed
> > > KVM_MEMSLOTS_NR_ALLOC_DEFAULT.
> >
> > I suppose it means kvm_slots_grow() is called once, but I don't understand
> > why it caused 500 ms of downtime!
>
> In this cover letter sentence:
>
> "the downtime performance with this commit present is 1141 ms enabled /
> 1730 ms disabled"
>
> "enabled" and "disabled" refer to *multifd transfer* being enabled, not
> your patch being present (sorry for not being 100% clear there).
>
> So the difference that the memslot patch makes is 1141 ms - 1095 ms = 46 ms
> of extra downtime, not 500 ms.
>
> I can guess this is because of extra contention on the BQL, with
> unfortunate timing.
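For the archives, the threshold arithmetic above works out as in this tiny
standalone sketch (not QEMU code): BASE_SLOTS and SLOTS_PER_VF are inferred
from the used-slot counts you quoted, and only the 16-slot default matches
the commit itself.

/*
 * Standalone sketch of the memslot math above; compiles with any C99
 * compiler.  BASE_SLOTS (the VM's non-VF memslots) is inferred from
 * "2 VFs, used slots: 13", i.e. 13 - 2 * 2 = 9; SLOTS_PER_VF comes from
 * the quoted measurements.  Only KVM_MEMSLOTS_NR_ALLOC_DEFAULT is real.
 */
#include <stdio.h>

#define KVM_MEMSLOTS_NR_ALLOC_DEFAULT 16  /* initial allocation since 5504a8126115 */
#define BASE_SLOTS    9                   /* inferred non-VF memslots */
#define SLOTS_PER_VF  2                   /* each VF uses 2 memslots */

int main(void)
{
    int vf_counts[] = { 2, 4, 5 };

    for (size_t i = 0; i < sizeof(vf_counts) / sizeof(vf_counts[0]); i++) {
        int used = BASE_SLOTS + SLOTS_PER_VF * vf_counts[i];

        /*
         * Only the 4 and 5 VF configs cross the preallocated default,
         * so only they pay for growing the slots array after boot.
         */
        printf("%d VFs -> %d used slots (%s)\n", vf_counts[i], used,
               used > KVM_MEMSLOTS_NR_ALLOC_DEFAULT ?
               "exceeds default, grow needed" : "fits preallocated array");
    }
    return 0;
}

So with 4 VFs the array needs to grow exactly once after boot, which matches
the kvm_slots_grow() observation above.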
Hmm, I wonder why the address space changed during switchover.  I was
expecting the whole address space to be set up when QEMU boots, and to stay
as-is during migration.  Especially I don't see why that would matter with
multifd at all..  I could have overlooked something.

> > Not to mention, that patchset should at least reduce downtime OTOH due to
> > the small number of slots, because some of the dirty sync / clear paths
> > need to walk the whole slot array (our lookup is pretty slow for now, but
> > there's probably no good reason to rework it yet if it's mostly 10-20
> > slots).
>
> With multifd transfer being disabled, your memslot patch indeed improves
> the downtime by 1900 ms - 1730 ms = 170 ms.

That's probably the other side of the change when the slots grow, as opposed
to the pure win where the series definitely should speed up the dirty
tracking operations quite a bit.

> > In general, I would still expect the dynamic memslot work to speed up
> > (instead of slow down) VFIO migrations.
> >
> > There's something off here, or something I overlooked.  I suggest we
> > figure it out..  Even if we need to revert the kvm series on master,
> > though so far I doubt that will be needed.
> >
> > Otherwise we should at least report the numbers with everything on the
> > master branch, and evaluate merging this series with those real numbers,
> > because fundamentally those are the numbers people will get when they
> > start using this feature on master later.
>
> Sure, that's why in the cover letter I provided the numbers with your
> commit present, too.

It seems to me we're not far from the truth.  Anyway, feel free to update
if you figure out the reason, or if you get some news from profiling.

Thanks,

--
Peter Xu