On Fri, Oct 27, 2023 at 11:17:41PM +0100, Joao Martins wrote:
> On 27/10/2023 15:41, Peter Xu wrote:
> > On Fri, Oct 27, 2023 at 09:58:03AM +0100, Joao Martins wrote:
> >> On 26/10/2023 21:07, Peter Xu wrote:
> >>> On Thu, Oct 26, 2023 at 08:33:13PM +0100, Joao Martins wrote:
> >>>> Sure. For the fourth patch, feel free to add Suggested-by and/or a
> >>>> Link, considering it started on the other patches (if you also agree
> >>>> it is right). The patches ofc are entirely different, but at least I
> >>>> like to believe the ideas initially presented, and then subsequently
> >>>> improved, are what led to the downtime observability improvements in
> >>>> this series.
> >>>
> >>> Sure, I'll add that.
> >>>
> >>> If you like, I would definitely be happy to have Co-developed-by: with
> >>> you, if you agree.
> >>
> >> Oh, that's great, thanks!
> >
> > Great! I apologize for not asking before a formal patch was posted.
> >
> >>
> >>> I just don't know whether that addressed all your needs, and I need
> >>> some patch like that for our builds.
> >>
> >> I think it achieves the same as the other series. Or rather, it
> >> re-implements it, but with less compromise on QAPI, and it makes the
> >> tracepoints more 'generic' to other use cases and less specific to the
> >> 'checkpoint breakdown'. That makes the implementation simpler (e.g. we
> >> don't need the array storing the checkpoint timestamps), given that
> >> it's just tracing and not for QAPI.
> >
> > Yes. Please also feel free to have a closer look at the exact
> > checkpoints in that patch. I just want to make sure that it'll serve
> > the same purpose as the patch you proposed, but with tracepoints, and
> > that I didn't miss anything important.
> >
> > The dest checkpoints are all new; I hope I nailed them all right, as we
> > would expect.
> 
> Yeah, it looks like so; but I am no master of postcopy, so I don't have
> quite the surgical eye there.
> Perhaps for the loadvm side, 'precopy-bh-enter' suggests some ambiguity
> (given that precopy happens on both source and destination?). Perhaps be
> explicit with incoming-bh-enter? Or even prefix checkpoint names with
> 'source' on the source side, for example, to distinguish?
I don't think src has a bottom half, if not considering cleanup_bh as part
of migration at all. The destination's BH is a major part of migration. In
any case, let me prefix them with "src"/"dst" then; hopefully that is even
clearer.

> > For src checkpoints, IIRC your patch explicitly tracked return path
> > closing, while patch 4 only made it part of the final enclosure; the
> > 4th checkpoint is after the non-iterable data is sent, up to the 5th,
> > which is the last "downtime-end". It can cover more than "return path
> > close":
> >
> >   qemu_savevm_state_complete_precopy_non_iterable   <--- 4th checkpoint
> >   qemu_fflush (after non-iterable sent)
> >   close_return_path_on_source
> >   cpu_throttle_stop
> >   qemu_mutex_lock_iothread
> >   migration_downtime_end                            <--- 5th checkpoint
> >
> > If you see it fit or necessary, I can, for example, also add one more
> > checkpoint right after closing the return path. I just didn't know
> > whether it was your intention to explicitly check that point. Just let
> > me know if so.
> 
> It was put there because it was a simple catch-all from the source PoV of
> how much time the destination is taking. Of course, bearing in mind that
> without a return-path such a metric/tracepoint is not available. On the
> other hand, with migration_downtime_end I think we have the equivalent in
> that resume checkpoint. So we just need to take the diff between that and
> the non-iterable one, and I think we have the same things I was looking
> for (even more, I would say).

> > Also, on whether you prefer to keep a timestamp in the tracepoint
> > itself: I only use either "log" or "dtrace"+qemu-trace-stap for
> > tracing, and both of them contain timestamps already. But I can also
> > add the timestamps (offset by downtime_start) if you prefer.
> 
> Perhaps it is easy to wrap the checkpoint tracepoint in its own function,
> to allow extending it with something else, e.g. adding the timestamp (or
> any other data) to the checkpoints, or doing something in a particular
> checkpoint of the migration.
> The timing function from qemu would give qemu's own idea of time
> (directly correlatable with the downtime metric that applications
> consume), which would be more consistent? Though I am of two minds on
> this: whether to rely on tracing timestamps or to align with what's
> provided to applications.

Yes, it should be more consistent. Let me add them into the checkpoints.

Thanks,

-- 
Peter Xu
