On Tue, Jul 25, 2023 at 9:53 AM Hanna Czenczek <[email protected]> wrote:
>
> On 24.07.23 17:48, Eugenio Perez Martin wrote:
> > On Fri, Jul 21, 2023 at 6:07 PM Hanna Czenczek <[email protected]> wrote:
> >> On 21.07.23 17:25, Eugenio Perez Martin wrote:
> >>> On Tue, Jul 11, 2023 at 5:52 PM Hanna Czenczek <[email protected]> wrote:
> >>>> Move the `suspended` field from vhost_vdpa into the global vhost_dev
> >>>> struct, so vhost_dev_stop() can check whether the back-end has been
> >>>> suspended by `vhost_ops->vhost_dev_start(hdev, false)`. If it has,
> >>>> there is no need to reset it; the reset is just a fall-back to stop
> >>>> device operations for back-ends that do not support suspend.
> >>>>
> >>>> Unfortunately, for vDPA specifically, RESUME is not yet implemented, so
> >>>> when the device is re-started, we still have to do the reset to have it
> >>>> un-suspend.
> >>>>
> >>>> Signed-off-by: Hanna Czenczek <[email protected]>
> >>>> ---
> >>>> include/hw/virtio/vhost-vdpa.h | 2 --
> >>>> include/hw/virtio/vhost.h | 8 ++++++++
> >>>> hw/virtio/vhost-vdpa.c | 11 +++++++----
> >>>> hw/virtio/vhost.c | 8 +++++++-
> >>>> 4 files changed, 22 insertions(+), 7 deletions(-)
> >>>>
> >>>> diff --git a/include/hw/virtio/vhost-vdpa.h
> >>>> b/include/hw/virtio/vhost-vdpa.h
> >>>> index e64bfc7f98..72c3686b7f 100644
> >>>> --- a/include/hw/virtio/vhost-vdpa.h
> >>>> +++ b/include/hw/virtio/vhost-vdpa.h
> >>>> @@ -42,8 +42,6 @@ typedef struct vhost_vdpa {
> >>>> bool shadow_vqs_enabled;
> >>>> /* Vdpa must send shadow addresses as IOTLB key for data queues,
> >>>> not GPA */
> >>>> bool shadow_data;
> >>>> - /* Device suspended successfully */
> >>>> - bool suspended;
> >>>> /* IOVA mapping used by the Shadow Virtqueue */
> >>>> VhostIOVATree *iova_tree;
> >>>> GPtrArray *shadow_vqs;
> >>>> diff --git a/include/hw/virtio/vhost.h b/include/hw/virtio/vhost.h
> >>>> index 6a173cb9fa..69bf59d630 100644
> >>>> --- a/include/hw/virtio/vhost.h
> >>>> +++ b/include/hw/virtio/vhost.h
> >>>> @@ -120,6 +120,14 @@ struct vhost_dev {
> >>>> uint64_t backend_cap;
> >>>> /* @started: is the vhost device started? */
> >>>> bool started;
> >>>> + /**
> >>>> + * @suspended: Whether the vhost device is currently suspended. Set
> >>>> + * and reset by implementations (vhost-user, vhost-vdpa, ...), which
> >>>> + * are supposed to automatically suspend/resume in their
> >>>> + * vhost_dev_start handlers as required. Must also be cleared when
> >>>> + * the device is reset.
> >>>> + */
> >>>> + bool suspended;
> >>>> bool log_enabled;
> >>>> uint64_t log_size;
> >>>> Error *migration_blocker;
> >>>> diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c
> >>>> index 7b7dee468e..f7fd19a203 100644
> >>>> --- a/hw/virtio/vhost-vdpa.c
> >>>> +++ b/hw/virtio/vhost-vdpa.c
> >>>> @@ -858,13 +858,12 @@ static int vhost_vdpa_get_device_id(struct
> >>>> vhost_dev *dev,
> >>>>
> >>>> static int vhost_vdpa_reset_device(struct vhost_dev *dev)
> >>>> {
> >>>> - struct vhost_vdpa *v = dev->opaque;
> >>>> int ret;
> >>>> uint8_t status = 0;
> >>>>
> >>>> ret = vhost_vdpa_call(dev, VHOST_VDPA_SET_STATUS, &status);
> >>>> trace_vhost_vdpa_reset_device(dev);
> >>>> - v->suspended = false;
> >>>> + dev->suspended = false;
> >>>> return ret;
> >>>> }
> >>>>
> >>>> @@ -1278,7 +1277,7 @@ static void vhost_vdpa_suspend(struct vhost_dev
> >>>> *dev)
> >>>> if (unlikely(r)) {
> >>>> error_report("Cannot suspend: %s(%d)", g_strerror(errno),
> >>>> errno);
> >>>> } else {
> >>>> - v->suspended = true;
> >>>> + dev->suspended = true;
> >>>> return;
> >>>> }
> >>>> }
> >>>> @@ -1313,6 +1312,10 @@ static int vhost_vdpa_dev_start(struct vhost_dev
> >>>> *dev, bool started)
> >>>> return -1;
> >>>> }
> >>>> vhost_vdpa_set_vring_ready(dev);
> >>>> + if (dev->suspended) {
> >>>> + /* TODO: When RESUME is available, use it instead of
> >>>> resetting */
> >>>> + vhost_vdpa_reset_status(dev);
> >>> How is that we reset the status at each vhost_vdpa_dev_start? That
> >>> will clean all the vqs configured, features negotiated, etc. in the
> >>> vDPA device. Or am I missing something?
> >> What alternative do you propose? We don’t have RESUME for vDPA in qemu,
> >> but we somehow need to lift the previous SUSPEND so the device will
> >> again respond to guest requests, do we not?
> >>
> > Reset also clears the suspend state in vDPA, and it should be called
> > at vhost_dev_stop. So the device should never be in suspended state
> > here. Does that solve your concerns?
>
> My intention with this patch was precisely not to reset in
> vhost_dev_stop when suspending is supported. So now I’m more confused
> than before.
>
At this moment, I think that should be focused as an optimization and
not to be included in the main series.
> >> But more generally, is this any different from what is done before this
> >> patch? Before this patch, vhost_dev_stop() unconditionally invokes
> >> vhost_reset_status(), so the device is reset in every stop/start cycle,
> >> that doesn’t change. And we still won’t reset it on the first
> >> vhost_dev_start(), because dev->suspended will be false then, only on
> >> subsequent stop/start cycles, as before. So the only difference is that
> >> now the device is reset on start, not on stop.
> >>
> > The difference is that vhost_vdpa_dev_start is called after features
> > ack (via vhost_dev_start, through vhost_dev_set_features call) and vq
> > configuration (using vhost_virtqueue_start). A device reset forces the
> > device to forget about all of that, and qemu cannot configure them
> > again until qemu acks the features again.
>
> Now I’m completely confused, because I don’t see the point of
> implementing suspend at all if we rely on a reset immediately afterwards
> anyway.
It's not immediate. From vhost_dev_stop, comments added only in this mail:
void vhost_virtqueue_stop(struct vhost_dev *dev,
struct VirtIODevice *vdev,
struct vhost_virtqueue *vq,
unsigned idx)
{
...
// Get each vring indexes, trusting the destination device can
continue safely from there
r = dev->vhost_ops->vhost_get_vring_base(dev, &state);
if (r < 0) {
VHOST_OPS_DEBUG(r, "vhost VQ %u ring restore failed: %d", idx, r);
/* Connection to the backend is broken, so let's sync internal
* last avail idx to the device used idx.
*/
virtio_queue_restore_last_avail_idx(vdev, idx);
} else {
virtio_queue_set_last_avail_idx(vdev, idx, state.num);
}
...
}
void vhost_dev_stop(struct vhost_dev *hdev, VirtIODevice *vdev, bool vrings)
{
...
// Suspend the device, so we can trust in vring indexes / vq state
if (hdev->vhost_ops->vhost_dev_start) {
hdev->vhost_ops->vhost_dev_start(hdev, false);
}
if (vrings) {
vhost_dev_set_vring_enable(hdev, false);
}
// Fetch each vq index / state and store in vdev->vq[i]
for (i = 0; i < hdev->nvqs; ++i) {
vhost_virtqueue_stop(hdev,
vdev,
hdev->vqs + i,
hdev->vq_index + i);
}
// Reset the device, as we don't need it anymore and it can
release the resources
if (hdev->vhost_ops->vhost_reset_status) {
hdev->vhost_ops->vhost_reset_status(hdev);
}
}
---
> It was my impression this whole time that suspending would
> remove the need to reset. Well, at least until the device should be
> resumed again, i.e. in vhost_dev_start().
>
It cannot. vhost_dev_stop is also called when the guest reset the
device, so the guest trusts the device to be in a clean state.
Also, the reset is the moment when the device frees the unused
resources. This would mandate the device to
> In addition, I also don’t understand the magnitude of the problem with
> ordering. If the order in vhost_dev_start() is wrong, can we not easily
> fix it?
The order in vhost_dev_start follows the VirtIO standard. The move of
the reset should be to remove it from vhost_dev_stop to something like
both virtio_reset and the end of virtio_save. I'm not sure if I'm
forgetting some other use cases.
> E.g. add a full vhost_dev_resume callback to invoke right at
> the start of vhost_dev_start(); or check (in the same place) whether the
> back-end supports resuming, and if it doesn’t (and it is currently
> suspended), reset it there.
>
> In any case, if we need to reset in vhost_dev_stop(), i.e. immediately
> after suspend, I don’t see the point of suspending, indicating to me
> that I still fail to understand its purpose.
>
> Hanna
>