On 09.10.25 22:09, Raphael Norwitz wrote:
A few suggestions here. Overall, it looks sane to me.

First my applogizes, I should have said it earlier:

I'm preparing a v2, and starting from this patch it's significantly
reworked (the previous big part of refactoring (01-23,25) is still
relevant)

I have a parallel series for similar migration of virtio-net/tap
(TAP device fds are migrated through UNIX socket), and there were
a lot of discussions, and the ideas applies to vhost-user-blk series
as well.

The main change of v2 is significantly simplified interface:
the whole feature is enabled/disable by one migration parameter,
no need for per-device options. But this requires additional
changes in code, as we have to postpone backend (chardev opening
and initial communication to vhost-server) until the point in time
when we know, are we going to get the fds from migration channel
or not.

Next, migration part was revorked into VMSD structures instead of
.save() / .load() handlers.

Now, my work is to look at the comments and understand, how
much they apply to upcoming v2.


On Wed, Aug 13, 2025 at 12:56 PM Vladimir Sementsov-Ogievskiy
<[email protected]> wrote:

[..]


@@ -1624,21 +1652,30 @@ int vhost_dev_init(struct vhost_dev *hdev, void *opaque,
      r = vhost_set_backend_type(hdev, backend_type);
      assert(r >= 0);

-    r = hdev->vhost_ops->vhost_backend_init(hdev, opaque, errp);
-    if (r < 0) {
-        goto fail;
+    if (hdev->migrating_backend) {
+        /* backend must support detached state */

Probably better to error_report() or something other than a raw assert?

Assert is better, as this is not possible. Still, no such handlers in v2.


+        assert(hdev->vhost_ops->vhost_save_backend);
+        assert(hdev->vhost_ops->vhost_load_backend);
+        hdev->_features_wait_incoming = true;
      }

-    r = hdev->vhost_ops->vhost_set_owner(hdev);
+    r = hdev->vhost_ops->vhost_backend_init(hdev, opaque, errp);
      if (r < 0) {
-        error_setg_errno(errp, -r, "vhost_set_owner failed");

[..]

@@ -1920,6 +1960,12 @@ uint64_t vhost_get_features(struct vhost_dev *hdev, 
const int *feature_bits,
                              uint64_t features)
  {
      const int *bit = feature_bits;
+

Should this be

     if (hdev->_features_wait_incoming && hdev->migrating_backend) {

to not impact existing flows?

This code is still in v2.

But _features_wait_incoming is a new field introduced withi this commit,
so there are no existing flows with it..

And in v2 _features_wait_incoming and migrating_backend are less
connected. Initialization code in v2 doesn't rely on .migrating_backend
(as we just don't know :). stop()/start() code will still rely on
.migrating_backend.


+    if (hdev->_features_wait_incoming) {
+        /* Excessive set is enough for early initialization. */
+        return features;
+    }
+
      while (*bit != VHOST_INVALID_FEATURE_BIT) {
          uint64_t bit_mask = (1ULL << *bit);
          if (!vhost_dev_has_feature(hdev, *bit)) {
@@ -1930,6 +1976,66 @@ uint64_t vhost_get_features(struct vhost_dev *hdev, 
const int *feature_bits,
      return features;
  }


[..]

@@ -2204,14 +2317,29 @@ static int do_vhost_dev_stop(struct vhost_dev *hdev, 
VirtIODevice *vdev,
      event_notifier_cleanup(
          &hdev->vqs[VHOST_QUEUE_NUM_CONFIG_INR].masked_config_notifier);

+    if (hdev->migrating_backend) {

Ditto - no raw assert()?


no handlers - no problmes (in v2 :). Still, I'm sure that assert was good here, 
as we never
set migrating_backend for devices which don't support it.



+        /* backend must support detached state */
+        assert(hdev->vhost_ops->vhost_save_backend);
+        assert(hdev->vhost_ops->vhost_load_backend);
+    }
+
      trace_vhost_dev_stop(hdev, vdev->name, vrings);

      if (hdev->vhost_ops->vhost_dev_start) {
          hdev->vhost_ops->vhost_dev_start(hdev, false);
      }
-    if (vrings) {
+    if (vrings && !hdev->migrating_backend) {
          vhost_dev_set_vring_enable(hdev, false);
      }
+
+    if (hdev->migrating_backend) {
+        for (i = 0; i < hdev->nvqs; ++i) {
+            struct vhost_virtqueue *vq = hdev->vqs + i;
+
+            event_notifier_set_handler(&vq->error_notifier, NULL);
+        }
+    }
+
      for (i = 0; i < hdev->nvqs; ++i) {
          rc |= do_vhost_virtqueue_stop(hdev,
                                        vdev,



--
Best regards,
Vladimir

Reply via email to