Markus Armbruster <[email protected]> writes:

> Avihai Horon <[email protected]> writes:
>
>> The VFIO_MIGRATION event notifies users when a VFIO device transitions
>> to a new state.
>>
>> One use case for this event is to prevent timeouts for RDMA connections
>> to the migrated device. In this case, an external management application
>> (not libvirt) consumes the events and disables the RDMA timeout
>> mechanism when receiving the event for PRE_COPY_P2P state, which
>> indicates that the device is non-responsive.
>>
>> This is essential because RDMA connections typically have very low
>> timeouts (tens of milliseconds), which can be far below migration
>> downtime.
>>
>> However, under heavy resource utilization, the device transition to
>> PRE_COPY_P2P can take hundreds of milliseconds to complete. Since the
>> VFIO_MIGRATION event is currently sent only after the transition
>> completes, it arrives too late, after RDMA connections have already
>> timed out.
>>
>> To address this, send an additional "prepare" event immediately before
>> initiating the PRE_COPY_P2P transition. This guarantees timely event
>> delivery regardless of how long the actual state transition takes.
>>
>> Signed-off-by: Avihai Horon <[email protected]>
>
> [...]
>
>> diff --git a/qapi/vfio.json b/qapi/vfio.json
>> index a1a9c5b673..17b6046871 100644
>> --- a/qapi/vfio.json
>> +++ b/qapi/vfio.json
>> @@ -11,7 +11,13 @@
>>  ##
>>  # @QapiVfioMigrationState:
>>  #
>> -# An enumeration of the VFIO device migration states.
>> +# An enumeration of the VFIO device migration states.  In addition to
>> +# the regular states, there are prepare states (with 'prepare' suffix)
>> +# which indicate that the device is just about to transition to the
>> +# corresponding state.  Note that seeing a prepare state for state X
>> +# doesn't guarantee that the next state will be X, as the state
>> +# transition can fail and the device may transition to a different
>> +# state instead.
>>  #
>>  # @stop: The device is stopped.
>>  #
>> @@ -32,11 +38,14 @@
>>  #     tracking its internal state and its internal state is available
>>  #     for reading.
>>  #
>> +# @pre-copy-p2p-prepare: The device is just about to move to
>> +#     pre-copy-p2p state.  (since 11.0)
>> +#
>>  # Since: 9.1
>>  ##
>>  { 'enum': 'QapiVfioMigrationState',
>>    'data': [ 'stop', 'running', 'stop-copy', 'resuming', 'running-p2p',
>> -            'pre-copy', 'pre-copy-p2p' ] }
>> +            'pre-copy', 'pre-copy-p2p', 'pre-copy-p2p-prepare' ] }
>>  
>>  ##
>>  # @VFIO_MIGRATION:
>
> Acked-by: Markus Armbruster <[email protected]>

Except for the subject line: "vfio/migration: Send VFIO_MIGRATION event
before PRE_COPY_P2P transition" became misleading in v2.

>
> [...]

