Zhang Chen <zhangc...@gmail.com> writes:

> From: zhanghailiang <zhang.zhanghaili...@huawei.com>
>
> If some errors happen during VM's COLO FT stage, it's important to
> notify the users of this event. Together with 'x-colo-lost-heartbeat',
> Users can intervene in COLO's failover work immediately.
> If users don't want to get involved in COLO's failover verdict,
> it is still necessary to notify users that we exited COLO mode.
>
> Signed-off-by: zhanghailiang <zhang.zhanghaili...@huawei.com>
> Signed-off-by: Li Zhijian <lizhij...@cn.fujitsu.com>
> Signed-off-by: Zhang Chen <zhangc...@gmail.com>
> Reviewed-by: Eric Blake <ebl...@redhat.com>
> ---
>  migration/colo.c    | 20 ++++++++++++++++++++
>  qapi/migration.json | 37 +++++++++++++++++++++++++++++++++++++
>  2 files changed, 57 insertions(+)
>
> diff --git a/migration/colo.c b/migration/colo.c
> index c083d36..8ca6381 100644
> --- a/migration/colo.c
> +++ b/migration/colo.c
> @@ -28,6 +28,7 @@
>  #include "net/colo-compare.h"
>  #include "net/colo.h"
>  #include "block/block.h"
> +#include "qapi/qapi-events-migration.h"
>
>  static bool vmstate_loading;
>  static Notifier packets_compare_notifier;
> @@ -514,6 +515,18 @@ out:
>          qemu_fclose(fb);
>      }
>
> +    /*
> +     * There are only two reasons we can go here, some error happened.
> +     * Or the user triggered failover.
> +     */
> +    if (failover_get_state() == FAILOVER_STATUS_NONE) {
> +        qapi_event_send_colo_exit(COLO_MODE_PRIMARY,
> +                                  COLO_EXIT_REASON_ERROR, NULL);
> +    } else {
> +        qapi_event_send_colo_exit(COLO_MODE_PRIMARY,
> +                                  COLO_EXIT_REASON_REQUEST, NULL);
> +    }

Your comment makes me suspect failover_get_state() can only be
FAILOVER_STATUS_NONE or FAILOVER_STATUS_REQUIRE here.  Is that correct?
If yes, I recommend adding a suitable assertion.

> +
>      /* Hope this not to be too long to wait here */
>      qemu_sem_wait(&s->colo_exit_sem);
>      qemu_sem_destroy(&s->colo_exit_sem);
> @@ -744,6 +757,13 @@ out:
>      if (local_err) {
>          error_report_err(local_err);
>      }
> +    if (failover_get_state() == FAILOVER_STATUS_NONE) {
> +        qapi_event_send_colo_exit(COLO_MODE_SECONDARY,
> +                                  COLO_EXIT_REASON_ERROR, NULL);
> +    } else {
> +        qapi_event_send_colo_exit(COLO_MODE_SECONDARY,
> +                                  COLO_EXIT_REASON_REQUEST, NULL);
> +    }

Same question.

>
>      if (fb) {
>          qemu_fclose(fb);
> diff --git a/qapi/migration.json b/qapi/migration.json
> index f3974c6..55dae48 100644
> --- a/qapi/migration.json
> +++ b/qapi/migration.json
> @@ -875,6 +875,43 @@
>    'data': [ 'none', 'require', 'active', 'completed', 'relaunch' ] }
>
>  ##
> +# @COLO_EXIT:
> +#
> +# Emitted when VM finishes COLO mode due to some errors happening or
> +# at the request of users.
> +#
> +# @mode: report COLO mode when COLO exited.
> +#
> +# @reason: describes the reason for the COLO exit.
> +#
> +# Since: 2.13
> +#
> +# Example:
> +#
> +# <- { "timestamp": {"seconds": 2032141960, "microseconds": 417172},
> +#      "event": "COLO_EXIT", "data": {"mode": "primary", "reason": "request" } }
> +#
> +##
> +{ 'event': 'COLO_EXIT',
> +  'data': {'mode': 'COLOMode', 'reason': 'COLOExitReason' } }

'data' duplicates the next patch's ColoStatus, except it lacks
@colo-running.  Factoring out the common part doesn't seem worth the
bother.  Okay as is.

> +
> +##
> +# @COLOExitReason:
> +#
> +# The reason for a COLO exit
> +#
> +# @none: no failover has ever happened.

This can't occur in the COLO_EXIT event, only in the result of
query-colo-status, can it?  Worth spelling that out in the
documentation?

> +#
> +# @request: COLO exit is due to an external request
> +#
> +# @error: COLO exit is due to an internal error
> +#
> +# Since: 2.13
> +##
> +{ 'enum': 'COLOExitReason',
> +  'data': [ 'none', 'request', 'error' ] }
> +
> +##
>  # @x-colo-lost-heartbeat:
>  #
>  # Tell qemu that heartbeat is lost, request it to do takeover procedures.
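
For illustration only, here is a rough sketch of what the assertion
asked about in the colo.c hunks might look like, if it turns out that
failover_get_state() really can only return FAILOVER_STATUS_NONE or
FAILOVER_STATUS_REQUIRE at those points.  The enums below are local
stand-ins that mirror the values named in the quoted schema, not QEMU's
generated types, and colo_exit_reason() is a hypothetical helper that
is not part of the patch.

```c
#include <assert.h>

/* Stand-in for QEMU's FailoverStatus; values taken from the quoted schema. */
typedef enum {
    FAILOVER_STATUS_NONE,
    FAILOVER_STATUS_REQUIRE,
    FAILOVER_STATUS_ACTIVE,
    FAILOVER_STATUS_COMPLETED,
    FAILOVER_STATUS_RELAUNCH,
} FailoverStatus;

/* Stand-in for the COLOExitReason enum added by the patch. */
typedef enum {
    COLO_EXIT_REASON_NONE,
    COLO_EXIT_REASON_REQUEST,
    COLO_EXIT_REASON_ERROR,
} COLOExitReason;

/*
 * Hypothetical helper: map the failover state seen on the exit path
 * to the reason reported in COLO_EXIT.  The assertion encodes the
 * assumption under discussion, namely that no other state is possible
 * here.  If that assumption is wrong, the assertion must not be added.
 */
static COLOExitReason colo_exit_reason(FailoverStatus state)
{
    assert(state == FAILOVER_STATUS_NONE ||
           state == FAILOVER_STATUS_REQUIRE);

    return state == FAILOVER_STATUS_NONE ? COLO_EXIT_REASON_ERROR
                                         : COLO_EXIT_REASON_REQUEST;
}
```

With a helper along these lines, both exit paths could send the event
with a single call and differ only in the COLOMode they pass.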