Am 10.03.2026 um 14:32 hat Markus Armbruster geschrieben:
> This is now commit 544ddbb6373d61292a0e2dc269809cd6bd5edec6.  I'm not
> objecting.  Just a few remarks.  I dropped qemu-stable@ from cc.
> 
> Kevin Wolf <[email protected]> writes:
> 
> > Commit 2155d2dd introduced rate limiting for BLOCK_IO_ERROR to emit an
> > event only once a second. This makes sense for cases in which the guest
> > keeps running and can submit more requests that would possibly also fail
> > because there is a problem with the backend.
> >
> > However, if the error policy is configured so that the VM is stopped on
> > errors, this is both unnecessary because stopping the VM means that the
> > guest can't issue more requests and in fact harmful because stopping the
> > VM is an important state change that management tools need to keep track
> > of even if it happens more than once in a given second. If an event is
> > dropped, the management tool would see a VM randomly going to paused
> > state without an associated error, so it has a hard time deciding how to
> > handle the situation.
> >
> > This patch disables rate limiting for action=stop by not relying on the
> > event type alone any more in monitor_qapi_event_queue_no_reenter(), but
> > checking action for BLOCK_IO_ERROR, too. If the error is reported to the
> > guest or ignored, the rate limiting stays in place.
> >
> > Fixes: 2155d2dd7f73 ('block-backend: per-device throttling of 
> > BLOCK_IO_ERROR reports')
> > Signed-off-by: Kevin Wolf <[email protected]>
> > ---
> >  qapi/block-core.json |  2 +-
> >  monitor/monitor.c    | 21 ++++++++++++++++++++-
> >  2 files changed, 21 insertions(+), 2 deletions(-)
> >
> > diff --git a/qapi/block-core.json b/qapi/block-core.json
> > index b66bf316e2f..da0b36a3751 100644
> > --- a/qapi/block-core.json
> > +++ b/qapi/block-core.json
> > @@ -5794,7 +5794,7 @@
> >  # .. note:: If action is "stop", a `STOP` event will eventually follow
> >  #    the `BLOCK_IO_ERROR` event.
> >  #
> > -# .. note:: This event is rate-limited.
> > +# .. note:: This event is rate-limited, except if action is "stop".
> >  #
> >  # Since: 0.13
> >  #
> > diff --git a/monitor/monitor.c b/monitor/monitor.c
> > index 1273eb72605..37fa674cfe6 100644
> > --- a/monitor/monitor.c
> > +++ b/monitor/monitor.c
> > @@ -367,14 +367,33 @@ monitor_qapi_event_queue_no_reenter(QAPIEvent event, 
> > QDict *qdict)
> >  {
> >      MonitorQAPIEventConf *evconf;
> >      MonitorQAPIEventState *evstate;
> > +    bool throttled;
> >  
> >      assert(event < QAPI_EVENT__MAX);
> >      evconf = &monitor_qapi_event_conf[event];
> >      trace_monitor_protocol_event_queue(event, qdict, evconf->rate);
> > +    throttled = evconf->rate;
> > +
> > +    /*
> > +     * Rate limit BLOCK_IO_ERROR only for action != "stop".
> > +     *
> > +     * If the VM is stopped after an I/O error, this is important 
> > information
> > +     * for the management tool to keep track of the state of QEMU and we 
> > can't
> > +     * merge any events. At the same time, stopping the VM means that the 
> > guest
> > +     * can't send additional requests and the number of events is already
> > +     * limited, so we can do without rate limiting.
> > +     */
> > +    if (event == QAPI_EVENT_BLOCK_IO_ERROR) {
> > +        QDict *data = qobject_to(QDict, qdict_get(qdict, "data"));
> > +        const char *action = qdict_get_str(data, "action");
> > +        if (!strcmp(action, "stop")) {
> > +            throttled = false;
> > +        }
> > +    }
> 
> Having event-specific logic in the general event emission function is
> ugly.
> 
> Before the patch, the "throttle this event?" logic is coded in one
> place: table monitor_qapi_event_conf[].
> 
> The table maps from event kind (enum QAPIEvent) to the minimum time
> between two events.  Non-zero specifies a rate limit, zero makes it
> unlimited.
> 
> Aside: as far as I can tell, we've only ever used one
> MonitorQAPIEventConf: { .rate = 1000 * SCALE_MS }.  Could be dumbed down
> to bool.
> 
> This is insufficient for QAPIEvent QAPI_EVENT_BLOCK_IO_ERROR, where the
> desired rate depends on event data, not just the QAPIEvent.
> 
> The patch gives us the desired rate, but it splits the logic between the
> map and the map's user.
> 
> I think the cleaner solution is to make the map more capable: have it
> maps from the entire event, not just its kind.
> 
> An obvious way to do that would be a table of function pointers
> 
>     bool (*)(QAPIEvent, QDict *)
> 
> Null means unlimited.
> 
> The table entry for BLOCK_IO_ERROR returns false for unlimited when
> "action" is "stop", else true.
> 
> The table entries for the other rate-limited events simply return true.
> 
> Thoughts?

Like you, I often prefer data to code. However, if the data becomes just
a table of function pointers to trivial functions, wouldn't it be better
to just have it as code in the first place?

It could be a helper function with the same signature as you proposed
above and it would simply be a switch statement returning true for a few
event types, looking at the QDict for BLOCK_IO_ERROR and returning false
for default. Seems a lot simpler than an explicit table of function
pointers.

Kevin


Reply via email to