On Fri, 5 Jul 2019 14:56:23 +1000 David Gibson <da...@gibson.dropbear.id.au> wrote:
> On Thu, Jul 04, 2019 at 10:12:04AM +0200, Greg Kurz wrote:
> > On Thu, 4 Jul 2019 10:23:57 +1000
> > David Gibson <da...@gibson.dropbear.id.au> wrote:
> > 
> > > On Wed, Jul 03, 2019 at 07:50:12PM +0200, Greg Kurz wrote:
> > > > ics_set_kvm_state_one() is called either during reset, in which case
> > > > both 'saved priority' and 'current priority' are equal to 0xff, or
> > > > during migration. In the latter case, 'saved priority' may differ
> > > > from 'current priority' only if the interrupt had been masked with
> > > > the ibm,int-off RTAS call. Instead of aborting QEMU, print out an
> > > > error and exit.
> > > 
> > > What's the rationale for this? Doesn't hitting this indicate an error
> > > in the qemu code, for which an abort is the usual response?
> > > 
> > 
> > This error can be hit by the destination during migration if the
> > incoming stream is corrupted. Aborting in this case would mislead
> > the user into suspecting a bug in the destination QEMU, which isn't
> > the case.
> 
> Rather than a bug in the source qemu? I guess so.
> 

A bug in the source QEMU for live migration or a corrupted snapshot
for load_vm, which could result from a qcow2 file corruption for
example.

> > Apart from that, when the in-kernel XICS is in use, only two functions
> > manipulate the ICS state: ics_set_kvm_state_one() and ics_get_kvm_state().
> > The code is trivial enough that I don't see a great value in the assert
> > in the first place... BTW, it comes from the commit:
> > 
> > commit 11ad93f68195f68cc94d988f2aa50b4d190ee52a
> > Author: David Gibson <da...@gibson.dropbear.id.au>
> > Date: Thu Sep 26 16:18:44 2013 +1000
> > 
> >     xics-kvm: Support for in-kernel XICS interrupt controller
> > 
> > Maybe you remember some context that justified the assert at the
> > time?
> 
> It was probably mostly about documenting the invariants that are
> supposed to apply here.
> 

Indeed this error on the reset path is very likely a bug in QEMU, and
the assert() makes sense in this case.
I'm convinced by the documenting argument. Please forget this patch :)

> > > > 
> > > > Based-on: <156217454083.559957.7359208229523652842.st...@bahia.lan>
> > > > Signed-off-by: Greg Kurz <gr...@kaod.org>
> > > > ---
> > > > 
> > > > This isn't a bugfix, hence targeting 4.2, but it depends on an actual
> > > > fix for 4.1, as mentioned in the Based-on tag.
> > > > ---
> > > >  hw/intc/xics_kvm.c | 17 +++++++++++++++--
> > > >  1 file changed, 15 insertions(+), 2 deletions(-)
> > > > 
> > > > diff --git a/hw/intc/xics_kvm.c b/hw/intc/xics_kvm.c
> > > > index 2df1f3e92c7e..f8758b928250 100644
> > > > --- a/hw/intc/xics_kvm.c
> > > > +++ b/hw/intc/xics_kvm.c
> > > > @@ -255,8 +255,21 @@ int ics_set_kvm_state_one(ICSState *ics, int srcno, Error **errp)
> > > >      state = irq->server;
> > > >      state |= (uint64_t)(irq->saved_priority & KVM_XICS_PRIORITY_MASK)
> > > >          << KVM_XICS_PRIORITY_SHIFT;
> > > > -    if (irq->priority != irq->saved_priority) {
> > > > -        assert(irq->priority == 0xff);
> > > > +
> > > > +    /*
> > > > +     * An interrupt can be masked either because the ICS is resetting, in
> > > > +     * which case we expect 'current priority' and 'saved priority' to be
> > > > +     * equal to 0xff, or because the guest has called the ibm,int-off RTAS
> > > > +     * call, in which case we have recorded the priority the interrupt
> > > > +     * had before it was masked in 'saved priority'. If the interrupt isn't
> > > > +     * masked, 'saved priority' and 'current priority' are equal (see
> > > > +     * ics_get_kvm_state()). Make sure we restore a sane state, otherwise
> > > > +     * fail migration.
> > > > +     */
> > > > +    if (irq->priority != irq->saved_priority && irq->priority != 0xff) {
> > > > +        error_setg(errp, "Corrupted state detected for interrupt source %d",
> > > > +                   srcno);
> > > > +        return -EINVAL;
> > > >      }
> > > >  
> > > >      if (irq->priority == 0xff) {
> > > > 
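
For reference, the invariant discussed in this thread can be sketched
outside of QEMU as below. This is only an illustration: struct
irq_sketch, SKETCH_MASKED and both sketch_check_*() helpers are
hypothetical names made up for this example, not QEMU code. It assumes
the two priorities are 8-bit values and that 0xff means "masked", as
described in the changelog above.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Hypothetical stand-in for the two per-source priorities discussed above. */
struct irq_sketch {
    uint8_t priority;        /* 'current priority'; 0xff means masked */
    uint8_t saved_priority;  /* priority to restore when unmasked */
};

#define SKETCH_MASKED 0xff

/*
 * Migration/load_vm path: the state comes from outside the program, so a
 * violated invariant is reported to the caller instead of aborting.
 * Returns 0 if the pair is consistent, -EINVAL otherwise.
 */
static int sketch_check_incoming(const struct irq_sketch *irq)
{
    if (irq->priority != irq->saved_priority &&
        irq->priority != SKETCH_MASKED) {
        return -EINVAL;
    }
    return 0;
}

/*
 * Reset path: the state is produced by the program itself, so a violated
 * invariant is a programming error and assert() documents it.
 */
static void sketch_check_reset(const struct irq_sketch *irq)
{
    assert(irq->priority == SKETCH_MASKED);
    assert(irq->saved_priority == SKETCH_MASKED);
}
```

With these helpers, a source with equal priorities or with a masked
current priority passes sketch_check_incoming(), while any other
mismatch returns -EINVAL. The split between the two helpers mirrors the
distinction made above between externally supplied state and state the
program sets up itself.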