On Mon, Mar 07, 2016 at 12:49:04PM -0700, Alex Williamson wrote: > On Mon, 29 Feb 2016 18:06:21 +1100 > David Gibson <[email protected]> wrote: > > > At present the code handling IBM's Enhanced Error Handling (EEH) interface > > on VFIO devices operates by bypassing the usual VFIO logic with > > vfio_container_ioctl(). That's a poorly designed interface with unclear > > semantics about exactly what can be operated on. > > > > In particular it operates on a single vfio container internally (hence the > > name), but takes an address space and group id, from which it deduces the > > container in a rather roundabout way. groupids are something that code > > outside vfio shouldn't even be aware of. > > > > This patch creates new interfaces for EEH operations. Internally we > > have vfio_eeh_container_op() which takes a VFIOContainer object > > directly. For external use we have vfio_eeh_as_ok() which determines > > if an AddressSpace is usable for EEH (at present this means it has a > > single container and at most a single group attached), and > > vfio_eeh_as_op() which will perform an operation on an AddressSpace in > > the unambiguous case, and otherwise returns an error. > > > > This interface still isn't great, but it's enough of an improvement to > > allow a number of cleanups in other places. > > > > Signed-off-by: David Gibson <[email protected]> > > --- > > hw/vfio/common.c | 84 > > ++++++++++++++++++++++++++++++++++++++++++++++++++ > > include/hw/vfio/vfio.h | 2 ++ > > 2 files changed, 86 insertions(+) > > > > diff --git a/hw/vfio/common.c b/hw/vfio/common.c > > index 607ec70..b61118e 100644 > > --- a/hw/vfio/common.c > > +++ b/hw/vfio/common.c > > @@ -1003,3 +1003,87 @@ int vfio_container_ioctl(AddressSpace *as, int32_t > > groupid, > > > > return vfio_container_do_ioctl(as, groupid, req, param); > > } > > + > > +/* > > + * Interfaces for IBM EEH (Enhanced Error Handling) > > + */ > > +static bool vfio_eeh_container_ok(VFIOContainer *container) > > +{ > > + /* A broken kernel implementation means EEH operations won't work > > + * correctly if there are multiple groups in a container. So > > + * return true only if there is exactly one group attached to the > > + * container */ > > Please don't add a new comment style to the file. What's broken about > the kernel implementation? It would be great if someone reading this > comment could understand the "why" rather than just the "what".
Ok, I've altered the style and expanded the details.
> > +
> > + if (QLIST_EMPTY(&container->group_list)) {
> > + return false;
> > + }
> > +
> > + if (QLIST_NEXT(QLIST_FIRST(&container->group_list), container_next)) {
> > + return false;
> > + }
> > +
> > + return true;
> > +}
> > +
> > +static int vfio_eeh_container_op(VFIOContainer *container, uint32_t op)
> > +{
> > + struct vfio_eeh_pe_op pe_op = {
> > + .argsz = sizeof(pe_op),
> > + .op = op,
> > + };
> > + int ret;
> > +
> > + if (!vfio_eeh_container_ok(container)) {
> > + error_report("vfio/eeh: EEH_PE_OP 0x%x called on container"
> > + " with multiple groups", op);
>
> I know you rejected this before, but why is vfio_eeh_container_ok() not
> called vfio_eeh_singleton_container() since that's what it's checking
> for?
Because the intention is that when the kernel gets fixed, this will be
altered to succeed if we see whatever capability we use to advertise
the fixed kernel.
> This error should also say "not singleton", or something to that
> effect, since it might have failed for having no groups. The line wrap
> could also easily be done after the %x to make searching for the error
> string easier.
I've adjusted the message to address those points.
>
> > + return -EPERM;
> > + }
> > +
> > + ret = ioctl(container->fd, VFIO_EEH_PE_OP, &pe_op);
> > + if (ret < 0) {
> > + error_report("vfio/eeh: EEH_PE_OP 0x%x failed: %m", op);
> > + return -errno;
> > + }
> > +
> > + return 0;
> > +}
> > +
> > +static VFIOContainer *vfio_eeh_as_container(AddressSpace *as)
> > +{
> > + VFIOAddressSpace *space = vfio_get_address_space(as);
> > + VFIOContainer *container = NULL;
> > +
> > + if (QLIST_EMPTY(&space->containers)) {
> > + /* No containers to act on */
> > + goto out;
> > + }
> > +
> > + container = QLIST_FIRST(&space->containers);
> > +
> > + if (QLIST_NEXT(container, next)) {
> > + /* We don't yet have logic to synchronize EEH state across
> > + * multiple containers */
> > + container = NULL;
> > + goto out;
> > + }
> > +
> > +out:
> > + vfio_put_address_space(space);
> > + return container;
> > +}
> > +
> > +bool vfio_eeh_as_ok(AddressSpace *as)
> > +{
> > + VFIOContainer *container = vfio_eeh_as_container(as);
> > +
> > + return (container != NULL) && vfio_eeh_container_ok(container);
> > +}
> > +
> > +int vfio_eeh_as_op(AddressSpace *as, uint32_t op)
> > +{
> > + VFIOContainer *container = vfio_eeh_as_container(as);
> > +
> > + /* Shouldn't be called unless vfio_eeh_as_ok() returned true */
> > + assert(container);
>
> Why not just let vfio_eeh_container_op() test for this too and return
> -ENODEV? I'm generally not a fan of asserts when we could just return
> an error to the caller?
Ok, I'll change this.
>
> > + return vfio_eeh_container_op(container, op);
> > +}
> > diff --git a/include/hw/vfio/vfio.h b/include/hw/vfio/vfio.h
> > index 0b26cd8..fd3933b 100644
> > --- a/include/hw/vfio/vfio.h
> > +++ b/include/hw/vfio/vfio.h
> > @@ -5,5 +5,7 @@
> >
> > extern int vfio_container_ioctl(AddressSpace *as, int32_t groupid,
> > int req, void *param);
> > +bool vfio_eeh_as_ok(AddressSpace *as);
> > +int vfio_eeh_as_op(AddressSpace *as, uint32_t op);
> >
> > #endif
>
--
David Gibson | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson
signature.asc
Description: PGP signature
