On 17-01-13 09:23 AM, Michael S. Tsirkin wrote: > On Thu, Jan 12, 2017 at 01:45:19PM -0800, John Fastabend wrote: >> Add support for XDP adjust head by allocating a 256B header region >> that XDP programs can grow into. This is only enabled when a XDP >> program is loaded. >> >> In order to ensure that we do not have to unwind queue headroom push >> queue setup below bpf_prog_add. It reads better to do a prog ref >> unwind vs another queue setup call. >> >> At the moment this code must do a full reset to ensure old buffers >> without headroom on program add or with headroom on program removal >> are not used incorrectly in the datapath. Ideally we would only >> have to disable/enable the RX queues being updated but there is no >> API to do this at the moment in virtio so use the big hammer. In >> practice it is likely not that big of a problem as this will only >> happen when XDP is enabled/disabled changing programs does not >> require the reset. There is some risk that the driver may either >> have an allocation failure or for some reason fail to correctly >> negotiate with the underlying backend in this case the driver will >> be left uninitialized. I have not seen this ever happen on my test >> systems and for what its worth this same failure case can occur >> from probe and other contexts in virtio framework. >> >> Signed-off-by: John Fastabend <john.r.fastab...@intel.com> >> ---
[...] >> >> +#define VIRTIO_XDP_HEADROOM 256 >> + >> +static int init_vqs(struct virtnet_info *vi); >> +static void remove_vq_common(struct virtnet_info *vi, bool lock); >> + >> +/* Reset virtio device with RTNL held this is very similar to the >> + * freeze()/restore() logic except we need to ensure locking. It is >> + * possible that this routine may fail and leave the driver in a >> + * failed state. However assuming the driver negotiated correctly >> + * at probe time we _should_ be able to (re)negotiate driver again. >> + */ >> +static int virtnet_xdp_reset(struct virtnet_info *vi) >> +{ >> + struct virtio_device *vdev = vi->vdev; >> + unsigned int status; >> + int i, ret; >> + >> + /* Disable and unwind rings */ >> + virtio_config_disable(vdev); >> + vdev->failed = vdev->config->get_status(vdev) & VIRTIO_CONFIG_S_FAILED; >> + >> + netif_device_detach(vi->dev); > > After this point, netif_device_present > will return false, and then we have a bunch of code > that does > if (!netif_device_present(dev)) > return -ENODEV; > > > so we need to audit this code to make sure it's > all called under rtnl, correct? > Correct. In the XDP case it is. > We don't want it to fail because of timing. > > Maybe add an assert there. > I can add an assert here to ensure it doesn't ever get refactored out or something. > >> + cancel_delayed_work_sync(&vi->refill); >> + if (netif_running(vi->dev)) { >> + for (i = 0; i < vi->max_queue_pairs; i++) >> + napi_disable(&vi->rq[i].napi); >> + } >> + >> + remove_vq_common(vi, false); >> + >> + /* Do a reset per virtio spec recommendation */ >> + vdev->config->reset(vdev); >> + >> + /* Acknowledge that we've seen the device. */ >> + status = vdev->config->get_status(vdev); >> + vdev->config->set_status(vdev, status | VIRTIO_CONFIG_S_ACKNOWLEDGE); >> + >> + /* Notify driver is up and finalize features per specification. The >> + * error code from finalize features is checked here but should not >> + * fail because we assume features were previously synced successfully. >> + */ >> + status = vdev->config->get_status(vdev); >> + vdev->config->set_status(vdev, status | VIRTIO_CONFIG_S_DRIVER); >> + ret = virtio_finalize_features(vdev); > > I'd rather we put all this in virtio core, maybe call it virtio_reset or > something. At first I started to do this but decided it was easier to open code it I was on the fence though so if we think it would be cleaner then I will do it. The trick is needs to be broken down into two pieces, something like the following, virtio_reset() { do_generic_down_part -> pci pieces vdev->config->down() -> do down part of device specifics do_generic_up_part vdev->config->up() -> do up part of device specifics do_finalize_part } Alternatively we could reuse the freeze/restore device callbacks but those make assumptions about locking. So we could pass a context through but per Stephen's comment that gets a bit fragile. And sparse doesn't like it either apparently. I think making it an explicit down/up reset callback might make it clean and reusable for any other devices. Any thoughts? My preference outside of open coding it is the new down_reset and up_reset callbacks. > >> + if (ret) { >> + netdev_warn(vi->dev, "virtio_finalize_features failed during >> reset aborting\n"); >> + goto err; >> + } >> + >> + ret = init_vqs(vi); >> + if (ret) { >> + netdev_warn(vi->dev, "init_vqs failed during reset aborting\n"); >> + goto err; >> + } >> + virtio_device_ready(vi->vdev); >> + >> + if (netif_running(vi->dev)) { >> + for (i = 0; i < vi->curr_queue_pairs; i++) >> + if (!try_fill_recv(vi, &vi->rq[i], GFP_KERNEL)) >> + schedule_delayed_work(&vi->refill, 0); >> + >> + for (i = 0; i < vi->max_queue_pairs; i++) >> + virtnet_napi_enable(&vi->rq[i]); >> + } >> + netif_device_attach(vi->dev); >> + /* Finally, tell the device we're all set */ >> + status = vdev->config->get_status(vdev); >> + vdev->config->set_status(vdev, status | VIRTIO_CONFIG_S_DRIVER_OK); >> + virtio_config_enable(vdev); >> + >> + return 0; >> +err: >> + status = vdev->config->get_status(vdev); >> + vdev->config->set_status(vdev, status | VIRTIO_CONFIG_S_FAILED); > > And maybe virtio_fail ? > Sure. Thanks, John