On 14/01/2014 15:24, Eric Farman wrote:
> When an unplug is triggered via QMP, the routine scsi_req_cancel
> is called to cancel any outstanding requests. However, the I/Os
> themselves were instantiated via an asynchronous call that will
> drive scsi_*_complete routines after the unplug call stack finishes.
> As all references to the request have been released by the cancel
> thread, the scsi_*_complete routines experience a range of failures
> when it attempts to manipulate the released storage.
This should never happen. See scsi_req_cancel:
void scsi_req_cancel(SCSIRequest *req)
{
    trace_scsi_req_cancel(req->dev->id, req->lun, req->tag);
    if (!req->enqueued) {
        return;
    }
    scsi_req_ref(req);
    scsi_req_dequeue(req);
    req->io_canceled = true;
    if (req->ops->cancel_io) {
        req->ops->cancel_io(req);
    }
    if (req->bus->info->cancel) {
        req->bus->info->cancel(req);
    }
    scsi_req_unref(req);
}
After req->ops->cancel_io returns, the following invariant must hold:
Any AIO callbacks will have been called before req->ops->cancel_io
returns, or they never will.
The invariant is also present in bdrv_aio_cancel, and should be respected
at all levels: dma_aio_cancel in dma-helpers.c, thread_pool_cancel in
thread-pool.c, laio_cancel in block/linux-aio.c, and so on.
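To make the invariant concrete, here is a minimal sketch in plain C. It is
a toy model and not QEMU code: ToyAIOCB, toy_count_cb, toy_aio_complete and
toy_aio_cancel are hypothetical names, and the -1 return value merely stands
in for a "canceled" error code. The cancel routine here picks the first half
of the invariant (run the callback before returning); an implementation that
drops the callback instead would satisfy it equally well.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical toy AIO control block; not QEMU's real AIOCB type. */
typedef struct ToyAIOCB {
    void (*cb)(void *opaque, int ret);  /* completion callback */
    void *opaque;                       /* argument passed to cb */
    bool pending;                       /* true until cb is delivered */
} ToyAIOCB;

/* Example callback: counts how many times it was invoked. */
void toy_count_cb(void *opaque, int ret)
{
    (void)ret;
    (*(int *)opaque)++;
}

/* Completion path: deliver the callback at most once. */
void toy_aio_complete(ToyAIOCB *acb, int ret)
{
    if (!acb->pending) {
        return;                 /* already delivered or canceled */
    }
    acb->pending = false;
    acb->cb(acb->opaque, ret);
}

/* Cancel path: run the callback synchronously, so that nothing can
 * fire after this function returns. */
void toy_aio_cancel(ToyAIOCB *acb)
{
    toy_aio_complete(acb, -1);  /* -1 stands in for "canceled" */
    assert(!acb->pending);      /* postcondition: nothing outstanding */
}
```

With a ToyAIOCB whose pending flag is set, calling toy_aio_cancel invokes
the callback once, and any later toy_aio_complete on the same block is a
no-op.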
scsi_cancel_io (in hw/scsi/scsi-disk.c) is very careful in its handling
of reference counts and aiocb, with the exact purpose of triggering an
assertion failure if the invariant is not respected.
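The idea can be sketched roughly as follows. Again this is a toy model under
my own assumptions, not the code in scsi-disk.c: ToyReq, toy_req_unref,
toy_io_complete and toy_cancel_io are hypothetical names, aiocb_set stands in
for "req->aiocb != NULL", and the cancel_runs_callback flag simulates the two
outcomes the invariant allows. The sketch assumes that the cancel path takes
over the reference held for the in-flight I/O.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical toy request; not QEMU's SCSIRequest. */
typedef struct ToyReq {
    int refcount;       /* references held on the request */
    bool aiocb_set;     /* stands in for req->aiocb != NULL */
    bool io_canceled;   /* set once cancellation has started */
    int completions;    /* how many times the callback ran */
} ToyReq;

void toy_req_unref(ToyReq *r)
{
    assert(r->refcount > 0);
    r->refcount--;
}

/* Completion callback for the in-flight I/O. */
void toy_io_complete(ToyReq *r)
{
    /* A callback arriving after cancel has cleared aiocb_set trips
     * this assertion instead of touching freed storage. */
    assert(r->aiocb_set);
    r->aiocb_set = false;
    r->completions++;
    if (!r->io_canceled) {
        toy_req_unref(r);   /* normal path drops the I/O reference */
    }
}

/* Cancel path. cancel_runs_callback simulates whether the underlying
 * cancel delivers the callback synchronously or drops it. */
void toy_cancel_io(ToyReq *r, bool cancel_runs_callback)
{
    r->io_canceled = true;
    if (r->aiocb_set) {
        if (cancel_runs_callback) {
            toy_io_complete(r);
        }
        toy_req_unref(r);   /* cancel owns the I/O reference */
    }
    r->aiocb_set = false;
}
```

In both outcomes the I/O reference is dropped exactly once and aiocb_set ends
up false, so a callback that fires later violates the invariant and is caught
by the assertion in toy_io_complete.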
Now that I look more at the code, at least this patch is needed:
diff --git a/hw/scsi/scsi-disk.c b/hw/scsi/scsi-disk.c
index bce617c..ee1f5eb 100644
--- a/hw/scsi/scsi-disk.c
+++ b/hw/scsi/scsi-disk.c
@@ -2306,6 +2306,7 @@ static const SCSIReqOps scsi_disk_emulate_reqops
     .send_command = scsi_disk_emulate_command,
     .read_data    = scsi_disk_emulate_read_data,
     .write_data   = scsi_disk_emulate_write_data,
+    .cancel_io    = scsi_cancel_io,
     .get_buf      = scsi_get_buf,
 };
but it should only have an effect in very special cases, with commands
such as UNMAP, WRITE SAME or MODE SELECT.
Paolo