On Wed, Apr 24, 2019 at 06:27:46PM +0200, Christoph Hellwig wrote:
> On Wed, Apr 24, 2019 at 07:02:21PM +0800, Ming Lei wrote:
> > Hennes reported the following kernel oops:
>
> Hannes?
>
> > + if (!blk_get_queue(ns->queue)) {
> > + ret = -ENXIO;
> > + goto out_free_queue;
> > + }
>
> If we always need to hold a reference, shouldn't blk_mq_init_queue
> return with that reference held (and yes, that means changes to
> every driver, but it seems like we need to audit all of them anyway..)
The issue is driver (NVMe) specific: the race window is just between
blk_cleanup_queue() and removing the ns from the controller namespace
list in nvme_ns_remove().
blk_mq_init_queue() does hold one refcount, and its counter-part is
blk_cleanup_queue().
It is simply ugly to ask blk_mq_init_queue() to grab a refcount for the
driver; who would then be the counter-part for releasing that extra refcount?
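To make the pairing concrete, here is a minimal userspace sketch of the
refcount flow described above. It is only a toy model, not kernel code:
struct toy_queue and the toy_*() helpers are hypothetical stand-ins for
the request queue, blk_mq_init_queue(), blk_get_queue(), blk_put_queue()
and blk_cleanup_queue(), and the assumption that a get fails once cleanup
has started is a simplification of the real behaviour.

```c
#include <stdbool.h>

/* Toy stand-in for the request queue; not the kernel structure. */
struct toy_queue {
	int refcount;   /* outstanding references */
	bool dying;     /* set once cleanup has started */
	bool released;  /* set when the last reference is dropped */
};

/* Models blk_mq_init_queue(): returns with one reference held. */
static void toy_init_queue(struct toy_queue *q)
{
	q->refcount = 1;
	q->dying = false;
	q->released = false;
}

/* Models blk_get_queue(): the driver's extra reference; fails if dying. */
static bool toy_get_queue(struct toy_queue *q)
{
	if (q->dying)
		return false;
	q->refcount++;
	return true;
}

/* Models blk_put_queue(): the counter-part of toy_get_queue(). */
static void toy_put_queue(struct toy_queue *q)
{
	if (--q->refcount == 0)
		q->released = true;
}

/* Models blk_cleanup_queue(): the counter-part of toy_init_queue(). */
static void toy_cleanup_queue(struct toy_queue *q)
{
	q->dying = true;
	toy_put_queue(q);
}
```

In this model, a driver that took its own reference with toy_get_queue()
keeps the queue alive across toy_cleanup_queue() until it calls
toy_put_queue() itself, which is where the ns would be removed from the
list. Each get has an obvious matching put in the same driver.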
>
> It seems like the queue lifetimes are a bit of a mess, and I'm not sure
> if this just papers over the problem.
Could you explain a bit what the mess is?
Thanks,
Ming