On Wed, May 23, 2018 at 01:45:30AM +0100, Al Viro wrote:

> Oh, bugger...
> 
> wakeup
>       removed from queue
>       schedule __aio_poll_complete()
> 
> cancel
>       grab ctx->lock
>       remove from list
> work
>       aio_complete()
>               check if it's in the list
>               it isn't, move on to free the sucker
> cancel
>       call ->ki_cancel()
>       BOOM
> 
> Looks like we want to call ->ki_cancel() *BEFORE* removing from the list,
> as well as doing fput() after aio_complete().  The same ordering, BTW, goes
> for aio_read() et al.
> 
> Look:
> CPU1: io_cancel() grabs ->ctx_lock, finds iocb and removes it from the list.
> CPU2: aio_rw_complete() on that iocb.  Since the sucker is not in the list
> anymore, we do NOT spin on ->ctx_lock and proceed to free iocb
> CPU1: pass freed iocb to ->ki_cancel().  BOOM.

BTW, it seems that the mainline is vulnerable to this one.  I might be
missing something, but...
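
For illustration only, here is a userspace sketch of the ordering proposed
in the quoted text: the canceller invokes the cancel callback while the
request is still on the list and the lock is held, and only then unlinks it.
Everything in it is hypothetical (struct ctx, struct req, ctx_cancel(),
req_complete(), a one-slot "list", a pthread mutex standing in for the
context lock); it is not the fs/aio.c code.  It also simplifies the
completion side to take the lock unconditionally rather than checking list
membership first.

```c
/* Userspace sketch with hypothetical names; NOT the kernel's fs/aio.c. */
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stdlib.h>

struct req;

struct ctx {
	pthread_mutex_t lock;	/* stands in for the context lock */
	struct req *active;	/* one-slot stand-in for the active list */
	int cancel_calls;	/* how many times a cancel callback ran */
};

struct req {
	struct ctx *ctx;
	void (*cancel)(struct req *);	/* stands in for ->ki_cancel() */
};

static void count_cancel(struct req *r)
{
	r->ctx->cancel_calls++;
}

static void ctx_init(struct ctx *c)
{
	pthread_mutex_init(&c->lock, NULL);
	c->active = NULL;
	c->cancel_calls = 0;
}

/* Allocate a request and put it on the (one-slot) list. */
static struct req *req_submit(struct ctx *c)
{
	struct req *r = calloc(1, sizeof(*r));

	if (!r)
		return NULL;
	r->ctx = c;
	r->cancel = count_cancel;
	pthread_mutex_lock(&c->lock);
	c->active = r;
	pthread_mutex_unlock(&c->lock);
	return r;
}

/*
 * Canceller: look the request up under the lock, call the callback
 * while it is STILL listed, and only then unlink it.  The completer
 * cannot free the request until the lock is dropped.
 */
static bool ctx_cancel(struct ctx *c)
{
	struct req *r;
	bool found = false;

	pthread_mutex_lock(&c->lock);
	r = c->active;
	if (r) {
		r->cancel(r);		/* callback first ... */
		c->active = NULL;	/* ... removal second */
		found = true;
	}
	pthread_mutex_unlock(&c->lock);
	return found;
}

/* Completer: unlink under the lock, free only after dropping it. */
static void req_complete(struct req *r)
{
	struct ctx *c = r->ctx;

	pthread_mutex_lock(&c->lock);
	if (c->active == r)
		c->active = NULL;
	pthread_mutex_unlock(&c->lock);
	free(r);
}
```

With this ordering a canceller that found the request keeps the lock for as
long as it touches it, so the window between "removed from the list" and
"callback invoked" described above does not exist in the sketch.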
