Re: nsThread now leaks runnables if dispatch fails

Randell Jesup Mon, 04 Jan 2016 16:10:53 -0800

>Kyle Huey writes:
>
>> (This is a continuation of the discussion in bug 1218297)
>>
>> In bug 1155059 we made nsIEventTarget::Dispatch take an
>> already_AddRefed instead of a raw pointer.  This was done to allow the
>> dispatcher to transfer its reference to the runnable to the thread the
>> runnable will run on.  That solves a race condition we've had
>> historically where the destructor of a runnable can run on either the
>> dispatching thread or the executing thread, depending on whether the
>> executing thread can run the event to completion before the
>> dispatching thread destroys the nsCOMPtr on the stack.
>
>IIUC solving the race condition wasn't the goal of the change in
>API, but something that was done to retain existing workarounds
>for leaks.


Solving the "who knows what thread this will be released on" was a
primary goal.  See comment 0, and the discussion here it references.

Independently, bsmedberg wanted to make Dispatch infallible (at least
normally), thus making the pattern of "avoid leak in the case of
Dispatch failing" irrelevant (once done, which it hasn't been).

I started the conversion of Dispatch(rawptr) in the tree to
Dispatch(already_AddRefed<>); xidorn took over finishing it but hasn't
yet.  We should re-energize that.

There's considerable discussion in the bug of when leaks occur and also
the assumed behavior of DispatchToMainThread (which is especially
failure-prone because of how XPCOM dispatch works - even when MainThread
still exists, that can fail in shutdown).

>> In bug 1218297 we saw a case where dispatch to a thread (the socket
>> transport service thread in this case) fails because the thread has
>> already shut down.  In our brave new world, nsThread simply leaks the
>> runnable.

Yup.  In cases where we anticipate a possible Dispatch failure (which is
supposed to become impossible, but isn't currently) you can use the
(still-existing) raw ptr interface and handle Dispatch failure
explicitly to release the (leaked) reference, if it's safe.  Not all
cases are safe to release in that case (which is what drove the initial
bug filing, where it tried to release JS objects on Dispatch failure off
mainthread).  Leaking is better than crashing/sec-bugs.

If the problem is that when this happens, a leak is reported by infra,
then we could ping the leak-checker that there was a dispatch failure
and leaks are expected.  Even better but maybe not possible would be
annotate the root of the leak and suppress anything tied to it.

>> It can't release the reference it holds, because that would
>> reintroduce the race condition we wanted to avoid, and it can't
>> release the reference on the correct thread so it's already gone away.
>> But now we have a new source of intermittent leaks.
>>
>> Was this anticipated when designing bug 1155059?  I don't think
>> leaking is acceptable here, so we may need to back that out and return
>> to living with that race condition.
>
>I agree this approach is not going to work out.  Short term, I
>think the situation could be improved and most of those changes
>kept by adding a DispatchOrLeak() variant that can be used by the
>callers that want to leak.  Then we still have the leaks we had
>prior to 2265e031ab97 but the new ones are resolved.
>
>For the remaining (old) leaks I can think of two options:
>
>1) Never call Dispatch() when it might fail and assert that it
>   does not.
>
>2) Keep and additional reference to the runnable in the caller if
>   it doesn't want Dispatch() to release the last reference.
>
>With the former, the caller needs to ensure the thread lives long
>enough for the Dispatch() to succeed, or that clean up happens
>before the thread is closed.  With the latter, the caller needs to
>find another way to release when Dispatch() fails.
>
>Making Dispatch() infallible would permit only option 1.  Keeping
>Dispatch() fallible allows the caller to use either option.

There's another (likely impractical) option: wrap all thread-sensitive
pointers in a ReleaseRefOnThread<> holder which refuses to release
objects on the wrong thread and RELEASE_ASSERTS, and integrate that with
Dispatch's failure path to not assert if you try to release it due to a
Dispatch failure and instead leak and mark it somehow for automation.

Leaks are annoying, but only because of automation meant to catch
persistent/accumulating leaks - shutdown-time-only leaks are far less
concerning, and (waving hands) 99% of Dispatch()-failures that create
leaks are in shutdown.

Also - blocking event processing and DispatchToMainthread before the
final cycle-collect adds to the problem since anything released in the
final CC pass can't DispatchToMainthread; this has forced a number of
tricks or bits of code to handle Dispatch failure.

Once upon a time we didn't do very much with other threads, and also
didn't pass JS references around, etc.  We use threads a lot more and in
much more complex ways now.

As Nathan said in the bug:
> Well, the runnables themselves are certainly safe to refcount
> anywhere, but the contents of the runnables might not be...

Sec bugs and crashes are worse than leaks, and they can occur
if we have non-testable random off-thread releases, which is what we
had, and still do for things not switched to already_AddRefed<>.

I'd be good with flagging/asserting on Dispatch failures *before* shutdown
begins as a step to making it infallible, since those have much more
potential to be logic flaws and/or create accumulating leaks.

-- 
Randell Jesup, Mozilla Corp
remove "news" for personal email
_______________________________________________
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform

Re: nsThread now leaks runnables if dispatch fails

Reply via email to