On Fri, Jul 20, 2018 at 10:00:22AM +0100, Jonathan Kew wrote:
+1 to that. Our need for OMT access to prefs is only likely to grow, IMO, and we should just fix it once, so that any code (regardless of which thread(s) it may eventually run on) can trivially read prefs.

Even if that means we can't adopt Robin Hood hashing, I think the trade-off would be well worthwhile.

This is exactly the kind of performance footgun I'm talking about. The marginal cost of making something threadsafe may be low, but those costs pile up. The cost of locking in hot code adds up quickly enough on its own. Throwing away an important optimization on top of that adds up even faster. And getting into the habit, then repeating the pattern elsewhere, makes the costs compound.

Threads are great. Threadsafe code is useful. But data that's owned by a single thread is still best when you can manage it, and it should be the default option whenever we reasonably can. We already pay the price of being overzealous about thread safety in other areas (for instance, all of our strings require atomic refcounting, even though DOM strings are generally only used by a single thread). I think the trend needs to be in the opposite direction.
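
To make the string example concrete, here's a minimal sketch (purely illustrative, not Gecko's actual string code) of the difference between the two refcounting strategies:

  #include <atomic>
  #include <cstdint>

  // Illustrative sketch only -- not the real string buffer. A threadsafe
  // buffer has to bump its refcount with an atomic read-modify-write, even
  // when (as with most DOM strings) only one thread ever touches it.
  struct AtomicRefCounted {
    std::atomic<uint32_t> mRefCnt{1};
    void AddRef() { mRefCnt.fetch_add(1, std::memory_order_relaxed); }
  };

  // A single-thread-owned buffer can use a plain increment, which is
  // cheaper and which the compiler and CPU can optimize far more freely.
  struct PlainRefCounted {
    uint32_t mRefCnt = 1;
    void AddRef() { ++mRefCnt; }
  };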

Rust and JS make this much easier than C++ does, unfortunately. But we're getting better at it in C++, and moving more and more code to Rust.

On Thu, Jul 19, 2018 at 2:19 PM, Kris Maglione <kmagli...@mozilla.com> wrote:
On Tue, Jul 17, 2018 at 03:49:41PM -0700, Jeff Gilbert wrote:

We should totally be able to afford the very low cost of a
rarely-contended lock. What's going on that causes uncached pref reads
to show up so hot in profiles? Do we have a list of problematic pref
keys?


So, at the moment, we read about 10,000 preferences at startup in debug
builds. That number is probably slightly lower in non-debug builds, but we
don't collect stats there. We're working on reducing that number (which is
why we collect statistics in the first place), but for now, it's still quite
high.


As for the cost of locks... On my machine, in a tight loop, the cost of
entering and exiting a MutexAutoLock is about 37ns. This is pretty close to
ideal circumstances, on a single core of a very fast CPU, with very fast
RAM, everything cached, and no contention. If we could extrapolate that to
normal usage, it would be about a third of a ms of additional overhead for
startup. I've fought hard enough for 1ms startup time improvements that I
don't consider a third of a millisecond trivial, but *shrug*, if it were
really that simple, it might be acceptable.
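
For anyone who wants to reproduce that number, the measurement is roughly this kind of tight loop (sketched here with std::mutex/std::lock_guard as a stand-in for mozilla::Mutex and MutexAutoLock; exact numbers will obviously vary by machine):

  #include <chrono>
  #include <cstdio>
  #include <mutex>

  // Tight-loop lock/unlock microbenchmark: uncontended, single core,
  // everything hot in cache -- i.e. best-case conditions.
  int main() {
    constexpr int kIterations = 10000000;
    std::mutex lock;
    auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < kIterations; ++i) {
      std::lock_guard<std::mutex> guard(lock);  // enter + exit the lock
    }
    auto end = std::chrono::steady_clock::now();
    double nsPerLock =
        std::chrono::duration<double, std::nano>(end - start).count() /
        kIterations;
    // ~37ns per lock/unlock * ~10,000 startup pref reads is roughly 0.37ms
    // of extra startup time, under these ideal assumptions.
    printf("%.1f ns per lock/unlock\n", nsPerLock);
    return 0;
  }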

But I have no reason to think the lock would be rarely contended. We read
preferences *a lot*, and if we allowed access from background threads, I
have no doubt that we would start reading them a lot from background threads
in addition to reading them a lot from the main thread.

And that would mean, in addition to lock contention, cache contention and
potentially even NUMA issues. Those last two apply to atomic var caches too,
but at least they generally apply only to the specific var caches being
accessed off-thread, rather than pref look-ups in general.
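
For anyone unfamiliar with var caches, the idea is roughly the following (a generic sketch with made-up names, not the actual Preferences API): a single pref's value gets mirrored into an atomic variable, so off-main-thread readers only ever touch that one variable instead of the pref table and a lock.

  #include <atomic>

  // Generic sketch of an "atomic var cache"; names are invented for
  // illustration and this is not the real Preferences machinery.

  // Mirror of one bool pref, readable from any thread.
  static std::atomic<bool> sFeatureEnabled{false};

  // Invoked (on the main thread) whenever the underlying pref changes.
  void OnFeaturePrefChanged(bool aNewValue) {
    sFeatureEnabled.store(aNewValue, std::memory_order_relaxed);
  }

  // Readable from any thread. Any cache-line bouncing or NUMA traffic is
  // confined to this one variable, not to pref lookups in general.
  bool FeatureEnabled() {
    return sFeatureEnabled.load(std::memory_order_relaxed);
  }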


Maybe we could get away with it at first, as long as off-thread usage
remains low. But long term, I think it would be a performance foot-gun. And,
paradoxically, the less foot-gunny it is, the less useful it probably is,
too. If we're only using it off-thread in a few places, and don't have to
worry about contention, why are we bothering with locking and off-thread
access in the first place?
