[Bug libobjc/47031] libobjc uses mutexes for properties

js-gcc at webkeks dot org Sat, 01 Jan 2011 04:07:26 -0800

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47031


--- Comment #3 from js-gcc at webkeks dot org <js-gcc at webkeks dot org> 
2011-01-01 12:06:56 UTC ---
> The problem is that property accessors are basically general purpose routines
> that may be used in the most varied situations.

It does not matter very much in which situation a property is used. To choose
which type of lock to use, the only thing that matters is what is done while
the lock is held. In this case, no call into kernel-space is made at all and
only a small operation is done. Switching to kernel-space for a mutex is already
far more complex than what we do while holding the lock. If I had to guess, I'd
say switching to kernel-space is at least 100 times more expensive than what we
do.

> So, we have very little control or knowledge over when and how they are
> used --

Which we don't care about at all.

>  * we don't know how many CPUs or cores the user has

Does not really matter. If we have two cores, the spinlock can give control to
another thread after 10 spins using sched_yield().

So, if we only have one core and one thread spins because it waits for another
thread to release the lock, then we waste at most 10 tries. This is the
worst-case scenario.

If we have more than one core, another thread will most likely release the
lock before the waiting thread has even spun 10 times.

So, no matter how many cores there are, it does not perform worse than a mutex
(at least not in a measurable way), while on systems with many cores, it's a
huge improvement. Plus, changing a property is so fast that we will most likely
never encounter a locked spinlock. That would only happen if the scheduler gave
control to another thread before the property was changed.

So, with spinlocks, in 99% of the cases, it's not even measurable.
With mutexes, in 100% of the cases, it IS measurable.
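
As a minimal sketch of the spin-then-yield idea described above (this is not
libobjc's code; the names prop_spinlock, SPIN_LIMIT and the helper functions
are made up for illustration, and C11 atomics plus POSIX sched_yield() are
assumed to be available):

```c
#include <sched.h>       /* sched_yield() (POSIX) */
#include <stdatomic.h>   /* atomic_flag (C11) */

#define SPIN_LIMIT 10    /* the "10 spins" mentioned in the text */

/* Hypothetical lock type: a single atomic flag, no kernel object. */
typedef struct {
    atomic_flag held;
} prop_spinlock;

#define PROP_SPINLOCK_INIT { ATOMIC_FLAG_INIT }

/* Returns 1 if the lock was acquired, 0 if it was already held. */
static int prop_spinlock_trylock(prop_spinlock *l)
{
    return !atomic_flag_test_and_set_explicit(&l->held,
                                              memory_order_acquire);
}

/* Spin up to SPIN_LIMIT times, then yield the CPU and start over. */
static void prop_spinlock_lock(prop_spinlock *l)
{
    int spins = 0;

    while (!prop_spinlock_trylock(l)) {
        if (++spins >= SPIN_LIMIT) {
            spins = 0;
            sched_yield();   /* let the lock holder run */
        }
    }
}

static void prop_spinlock_unlock(prop_spinlock *l)
{
    atomic_flag_clear_explicit(&l->held, memory_order_release);
}
```

In the uncontended case, prop_spinlock_lock() succeeds on the first
test-and-set and never reaches sched_yield(), so the only time a system call
is made is after SPIN_LIMIT failed attempts.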

>  * we don't know how many threads the user is starting
>  * we don't know how many threads are sharing a CPU or core

We don't really care about them, I think.

>  * we don't know how intensively the user is using the property accessors

So, because we don't know how intensively the user is using properties, we will
make them slow on purpose?

> Spinlocks are appropriate when certain conditions are met; but in this case,
> it seems impossible to be confident that these are met. 

Which conditions are not met in your opinion? Please list the conditions that
you think are not met, as Apple clearly thinks they are all met. And so do I.

> A user may write a program with 3 or 4 threads running on his 1 CPU/core
> machine, which constantly read/write an atomic synthesized property to
> synchronize between themselves. Why not; but then, spinlocks would actually
> degrade performance instead of improving it.

This is actually why you call sched_yield() after 10 spins. It prevents a
thread from being stuck spinning while another thread could release the lock.


> Traditional locks may be slower if you have a low contention case, but work
> consistently OK in all conditions.

Yes, they are the same in all conditions because they are always more complex
and slower ;).

> * spinlocks are better/faster if there is low contention and very little
> chance that two threads enter the critical region (inside the accessors) at
> the same time.

This is the case here.

> * the difference in performance between mutexes and spinlocks only matters in
> the program performance if the accessors are called very often.

If you init a lot of objects and each of them initializes, let's say, 30
variables using properties, then 30 locks are acquired and released, although
no other thread could possibly access the object. But you still do 30
userland-to-kernel-space switches. For a single object! Now create 1000 objects.

With spinlocks, there won't be a single userland-to-kernel-space switch!

Just to demonstrate that we are talking about something which really can make a
huge difference…
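
To make the object-initialization example concrete, here is a rough sketch
that counts how often the lock is found free. It is only an illustration under
assumptions: struct demo_object, set_property() and the per-object atomic_flag
are hypothetical and do not reflect how libobjc stores or locks properties.

```c
#include <sched.h>       /* sched_yield() (POSIX) */
#include <stdatomic.h>   /* atomic_flag, atomic_ulong (C11) */

enum { NUM_PROPS = 30, NUM_OBJECTS = 1000 };

/* Hypothetical object layout: one spin flag guarding 30 properties. */
struct demo_object {
    atomic_flag lock;
    int props[NUM_PROPS];
};

static atomic_ulong uncontended_acquires; /* lock was free on first try */
static atomic_ulong contended_acquires;   /* had to wait at least once  */

/* Atomic-property-style setter: take the lock, store, release. */
static void set_property(struct demo_object *obj, int index, int value)
{
    if (atomic_flag_test_and_set_explicit(&obj->lock,
                                          memory_order_acquire)) {
        atomic_fetch_add(&contended_acquires, 1);
        /* Slow path: yield until the holder releases the lock. */
        while (atomic_flag_test_and_set_explicit(&obj->lock,
                                                 memory_order_acquire))
            sched_yield();
    } else {
        atomic_fetch_add(&uncontended_acquires, 1);
    }

    obj->props[index] = value;
    atomic_flag_clear_explicit(&obj->lock, memory_order_release);
}

/* Initialize NUM_OBJECTS objects from one thread; returns how many
 * acquisitions were contended. */
static unsigned long init_objects_demo(void)
{
    for (int i = 0; i < NUM_OBJECTS; i++) {
        struct demo_object obj;

        atomic_flag_clear(&obj.lock);
        for (int p = 0; p < NUM_PROPS; p++)
            set_property(&obj, p, p);
    }
    return atomic_load(&contended_acquires);
}
```

With a single initializing thread, all 1000 * 30 acquisitions take the first
branch, which is one atomic instruction in user space and no system call.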

I think the percentages you list cannot be used at all, as we don't have
applications that just do some math calculations and then quit. We don't want
something slow just because it might only be a small part of the program. We
want everything to be as fast as possible. Otherwise it adds up and makes for a
crappy user experience in interactive applications. Apple demonstrated this
quite well if you compare how crappy it felt a few years ago and how good it
feels now that they have started optimizing the small stuff as well.

> The only case where spinlocks really help is if the program spends lots of
> time calling accessors, and is not multi-threaded.  In which case, the
> programmer could get a huge speed-up by simply declaring the properties
> non-atomic.

Even in a threaded environment, it would make a huge difference. It's unlikely
that the lock is held, and only if it is held do you need some CPU time for
spinning. But with mutexes, you already switch to kernel-space each time you
merely check whether the lock is held.

> Would using spinlocks make
> accessors 2x faster ? 10x faster ? 10% faster ?

My guess is that usually the spinlock is not held, so I could imagine it being
a factor of 100 or even 1000 faster. I remember running a test a while ago
where I tried just locking and releasing a mutex and a spinlock and doing one
arithmetic operation in between. While the version with spinlocks took only a
few seconds, the version with mutexes still had not finished after a few hours,
which was when I aborted it.
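
A test of that shape could look roughly like the sketch below. It is not the
original test, which is not available; bench_mutex(), bench_spinlock() and
struct bench_result are hypothetical names, the spinlock is a plain C11
atomic_flag, and timing uses the CPU time reported by clock(). The numbers
depend heavily on the platform and its mutex implementation, so this is a way
to measure, not a confirmation of any particular factor.

```c
#include <pthread.h>     /* pthread_mutex_* (POSIX) */
#include <stdatomic.h>   /* atomic_flag (C11) */
#include <time.h>        /* clock(), CLOCKS_PER_SEC */

struct bench_result {
    unsigned long counter;   /* must equal the iteration count */
    double seconds;          /* CPU time spent in the loop */
};

/* Lock a mutex, do one arithmetic operation, unlock; repeat. */
static struct bench_result bench_mutex(unsigned long iterations)
{
    pthread_mutex_t m = PTHREAD_MUTEX_INITIALIZER;
    volatile unsigned long counter = 0;
    clock_t start = clock();

    for (unsigned long i = 0; i < iterations; i++) {
        pthread_mutex_lock(&m);
        counter++;
        pthread_mutex_unlock(&m);
    }

    clock_t end = clock();
    pthread_mutex_destroy(&m);
    return (struct bench_result){
        counter, (double)(end - start) / CLOCKS_PER_SEC
    };
}

/* Same loop, but guarded by a test-and-set spinlock. */
static struct bench_result bench_spinlock(unsigned long iterations)
{
    atomic_flag f = ATOMIC_FLAG_INIT;
    volatile unsigned long counter = 0;
    clock_t start = clock();

    for (unsigned long i = 0; i < iterations; i++) {
        while (atomic_flag_test_and_set_explicit(&f,
                                                 memory_order_acquire))
            ;   /* never spins here: only one thread is running */
        counter++;
        atomic_flag_clear_explicit(&f, memory_order_release);
    }

    clock_t end = clock();
    return (struct bench_result){
        counter, (double)(end - start) / CLOCKS_PER_SEC
    };
}
```

Calling both functions with the same iteration count and printing the two
seconds fields side by side gives the uncontended, single-threaded comparison;
a contended comparison would additionally need several threads.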
