Hello all. Late last year I posted a couple of questions about
multi-threaded application hangs in Solaris 10 for x86 platforms, and about
thread-safety of std::basic_string in general. This was an attempt to solve
persistent problems I have been experiencing with my application hanging due
to CPU utilization shooting to 100%, with the __gnu_cxx::__exchange_and_add
function frequently making appearances at the top of the stack trace of
several threads.
I believe I have made a breakthrough recently and wanted to solicit the
opinion of some experts on this. I seem to have narrowed the problem down
to running my application as root versus an unprivileged user, and further
isolated the suspected cause to varying thread priorities in my application.
I have theorized that spin-locks in gcc, particularly in the atomicity
__gnu_cxx::__exchange_and_add function, are causing higher priority threads
to consume all available CPU cycles while spinning indefinitely waiting for
a lower priority thread that holds the lock. Now I am already aware that
messing with thread priorities is dangerous and often an exercise in
futility, but I am surprised that something so elemental as an atomic
test-and-set operation that may be used extensively throughout gcc could
possibly be the culprit for all of the trouble I have been experiencing.
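
To make the suspected failure mode concrete, here is a minimal sketch of a
lock-based atomic add. It is not the libstdc++ source: the names
spin_exchange_and_add, g_lock and run_spin_demo are hypothetical, and
std::atomic_flag stands in for whatever test-and-set primitive the real
implementation uses. The point is only that a waiter burns CPU in the loop
until the holder runs again, so if the holder has lower priority and never
gets scheduled, the waiter spins forever.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Hypothetical global spin lock guarding the counter update.
static std::atomic_flag g_lock = ATOMIC_FLAG_INIT;

// Hypothetical stand-in for a lock-based exchange-and-add:
// returns the old value of *mem and adds val to it.
int spin_exchange_and_add(int* mem, int val) {
    // Busy-wait until the flag is acquired. A waiter never yields here,
    // so under strict priority scheduling a high-priority waiter can
    // starve a low-priority thread that currently holds the lock.
    while (g_lock.test_and_set(std::memory_order_acquire)) {
        // spin
    }
    int old = *mem;
    *mem = old + val;
    g_lock.clear(std::memory_order_release);
    return old;
}

// Runs `threads` workers, each adding 1 to a shared counter `iterations`
// times, and returns the final counter value. With equal priorities the
// spinning is short-lived and the demo terminates normally.
int run_spin_demo(int threads, int iterations) {
    assert(threads > 0 && iterations >= 0);
    int counter = 0;
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t) {
        pool.emplace_back([&counter, iterations] {
            for (int i = 0; i < iterations; ++i) {
                spin_exchange_and_add(&counter, 1);
            }
        });
    }
    for (auto& th : pool) {
        th.join();
    }
    return counter;
}
```

With all threads at the same priority this completes quickly. The hang I am
describing would only show up when the workers run at different priorities
under a scheduler that never preempts the spinning thread in favor of the
lock holder.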
More than anything I'm hoping for a sanity check on this, even if it's just
to confirm what may be obvious to others: that modifying thread priorities
is strictly off-limits except in extreme circumstances with careful control
over what operations are performed. Or perhaps there's another solution
that has eluded my searches, maybe a bug fix or some way of avoiding such
spin-locks in gcc, which would make varying thread priorities viable and safe.
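
As an illustration of what "avoiding such spin-locks" could look like, here
is a sketch that performs the same exchange-and-add with a single hardware
atomic instead of a lock. It assumes a C++11 compiler and uses
std::atomic<int>::fetch_add; the names lockfree_exchange_and_add and
run_atomic_demo are hypothetical, and whether a given gcc/libstdc++ build
can be made to use an equivalent path internally is something I have not
confirmed.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Hypothetical lock-free counterpart: returns the old value and adds val.
// No thread ever waits on another thread here, so a preempted or
// low-priority thread cannot cause others to spin.
int lockfree_exchange_and_add(std::atomic<int>& mem, int val) {
    return mem.fetch_add(val, std::memory_order_acq_rel);
}

// Runs `threads` workers, each adding 1 to a shared counter `iterations`
// times, and returns the final counter value.
int run_atomic_demo(int threads, int iterations) {
    assert(threads > 0 && iterations >= 0);
    std::atomic<int> counter(0);
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t) {
        pool.emplace_back([&counter, iterations] {
            for (int i = 0; i < iterations; ++i) {
                lockfree_exchange_and_add(counter, 1);
            }
        });
    }
    for (auto& th : pool) {
        th.join();
    }
    return counter.load();
}
```

Because each update is one indivisible operation, its progress does not
depend on any other thread being scheduled, which is the property that
matters when priorities differ.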
Thanks in advance for any insight, and at the very least I hope that this
will serve as a warning to others who might find themselves in the same
situation.
Cheers,
Chad Attermann