Thread starvation and resource saturation in atomicity functions?

2008-03-31 Thread Chad Attermann


Hello all.  Late last year I posted a couple of questions about 
multi-threaded application hangs in Solaris 10 for x86 platforms, and about 
thread-safety of std::basic_string in general.  This was an attempt to solve 
persistent problems I have been experiencing with my application hanging due 
to CPU utilization shooting to 100%, with the __gnu_cxx::__exchange_and_add 
function frequently making appearances at the top of the stack trace of 
several threads.


I believe I have made a breakthrough recently and wanted to solicit the
opinion of some experts on this.  I seem to have narrowed the problem down
to running my application as root versus an unprivileged user, and further
isolated the suspected cause to varying thread priorities in my application.
I have theorized that spin-locks in gcc, particularly in the atomicity
__gnu_cxx::__exchange_and_add function, are causing higher priority threads
to consume all available CPU cycles while spinning indefinitely waiting for
a lower priority thread that holds the lock.  Now I am already aware that
messing with thread priorities is dangerous and often an exercise in
futility, but I am surprised that something so elemental as an atomic
test-and-set operation that may be used extensively throughout gcc could
possibly be the culprit for all of the trouble I have been experiencing.


More than anything I'm hoping for a sanity check on this, even if it's just
to confirm what may be obvious to others: that modifying thread priorities
is strictly off-limits except in extreme circumstances with careful control
over what operations are performed.  Or perhaps there's another solution
that has eluded my searches, maybe a bug fix or some way of avoiding such
spin-locks in gcc, so that varying thread priorities become viable and safe.


Thanks in advance for any insight, and at the very least I hope that this 
will serve as a warning to others who might find themselves in the same 
situation.


Cheers,

Chad Attermann



Re: Thread starvation and resource saturation in atomicity functions?

2008-03-31 Thread Chad Attermann


"Ian Lance Taylor" <[EMAIL PROTECTED]> writes:



"Chad Attermann" <[EMAIL PROTECTED]> writes:


Hello all.  Late last year I posted a couple of questions about
multi-threaded application hangs in Solaris 10 for x86 platforms, and
about thread-safety of std::basic_string in general.  This was an
attempt to solve persistent problems I have been experiencing with my
application hanging due to CPU utilization shooting to 100%, with the
__gnu_cxx::__exchange_and_add function frequently making appearances
at the top of the stack trace of several threads.

I believe I have made a breakthrough recently and wanted to solicit
the opinion of some experts on this.  I seem to have narrowed the
problem down to running my application as root versus an unprivileged
user, and further isolated the suspected cause to varying thread
priorities in my application. I have theorized that spin-locks in gcc,
particularly in the atomicity __gnu_cxx::__exchange_and_add function,
are causing higher priority threads to consume all available CPU
cycles while spinning indefinitely waiting for a lower priority thread
that holds the lock.  Now I am already aware that messing with thread
priorities is dangerous and often an exercise in futility, but I am
surprised that something so elemental as an atomic test-and-set
operation that may be used extensively throughout gcc could possibly
be the culprit for all of the trouble I have been experiencing.


You explicitly mentioned x86.  For x86, __gnu_cxx::__exchange_and_add
does not use a spin-lock.

If you mean that other code may use spin locks built on top of
__exchange_and_add, then, yes, in that case you could be getting a
priority inversion.  But gcc itself does not use any such code.  So if
you are seeing a problem of this sort, it is not a problem with gcc.
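
Where application code does need a lock shared between threads of different priorities, one commonly used mitigation for priority inversion is a mutex with the POSIX priority-inheritance protocol instead of a hand-rolled spin lock. The following is only a minimal sketch, assuming the platform implements the `PTHREAD_PRIO_INHERIT` option; the helper names `init_priority_inheriting_mutex` and `locked_increment` are hypothetical and do not come from this thread.

```cpp
#include <pthread.h>
#include <cassert>

// Hypothetical helper: initializes *m as a mutex that uses the POSIX
// priority-inheritance protocol. Returns 0 on success or an error code.
inline int init_priority_inheriting_mutex(pthread_mutex_t* m)
{
    pthread_mutexattr_t attr;
    int rc = pthread_mutexattr_init(&attr);
    if (rc != 0)
        return rc;

    // While a higher-priority thread is blocked on the mutex, the
    // holder temporarily runs at that higher priority.
    rc = pthread_mutexattr_setprotocol(&attr, PTHREAD_PRIO_INHERIT);
    if (rc == 0)
        rc = pthread_mutex_init(m, &attr);

    pthread_mutexattr_destroy(&attr);
    return rc;
}

// Hypothetical example use: increment a shared counter under the mutex
// and return the new value. Waiters block instead of spinning.
inline int locked_increment(pthread_mutex_t* m, int* counter)
{
    int rc = pthread_mutex_lock(m);
    assert(rc == 0 && "lock failed");
    int updated = ++*counter;
    rc = pthread_mutex_unlock(m);
    assert(rc == 0 && "unlock failed");
    (void)rc;  // silence unused-variable warnings when NDEBUG is set
    return updated;
}
```

A blocked waiter sleeps rather than burning CPU, so a low-priority lock holder still gets scheduled and can release the lock.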



I doubted it at first too, but from what I can tell, the version of gcc that
ships with Solaris 10 x86 is gcc 3.4.3 for i386.  Below is the output of the
pre-installed gcc given the -v switch:


   Reading specs from /usr/sfw/lib/gcc/i386-pc-solaris2.10/3.4.3/specs
   Configured with: /builds/sfw10-gate/usr/src/cmd/gcc/gcc-3.4.3/configure --prefix=/usr/sfw --with-as=/usr/sfw/bin/gas --with-gnu-as --with-ld=/usr/ccs/bin/ld --without-gnu-ld --enable-languages=c,c++ --enable-shared
   Thread model: posix
   gcc version 3.4.3 (csl-sol210-3_4-branch+sol_rpath)


Below is a snippet of code from gcc 3.4.3 source 
./libstdc++-v3/config/cpu/i386/atomicity.h:


  _Atomic_word
  __attribute__ ((__unused__))
  __exchange_and_add(volatile _Atomic_word* __mem, int __val)
  {
    register _Atomic_word __result, __tmp = 1;

    // Obtain the atomic exchange/add spin lock.
    do
      {
        __asm__ __volatile__ ("xchg{l} {%0,%1|%1,%0}"
                              : "=m" (_Atomicity_lock<0>::_S_atomicity_lock),
                                "+r" (__tmp)
                              : "m" (_Atomicity_lock<0>::_S_atomicity_lock));
      }
    while (__tmp);

    __result = *__mem;
    *__mem += __val;

    // Release spin lock.
    _Atomicity_lock<0>::_S_atomicity_lock = 0;

    return __result;
  }

I cannot confirm that the i386 code is what was included in the gcc build,
but it appears that way from the signature.  Is this perhaps a problem with
the way that gcc 3.4.3 shipping with Solaris 10 x86 was built?  Should it
have opted instead for the i486 version, which does not use spin-locks?
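
For comparison with the spin-lock version above, here is a minimal sketch of a lock-free exchange-and-add. It is written with C++11 `std::atomic` rather than the libstdc++ internals, which is an assumption made for illustration: gcc 3.4.3 predates `<atomic>`, and the names `exchange_and_add_lockfree` and `exchange_and_add_selfcheck` are hypothetical. On x86 targets of i486 and later, compilers typically lower `fetch_add` to a single locked add instruction, so no thread waits on a shared lock word.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical stand-in for __exchange_and_add: atomically adds val to
// mem and returns the value mem held before the addition.
inline int exchange_and_add_lockfree(std::atomic<int>& mem, int val)
{
    // A single atomic read-modify-write; there is no global lock word
    // for a high-priority thread to spin on.
    return mem.fetch_add(val, std::memory_order_acq_rel);
}

// Small self-check of the semantics: the old value is returned and the
// stored value is updated.
inline void exchange_and_add_selfcheck()
{
    std::atomic<int> counter(5);
    assert(exchange_and_add_lockfree(counter, 3) == 5);
    assert(counter.load() == 8);
}
```

Because the operation completes in one instruction, a thread cannot be preempted while "holding" anything, which removes the window in which a lower priority thread could stall higher priority ones.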




Re: Thread starvation and resource saturation in atomicity functions?

2008-04-01 Thread Chad Attermann


"Ian Lance Taylor" <[EMAIL PROTECTED]> writes:



"Chad Attermann" <[EMAIL PROTECTED]> writes:


I cannot confirm that the i386 code is what was included in the gcc
build, but it appears that way from the signature.  Is this perhaps a
problem with the way that gcc 3.4.3 shipping with Solaris 10 x86 was
built?  Should it have opted instead for the i486 version, which does
not use spin-locks?


Yes, barring the extremely unlikely case that you need to run a plain
i386, you should use the i486 code from libstdc++-v3/config/cpu/i486.
There are various difficulties with the i386 atomicity code.
Fortunately the i486 was released almost 20 years ago, and it's
generally safe to use the i486 instructions today.

Ian


Running at least i486 code would make sense on AMD Opteron processors.  I am
shocked that the gcc version shipped by Sun Microsystems would be compiled
for i386.  I compiled my own version of gcc 4.2.2 on the same platform and
it too appears to have used i386 code.  Perhaps the gcc build configuration
process for Solaris is flawed?  Regardless, I will be attempting to build a
new version today that is forced to use the i486 code.  I would appreciate
any tips you have.


Thanks,

Chad.



Re: Thread starvation and resource saturation in atomicity functions?

2008-04-01 Thread Chad Attermann


"Chad Attermann" <[EMAIL PROTECTED]> writes:

Running at least i486 code would make sense on AMD Opteron processors.  I
am shocked that the gcc version shipped by Sun Microsystems would be
compiled for i386.  I compiled my own version of gcc 4.2.2 on the same
platform and it too appears to have used i386 code.  Perhaps the gcc build
configuration process for Solaris is flawed?  Regardless, I will be
attempting to build a new version today that is forced to use the i486
code.  I would appreciate any tips you have.


My bad... I was mistakenly thinking I needed to rebuild gcc in order to get
i486 code.  In reality I should only need to specify the architecture type
when building my own application using "-march=i486", or perhaps even
"-march=opteron" in my own case.


As stated in the gcc docs, i386 is the default instruction set for "i386 and
x86-64 family of computers" when the architecture is not explicitly defined,
so presumably atomic test-and-set operations will use spin-locks by default.
So I suppose the moral of the story remains... exercise extreme caution
when using varying thread priorities.
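
One hedged way to see what the chosen architecture flags imply is to test a compiler-predefined macro. Recent GCC releases predefine `__GCC_HAVE_SYNC_COMPARE_AND_SWAP_4` when the selected target has a native 4-byte compare-and-swap, which is the case for i486 and later but not for plain i386. The sketch below assumes such a compiler; the function names are hypothetical, and the result only reflects the flags used for this translation unit, not how an already-built libstdc++ was configured.

```cpp
#include <cassert>

// True when the compiler reports a native 4-byte compare-and-swap for
// the target selected on the command line (for example via -march).
inline bool has_native_cas4()
{
#if defined(__GCC_HAVE_SYNC_COMPARE_AND_SWAP_4)
    return true;
#else
    return false;
#endif
}

// Hypothetical startup guard: aborts (in builds without NDEBUG) when
// the translation unit was compiled for a target without native CAS.
inline void require_native_cas4()
{
    assert(has_native_cas4() && "compiled without native 4-byte CAS");
}
```

Calling `require_native_cas4()` early in `main` would flag a build that silently fell back to the i386 default.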


Regards.