On 11/27/2011 02:53 PM, Alan Modra wrote:
> +enum memmodel
> +{
> +  MEMMODEL_RELAXED = 0,
> +  MEMMODEL_CONSUME = 1,
> +  MEMMODEL_ACQUIRE = 2,
> +  MEMMODEL_RELEASE = 3,
> +  MEMMODEL_ACQ_REL = 4,
> +  MEMMODEL_SEQ_CST = 5,
> +  MEMMODEL_LAST = 6
> +};

This should probably go to libgomp.h.

>  /* This is a Linux specific implementation of a semaphore synchronization
>     mechanism for libgomp.  This type is private to the library.  This 
> +   counting semaphore implementation uses atomic instructions and the
> +   futex syscall, and a single 32-bit int to store semaphore state.
> +   The low 31 bits are the count, the top bit is a flag set when some
> +   threads may be waiting.  */

I think we'll get better code generation on a number of targets if we make the 
low bit the waiting  bit, and the higher bits the count.  This we do (count & 
1) and (count + 2) instead of larger constants needing to be generated.  Not 
changed below...

> +static inline void
> +gomp_sem_wait (gomp_sem_t *sem)
>  {
> +  int count = *sem;
> +
> +  while ((count & 0x7fffffff) != 0)
> +    {
> +      int oldval = count;
> +      __atomic_compare_exchange_4 (sem, &oldval, count - 1,
> +                                false, MEMMODEL_ACQUIRE, MEMMODEL_RELAXED);
> +      if (__builtin_expect (oldval == count, 1))
> +     return;
> +      count = oldval;
> +    }

I'd really prefer not to hard-code any sizes at all.  Let's change the explicit 
_4 to _n everywhere.

The loop above doesn't actually make sense to me.  If the compare-and-swap 
succeeds, then oldval == count - 1.  Which doesn't seem to be what you're 
testing at all.

Perhaps this entire function is better written as

{
  int count = *sem;
  do
    {
      if (__builtin_expect (count & 0x80000000u, 0)
        {
          gomp_sem_wait_slow (sem, count);
          return;
        }
    }
  while (!__atomic_compare_exchange_n (sem, &count, count - 1, true,
                                       MEMMODEL_ACQUIRE, MEMMODEL_RELAXED));
}

> +gomp_sem_post (gomp_sem_t *sem)
>  {
> +  int count = *sem;
> +
> +  while (1)
> +    {
> +      int oldval = count;
> +      __atomic_compare_exchange_4 (sem, &oldval, (count + 1) & 0x7fffffff,
> +                                false, MEMMODEL_RELEASE, MEMMODEL_RELAXED);
> +      if (__builtin_expect (oldval == count, 1))
> +     break;

Again, this equality doesn't make sense.  Further, you aren't testing for the
high bit set inside the loop, nor are you preserving the high bit.

Perhaps better written as

{
  int count = *sem;
  do
    {
      if (__builtin_expect (count & 0x80000000u, 0))
        {
          gomp_sem_post_slow (sem, count);
          return;
        }
      /* We can't post more than 2**31-1 times.  */
      assert (count < 0x7fffffff);
    }
  while (!__atomic_compare_exchange_n (sem, &count, count + 1, true,
                                       MEMMODEL_RELEASE, MEMMODEL_RELAXED));
}

... All that said, I don't see how we can ever clear the wait bit once its set? 
 Are we better off splitting the 32-bit value into two 16-bit fields for 
value+waiters?  Or similarly splitting a 64-bit value?  That way we can at 
least update both fields atomically, and we ought never lose a waiter.


r~

Reply via email to