On 11/27/2011 02:53 PM, Alan Modra wrote:
> +enum memmodel
> +{
> + MEMMODEL_RELAXED = 0,
> + MEMMODEL_CONSUME = 1,
> + MEMMODEL_ACQUIRE = 2,
> + MEMMODEL_RELEASE = 3,
> + MEMMODEL_ACQ_REL = 4,
> + MEMMODEL_SEQ_CST = 5,
> + MEMMODEL_LAST = 6
> +};
This should probably go to libgomp.h.
> /* This is a Linux specific implementation of a semaphore synchronization
> mechanism for libgomp. This type is private to the library. This
> + counting semaphore implementation uses atomic instructions and the
> + futex syscall, and a single 32-bit int to store semaphore state.
> + The low 31 bits are the count, the top bit is a flag set when some
> + threads may be waiting. */
I think we'll get better code generation on a number of targets if we make the
low bit the waiting bit, and the higher bits the count. This we do (count &
1) and (count + 2) instead of larger constants needing to be generated. Not
changed below...
> +static inline void
> +gomp_sem_wait (gomp_sem_t *sem)
> {
> + int count = *sem;
> +
> + while ((count & 0x7fffffff) != 0)
> + {
> + int oldval = count;
> + __atomic_compare_exchange_4 (sem, &oldval, count - 1,
> + false, MEMMODEL_ACQUIRE, MEMMODEL_RELAXED);
> + if (__builtin_expect (oldval == count, 1))
> + return;
> + count = oldval;
> + }
I'd really prefer not to hard-code any sizes at all. Let's change the explicit
_4 to _n everywhere.
The loop above doesn't actually make sense to me. If the compare-and-swap
succeeds, then oldval == count - 1. Which doesn't seem to be what you're
testing at all.
Perhaps this entire function is better written as
{
int count = *sem;
do
{
if (__builtin_expect (count & 0x80000000u, 0)
{
gomp_sem_wait_slow (sem, count);
return;
}
}
while (!__atomic_compare_exchange_n (sem, &count, count - 1, true,
MEMMODEL_ACQUIRE, MEMMODEL_RELAXED));
}
> +gomp_sem_post (gomp_sem_t *sem)
> {
> + int count = *sem;
> +
> + while (1)
> + {
> + int oldval = count;
> + __atomic_compare_exchange_4 (sem, &oldval, (count + 1) & 0x7fffffff,
> + false, MEMMODEL_RELEASE, MEMMODEL_RELAXED);
> + if (__builtin_expect (oldval == count, 1))
> + break;
Again, this equality doesn't make sense. Further, you aren't testing for the
high bit set inside the loop, nor are you preserving the high bit.
Perhaps better written as
{
int count = *sem;
do
{
if (__builtin_expect (count & 0x80000000u, 0))
{
gomp_sem_post_slow (sem, count);
return;
}
/* We can't post more than 2**31-1 times. */
assert (count < 0x7fffffff);
}
while (!__atomic_compare_exchange_n (sem, &count, count + 1, true,
MEMMODEL_RELEASE, MEMMODEL_RELAXED));
}
... All that said, I don't see how we can ever clear the wait bit once its set?
Are we better off splitting the 32-bit value into two 16-bit fields for
value+waiters? Or similarly splitting a 64-bit value? That way we can at
least update both fields atomically, and we ought never lose a waiter.
r~