On Sun, Jan 18, 2009 at 11:04:53AM +0100, Matthias Kilian wrote:
> On Sat, Jan 17, 2009 at 11:21:03PM +0000, Ian Lynagh wrote:
> > OK, that mostly worked, but ffi014 timed out.
> > 
> > It turns out that "hello world" linked with -threaded -debug deadlocks
> > at:
> >     ASSERT_LOCK_HELD(&sched_mutex);
> > at the start of newBoundTask. This is doing
> >     ASSERT(pthread_mutex_lock(&sched_mutex) == EDEADLK)
> > which requires us to be using error-checking mutexes. However, initMutex
> > says:
> >     void
> >     initMutex(Mutex* pMut)
> >     {
> >     #if defined(DEBUG)
> >         pthread_mutexattr_t attr;
> >         pthread_mutexattr_init(&attr);
> >         pthread_mutexattr_settype(&attr,PTHREAD_MUTEX_ERRORCHECK_NP);
> >         pthread_mutex_init(pMut,&attr);
> >     #else
> >         pthread_mutex_init(pMut,NULL);
> >     #endif
> >         return;
> >     }
> 
> Err... what's the special meaning of PTHREAD_MUTEX_ERRORCHECK_NP?
> It doesn't exist on OpenBSD and it isn't mentioned in IEEE Std
> 1003.1.

Ah, sorry, should have said: there's nothing special about the "NP" bit:

    PTHREAD_MUTEX_ERRORCHECK_NP for ‘‘error checking’’ mutexes.  As the NP
    suffix indicates, this is a non-portable extension to the POSIX
    standard and should not be employed in portable programs.

In fact, looking at pthread.h on Linux:

    #ifdef __USE_UNIX98
      ,
      PTHREAD_MUTEX_NORMAL = PTHREAD_MUTEX_TIMED_NP,
      PTHREAD_MUTEX_RECURSIVE = PTHREAD_MUTEX_RECURSIVE_NP,
      PTHREAD_MUTEX_ERRORCHECK = PTHREAD_MUTEX_ERRORCHECK_NP,
      PTHREAD_MUTEX_DEFAULT = PTHREAD_MUTEX_NORMAL
    #endif

so maybe we can just use PTHREAD_MUTEX_ERRORCHECK everywhere.
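
A minimal sketch of what that could look like, assuming every platform of
interest provides the UNIX98 name PTHREAD_MUTEX_ERRORCHECK (not verified
here). The function name init_mutex_errorcheck is hypothetical and is not
the RTS's initMutex; the _XOPEN_SOURCE define is only there so that glibc
exposes the non-_NP names shown in the header excerpt above:

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 700   /* ask glibc to expose the non-_NP mutex types */
#endif
#include <pthread.h>

/* Hypothetical stand-in for a DEBUG-build initMutex: always request an
 * error-checking mutex through the portable (non-_NP) name.
 * Returns 0 on success, or the first pthread error code encountered. */
int init_mutex_errorcheck(pthread_mutex_t *mut)
{
    pthread_mutexattr_t attr;
    int rc;

    rc = pthread_mutexattr_init(&attr);
    if (rc != 0)
        return rc;

    rc = pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
    if (rc == 0)
        rc = pthread_mutex_init(mut, &attr);

    /* The attribute object is no longer needed once the mutex exists. */
    pthread_mutexattr_destroy(&attr);
    return rc;
}
```

Unlike the initMutex quoted above, this version reports failures to the
caller instead of ignoring the return codes.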

> > so on OS X we were using the default normal/fast mutexes. OS X does have
> > PTHREAD_MUTEX_ERRORCHECK (and validate passes if we use it, apart from
> > hpc_ghc_ghci working on OS X), but I don't know what other OSes use, or
> > if they have no equivalent.
> 
> On OpenBSD, PTHREAD_MUTEX_ERRORCHECK is the default (if you pass
> NULL as the second argument to pthread_mutex_init()), and

"The default kind is  ‘‘fast’’." on Linux.

> I think the ASSERT_LOCK_HELD is the problem, at least here on
> OpenBSD. ASSERT_LOCK_HELD tries to obtain a lock on a mutex that
> already is locked, and will either block or err out, depending on
> the mutex type (PTHREAD_MUTEX_ERRORCHECK or PTHREAD_MUTEX_NORMAL).
> It will do so even if the thread calling pthread_mutex_lock() owns
> the lock on the mutex. For example, this one will block at the
> second call to pthread_mutex_lock:
> 
>       pthread_mutexattr_t a;
>       pthread_mutex_t m;
> 
>       pthread_mutexattr_init(&a);
>       pthread_mutexattr_settype(&a, PTHREAD_MUTEX_NORMAL);
>       pthread_mutex_init(&m, &a);
>       pthread_mutex_lock(&m);
>       pthread_mutex_lock(&m); /* blocking here */
> 
> IMHO there's no portable and clean way to check whether you already
> have a lock on a mutex or not (but I didn't yet have breakfast, so
> this statement may be wrong). Regardless of whether you're trying
> pthread_mutex_lock with a mutex type of PTHREAD_MUTEX_ERRORCHECK
> or whether you're trying pthread_mutex_trylock (which never blocks),
> you get the wrong result, at least on OpenBSD.

Really? FWIW, my Linux manpage says:
    If the mutex is of the ‘‘error checking’’ kind, pthread_mutex_lock
    returns immediately with the error code  EDEADLK.

What "wrong result" do you get with PTHREAD_MUTEX_ERRORCHECK and
pthread_mutex_lock?
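
The behaviour described in that manpage excerpt can be checked with a few
lines of C. This is only a sketch: the helper names are hypothetical, and
what platforms other than Linux return from the second lock is exactly the
open question here.

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 700   /* expose PTHREAD_MUTEX_ERRORCHECK on glibc */
#endif
#include <errno.h>
#include <pthread.h>

/* Hypothetical helper: initialise *m as an error-checking mutex.
 * Returns 0 on success, else a pthread error code. */
static int make_errorcheck_mutex(pthread_mutex_t *m)
{
    pthread_mutexattr_t a;
    int rc = pthread_mutexattr_init(&a);
    if (rc != 0)
        return rc;
    rc = pthread_mutexattr_settype(&a, PTHREAD_MUTEX_ERRORCHECK);
    if (rc == 0)
        rc = pthread_mutex_init(m, &a);
    pthread_mutexattr_destroy(&a);
    return rc;
}

/* Lock an error-checking mutex twice from the same thread and return the
 * result of the second call, or -1 if the setup itself fails. */
int relock_result(void)
{
    pthread_mutex_t m;
    int second;

    if (make_errorcheck_mutex(&m) != 0)
        return -1;
    if (pthread_mutex_lock(&m) != 0) {
        pthread_mutex_destroy(&m);
        return -1;
    }

    second = pthread_mutex_lock(&m);   /* should not block for this type */

    pthread_mutex_unlock(&m);
    pthread_mutex_destroy(&m);
    return second;
}

/* 1 if the relock was rejected with EDEADLK, 0 otherwise. */
int relock_gives_edeadlk(void)
{
    return relock_result() == EDEADLK;
}
```

If relock_gives_edeadlk() returns 0 on some platform, the value from
relock_result() is the "wrong result" worth reporting.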

> Even on Linux, ASSERT_LOCK_HELD may have undesired effects: if you
> call it on an unlocked mutex, you will get a positive result and
> at the same time have locked the mutex.

And the assert (that the result was EDEADLK) will fail, so the program
terminates.
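
The unlocked case can be sketched the same way. The expression inside
ASSERT_LOCK_HELD is pulled out into a function so that its value can be
inspected instead of aborting; all names below are hypothetical, and an
error-checking mutex is assumed.

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 700   /* expose PTHREAD_MUTEX_ERRORCHECK on glibc */
#endif
#include <errno.h>
#include <pthread.h>

/* The condition ASSERT_LOCK_HELD asserts: 1 if the calling thread already
 * holds m. If m was unlocked, this call acquires it as a side effect and
 * the condition is 0. */
static int lock_held_expr(pthread_mutex_t *m)
{
    return pthread_mutex_lock(m) == EDEADLK;
}

/* Evaluate the condition on a freshly created, unlocked error-checking
 * mutex. Returns the condition's value, or -1 if the setup fails. */
int held_expr_on_unlocked_mutex(void)
{
    pthread_mutexattr_t a;
    pthread_mutex_t m;
    int held;

    if (pthread_mutexattr_init(&a) != 0)
        return -1;
    if (pthread_mutexattr_settype(&a, PTHREAD_MUTEX_ERRORCHECK) != 0 ||
        pthread_mutex_init(&m, &a) != 0) {
        pthread_mutexattr_destroy(&a);
        return -1;
    }
    pthread_mutexattr_destroy(&a);

    held = lock_held_expr(&m);      /* locks m, since nobody held it */

    if (!held)
        pthread_mutex_unlock(&m);   /* undo the side effect before cleanup */
    pthread_mutex_destroy(&m);
    return held;
}
```

The condition comes out as 0 here, so a real ASSERT on it would fire; the
stray lock it took only matters if the program carries on afterwards.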

> The only way to implement ASSERT_LOCK_HELD I can think of at the moment
> would be something like
> 
> #define ASSERT_LOCK_HELD(m) \
>       ASSERT(pthread_mutex_unlock(m) == 0 && pthread_mutex_lock(m) == 0)
> 
> which would introduce the risk that some other thread obtains the
> lock between the two calls and causes the thread with the ASSERT
> being blocked on pthread_mutex_lock() until the mutex is unlocked
> again.

Yes, that is definitely not a good idea.


Thanks
Ian

_______________________________________________
Cvs-ghc mailing list
Cvs-ghc@haskell.org
http://www.haskell.org/mailman/listinfo/cvs-ghc
