Re: mtx_enter_try(9) & recursion

Martin Pieuchot Sun, 02 Jun 2019 11:40:58 -0700

On 02/06/19(Sun) 15:50, Visa Hankala wrote:
> On Sun, Jun 02, 2019 at 10:29:01AM -0300, Martin Pieuchot wrote:
> > On 02/06/19(Sun) 04:30, Visa Hankala wrote:
> > > On Sat, Jun 01, 2019 at 07:04:23PM -0300, Martin Pieuchot wrote:
> > > > On 01/06/19(Sat) 23:22, Mark Kettenis wrote:
> > > > > > Date: Sat, 1 Jun 2019 17:32:52 -0300
> > > > > > From: Martin Pieuchot <m...@openbsd.org>
> > > > > > 
> > > > > > Currently it isn't safe to call mtx_enter_try(9) if you're already
> > > > > > holding the mutex.  That means it isn't safe to call that function
> > > > > > in hardclock(9), like with `windup_mtx'.  That's why the mutex needs
> > > > > > to be initialized as IPL_CLOCK.
> > > > > > 
> > > > > > I'm working on removing the SCHED_LOCK() from inside hardclock(9).
> > > > > > That leads me to wonder if I should initialize all mutexes to 
> > > > > > IPL_SCHED,
> > > > > > possibly blocking clock interrupts, or if we should change the 
> > > > > > mutex API
> > > > > > to allow mtx_enter_try(9) to deal with recursion.
> > > > > > 
> > > > > > The diff below removes the recursion check for mtx_enter_try(9).
> > > > > > 
> > > > > > Comments?  Oks?
> > > > > 
> > > > > My initial reaction is that if you're trying to lock when you already
> > > > > have the lock, there is something wrong with your locking strategy and
> > > > > that this is something we don't want.
> > > > 
> > > > Could you elaborate?  Are you saying that preventing hardclock(9) to run
> > > > is the way to move forward to unlock its internals?  Why isn't that
> > > > strategy wrong?
> > > > 
> > > > In the `windup_mtx' case, does it matter if the mutex is taken by
> > > > another CPU or by myself?  What's the problem when CPU0 is one holding
> > > > the lock?
> > > 
> > > mutex(9) is not and should not become recursive. Recursive locking
> > > works when it is voluntary. If recursion was allowed with interrupts,
> > > the CPU could re-enter the critical section at any moment, possibly
> > > seeing inconsistent state or breaking assumptions made by the original
> > > entry.
> > 
> > I don't understand how your answer is related to my diff :(
> > 
> > Why allowing mtx_enter_try(9) to fail and not to panic(9) when the CPU
> > executing the code already owns the mutex is wrong?
> > 
> > Is that what you call recursion?  Not entering the critical section in
> > the interrupt instead of blocking the interrupt?
> > 
> > How is that different from not entering the critical section because
> > another CPU is holding the mutex?
> 
> To me, it looks very suspicious if a CPU tries to lock a mutex that it
> already holds because it indicates that there is a potential recursion.
> My stance is that such a locking behaviour is a design error in general
> and the current way of checking recursion in mtx_enter_try() is sound.


But why?  Nobody is answering my question :o)  You're saying recursion
is bad because there's recursion.

How is that different from using mtx_enter_try(9) to protect from
another CPU being in the critical section?  Why do you call that a
design error?

I don't mind putting all mutexes at IPL_SCHED.  I would like to
understand why it matters *who* is holding the mutex.  Since
mtx_enter_try(9) is here to not enter the critical section for $reasons,
setting an ipl of IPL_CLOCK sounds just like a special corner case.

You might understand why I'm asking.  We are trying to get more stuff
done in parallel.  That implies separating parts of the kernel logically.
Blocking IPL_CLOCK is a big hammer, it block a lot of stuff.  So yes it
is easy to use, but my feeling is that this simplicity comes at a cost,
latency wise.

Anyway I'll drop this diff.  I'll try to thing about your point :o)

Re: mtx_enter_try(9) & recursion

Reply via email to