Re: nanosleep(2): don't tsleep(9) if timeout is empty

Scott Cheloha Thu, 27 Dec 2018 08:42:05 -0800

> On Dec 26, 2018, at 6:51 AM, Mark Kettenis <mark.kette...@xs4all.nl> wrote:
> 
>> Date: Tue, 25 Dec 2018 23:11:45 -0500
>> From: Scott Cheloha <scottchel...@gmail.com>
>> 
>> On Tue, Dec 25, 2018 at 11:24:54PM +0100, Mark Kettenis wrote:
>>> 
>>> [...]
>>> 
>>> Time to beat you over the head with POSIX ;)
>> 
>> Oh joy!
>> 
>> All I *really* wanted for Christmas was an interpretation of a standards
>> document :)
>> 
>>>   The nanosleep() function shall cause the current thread to be
>>>   suspended from execution until...
>>> 
>>> Which can be interpreted such that nanosleep() should always suspend
>>> the thread (unless an error is returned) even if te interval is 0ns.
>> 
>> Hmm... I guess that depends on whether the "current thread of execution"
>> includes stuff running in the kernel.  As far as the application is
>> concerned, all bets are off when we leave the application code, no?  POSIX
>> says in other places that execution is "suspended" during a system call,
>> but in many of those cases I'm sure the typical implementation is still
>> executing code.
>> 
>> This interpretation of "suspends" would also exclude (perhaps naive) wakeup
>> latency optimizations like busy-waiting with delay(9) (or an equivalent)
>> in order to time out as close to the caller's request as possible.  I'm
>> pretty sure this is not unheard of, though I'd have to dig a little deeper
>> to find examples on other systems.
> 
> Yes, that is exactly my interpretation of "suspends" and pretty much
> the root of my concern about your diff.


Okay.  I mean, returning quickly on an empty timeout is far cry from the
stuff I'm probing about here.  Just exploring what is fair game.

> In the end OpenBSD is *not* an RT OS.

Ack.

> Any expectation that nanosleep(2) can be used for precise timing needs
> to be fought with fire ;).
> 
> That doesn't mean we shouldn't consider improving the way we handle
> timeouts to be more accurate of course.

In that case... I have another diff you might like better; more later in
another thread.

>>> I think you need a more convincing argument to change this (actual
>>> code that misbehaves with the current implementation) especially since
>>> your diff adds complexity.
>> 
>> That's fair, I will keep an ear to the ground for spinning applications.
>> 
>> I might have a winner already, though.  Earlier today I saw iridium busy
>> sleeping: repeat nanosleeps with zero'd timespec structs.  I have a ktrace
>> here, but I have yet to reproduce.  Need to fuss with it a bit more.
>> 
>> fwiw, here's an exerpt:
>> 
>> 90240/241378  iridium  1545768212.972668 CALL  nanosleep(0x84928d42a78,0)
>> 90240/123819  iridium  1545768212.972670 CALL  nanosleep(0x84928d42a78,0)
>> 90240/241378  iridium  1545768212.972703 STRU  struct timespec { 0 }
>> 90240/123819  iridium  1545768212.972706 STRU  struct timespec { 0 }
>> 90240/169304  iridium  1545768212.983015 CALL  nanosleep(0x84928d42a78,0)
>> 90240/169304  iridium  1545768212.983095 STRU  struct timespec { 0 }
>> 90240/525129  iridium  1545768212.983100 CALL  nanosleep(0x84928d42a78,0)
>> 90240/525129  iridium  1545768212.984049 STRU  struct timespec { 0 }
>> 90240/241378  iridium  1545768212.993270 CALL  nanosleep(0x84928d42a78,0)
>> 90240/241378  iridium  1545768212.993351 STRU  struct timespec { 0 }
>> 90240/123819  iridium  1545768212.997352 CALL  nanosleep(0x84928d42a78,0)
>> 90240/123819  iridium  1545768212.997431 STRU  struct timespec { 0 }
>> 90240/558012  iridium  1545768212.998511 CALL  nanosleep(0x84928d42a78,0)
>> 90240/558012  iridium  1545768212.998516 STRU  struct timespec { 0 }
>> 
>> ... in my 13 second trace there are ~3000 zero'd nanosleep calls across
>> several threads.  It went like that, eating up 1.5 CPUs worth of time until
>> I killed it... ugh, chrome...
> 
> Would be interesting to figure out where exactly in the code this
> happens.  But given the size (and quality) of the code base that is by
> no means a trivial thing.

Would you believe it was idling on about:blank when it started?

> But in view of this observation I'm inclined to think the current
> behaviour of nanosleep(2) is preferable.

Okay, I will pursue other angles.

Re: nanosleep(2): don't tsleep(9) if timeout is empty

Reply via email to