Package: ruby1.8
Version: 1.8.7.72-3
Severity: important
X-Debbugs-cc: debian-ad...@lists.debian.org

Hi there,

I've just managed to capture a reasonable strace of a problem that we've
been seeing with puppet on several platforms.  This is on alpha, but I'm
fairly sure we've also seen it on armel and possibly hppa as well.
Basically, sleep doesn't sleep.  An eventloop with a wakeup timer just
ends up in a tight loop, which means our slower machines can't do
anything except run a single ruby thread that is supposed to be
sleeping.

I am assuming this is a bug in the thread/select implementation in ruby,
but feel free to reassign if it turns out to be libc/kernel/whatever
related.

Any bugs in the test code are presumably mine - my ruby is fairly mediocre
at best, and my goal was not to write a perfect event loop but to
reproduce as faithfully as possible the event loop that puppet uses
without actually doing anything useful besides the event loop.

Cheers,

Test code (I have puppet installed, but a smaller set of libraries is
probably sufficient):

class SleepTest
    require 'sync'
    require "puppet/external/event-loop"
    include SignalObserver

    @@syncresources = {}

    def timer
        unless defined? @timer
            @timer = EventLoop::Timer.new(
                :interval => 3600,
                :tolerance => 1,
                :start? => true
            )
            EventLoop.current.monitor_timer @timer
        end
        @timer
    end

    # Return the sync object associated with a given resource.
    def sync(resource)
        @@syncresources[resource] ||= Sync.new
        return @@syncresources[resource]
    end

    def threadlock(resource, type = Sync::EX)
        self.sync(resource).synchronize(type) do
            yield
        end
    end

    def newtimer(hash, &block)
        self.timer()
        timer = nil
        threadlock(:timers) do
            @timers ||= []
            timer = EventLoop::Timer.new(hash)
            @timers << timer
    
            if block_given?
                observe_signal(timer, :alarm, &block)
            end
        end
    
        # In case they need it for something else.
        timer
    end

    def newthread(&block)
        @threads ||= []

        @threads << Thread.new do
            yield
        end
    end

end

sleepy = SleepTest.new
sleepy.newthread do
    begin
        timer = sleepy.timer()
        EventLoop.monitor_timer timer
        FileTest.exist?('/bin/sh')
    end
end
EventLoop.run

Strace (begins with Thread.new call):

clone(Process 31149 attached
child_stack=0x20000d70b90, 
flags=CLONE_VM|CLONE_FS|CLONE_FILES|CLONE_SIGHAND|CLONE_THREAD|CLONE_SYSVSEM|CLONE_SETTLS|CLONE_PARENT_SETTID|CLONE_CHILD_CLEARTID,
 parent_tidptr=0x20000d712f0, tls=0x20000d71910, child_tidptr=0x20000d712f0) = 
31149
[pid 31148] osf_sigprocmask(0x1, 0, 0)  = 0
[pid 31148] osf_sigprocmask(0x1, 0, 0)  = 0
[pid 31148] osf_sigprocmask(0x1, 0, 0)  = 0
[pid 31148] osf_sigprocmask(0x1, 0, 0x2000054a2c0) = 0
[pid 31148] osf_sigprocmask(0x1, 0, 0)  = 0
[pid 31148] osf_sigprocmask(0x1, 0, 0x2ac1) = 0
[pid 31148] osf_sigprocmask(0x1, 0, 0xb91 <unfinished ...>
[pid 31149] rt_sigprocmask(SIG_BLOCK, ~[RT_0 RT_1], NULL, 8) = 0
[pid 31149] nanosleep({0, 10000000},  <unfinished ...>
[pid 31148] <... osf_sigprocmask resumed> ) = 0
[pid 31148] osf_sigprocmask(0x1, 0, 0x14e9) = 0
[pid 31148] osf_sigprocmask(0x1, 0, 0x1c11) = 0
[pid 31148] osf_sigprocmask(0x1, 0, 0xb91) = 0
[pid 31148] osf_sigprocmask(0x1, 0, 0)  = 0
[pid 31148] osf_sigprocmask(0x1, 0, 0)  = 0
[pid 31148] osf_sigprocmask(0x1, 0, 0)  = 0
[pid 31148] osf_sigprocmask(0x1, 0, 0)  = 0
[pid 31148] osf_sigprocmask(0x1, 0, 0)  = 0
[pid 31148] osf_sigprocmask(0x1, 0, 0x20000549ed8) = 0
[pid 31148] osf_sigprocmask(0x1, 0, 0)  = 0
[pid 31148] osf_sigprocmask(0x1, 0, 0)  = 0
[pid 31148] osf_sigprocmask(0x1, 0, 0x20000549e88) = 0
[pid 31148] osf_sigprocmask(0x1, 0, 0)  = 0
[pid 31149] <... nanosleep resumed> {4656511788077862093, 0}) = 0
[pid 31149] nanosleep({0, 10000000},  <unfinished ...>
[pid 31148] osf_sigprocmask(0x1, 0, 0)  = 0
[pid 31148] osf_sigprocmask(0x1, 0, 0x20000549e38) = 0
[pid 31148] brk(0x1200e2000)            = 0x1200e2000
[pid 31148] brk(0x1200de000)            = 0x1200de000
[pid 31148] osf_sigprocmask(0x1, 0, 0x12007aaf0) = 0
[pid 31148] sigreturn()                 = ? (mask now [])
[pid 31148] osf_syscall(0x1, 0, 0xfd9, 0, 0x1, 0x11fa0a720) = -1 EADDRINUSE 
(Address already in use)
[pid 31148] osf_syscall(0x1, 0, 0x14e9, 0, 0, 0) = -1 EADDRINUSE (Address 
already in use)
[pid 31148] osf_syscall(0x1, 0, 0x1c11, 0, 0, 0) = -1 EADDRINUSE (Address 
already in use)
[pid 31148] osf_syscall(0x1, 0, 0xb91, 0, 0, 0) = 48
[pid 31148] osf_syscall(0x1, 0, 0, 0, 0, 0) = -1 EADDRINUSE (Address already in 
use)
[pid 31148] osf_syscall(0x1, 0, 0, 0, 0, 0) = 48
[pid 31149] <... nanosleep resumed> {4656511788077862093, 0}) = 0
[pid 31149] nanosleep({0, 10000000},  <unfinished ...>
[pid 31148] osf_syscall(0x1, 0, 0, 0, 0, 0) = 48
[pid 31148] osf_syscall(0x1, 0, 0, 0, 0, 0) = 48
[pid 31148] osf_syscall(0x1, 0, 0, 0, 0x2, 0) = 48
[pid 31148] osf_syscall(0x1, 0, 0x20000549b18, 0, 0x2, 0) = 48
[pid 31148] osf_syscall(0x1, 0, 0x12009e270, 0, 0x2, 0) = 103
[pid 31148] pipe([512, 530563912])      = 3
[pid 31148] osf_sigprocmask(0x1, 0, 0)  = 0
[pid 31148] fcntl(3, F_GETFL)           = 0 (flags O_RDONLY)
[pid 31148] fstat64(3, {st_mode=S_IFIFO|0600, st_size=0, ...}) = 0
[pid 31148] mmap(NULL, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, 
-1, 0) = 0x20000026000
[pid 31148] lseek(3, 0, SEEK_CUR)       = -1 ESPIPE (Illegal seek)
[pid 31148] osf_sigprocmask(0x1, 0, 0)  = 0
[pid 31148] fcntl(4, F_GETFL)           = 0x1 (flags O_WRONLY)
[pid 31148] fstat64(4, {st_mode=S_IFIFO|0600, st_size=0, ...}) = 0
[pid 31148] mmap(NULL, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, 
-1, 0) = 0x20000028000
[pid 31148] lseek(4, 0, SEEK_CUR)       = -1 ESPIPE (Illegal seek)
[pid 31148] fcntl(3, F_SETFD, FD_CLOEXEC) = 0
[pid 31148] fcntl(4, F_SETFD, FD_CLOEXEC) = 0
[pid 31148] osf_sigprocmask(0x1, 0, 0x2dc4) = 0
[pid 31148] osf_sigprocmask(0x1, 0, 0x1) = 0
[pid 31148] stat64("/usr/local/lib/site_ruby/1.8/fcntl.rb",  <unfinished ...>
[pid 31149] <... nanosleep resumed> {4656511788077862093, 0}) = 0
[pid 31149] nanosleep({0, 10000000},  <unfinished ...>
[pid 31148] <... stat64 resumed> 0x11f9fbfb0) = -1 ENOENT (No such file or 
directory)
[pid 31148] stat64("/usr/local/lib/site_ruby/1.8/fcntl.so", 0x11f9fbfb0) = -1 
ENOENT (No such file or directory)
[pid 31148] stat64("/usr/local/lib/               = -1 ERRNO_425 (Unknown error 
425)
[pid 31148] fork()                      = -1 ERRNO_425 (Unknown error 425)
[pid 31148] fork()                      = -1 ERRNO_425 (Unknown error 425)
[pid 31148] fork()                      = -1 ERRNO_425 (Unknown error 425)
[pid 31148] fork()                      = -1 ERRNO_425 (Unknown error 425)
[pid 31148] fork()                      = -1 ERRNO_425 (Unknown error 425)
[pid 31148] fork()                      = -1 ERRNO_425 (Unknown error 425)
[pid 31148] fork()                      = -1 ERRNO_425 (Unknown error 425)
[pid 31148] fork()                      = -1 ERRNO_425 (Unknown error 425)
[pid 31148] fork()                      = -1 ERRNO_425 (Unknown error 425)
[pid 31148] fork()                      = -1 ERRNO_425 (Unknown error 425)
[pid 31148] fork()                      = -1 ERRNO_425 (Unknown error 425)
[pid 31148] fork()                      = -1 ERRNO_425 (Unknown error 425)
[pid 31148] fork()                      = -1 ERRNO_425 (Unknown error 425)
[pid 31148] osf_syscall(0x2000014a280, 0x11f9fbf90, 0x11f9fbf90, 0, 
0x11f9fca50, 0x11f9fca50) = -1 EOPNOTSUPP (Operation not supported)
[pid 31149] <... nanosleep resumed> {4656511788077862093, 0}) = 0
[pid 31148] osf_wait4(2199024607872,  <unfinished ...>
[pid 31149] nanosleep({0, 10000000},  <unfinished ...>
[pid 31148] <... osf_wait4 resumed> NULL, WEXITED|WSTOPPED|0x1b0, NULL) = 6
[pid 31148] osf_syscall(0x7, 0, 0, 0, 0x8, 0x1) = 48
[pid 31148] osf_syscall(0x1, 0, 0, 0, 0x8, 0x1) = 103
[pid 31148] fcntl(6, F_GETFL)           = 0x1 (flags O_WRONLY)
[pid 31148] fcntl(6, F_SETFL, O_WRONLY|O_NONBLOCK) = 0
[pid 31148] osf_sigprocmask(0x1, 0, 0x2dc9) = 0
[pid 31148] osf_sigprocmask(0x1, 0, 0)  = 0
[pid 31148] osf_sigprocmask(0x1, 0, 0)  = 0
[pid 31148] osf_sigprocmask(0x1, 0, 0)  = 0
[pid 31148] osf_sigprocmask(0x1, 0, 0x2e97) = 0
[pid 31148] osf_sigprocmask(0x1, 0, 0x2e89) = 0
[pid 31148] osf_sigprocmask(0x1, 0, 0x2df7) = 0
[pid 31148] osf_sigprocmask(0x1, 0, 0x2dff) = 0
[pid 31148] select(8, [], [6], [], {0, 0}) = 1 (out [6], left {0, 0})
[pid 31148] osf_sigprocmask(0x1, 0, 0x120072b20) = 0
[pid 31148] sigreturn()                 = ? (mask now [FPE BUS SYS ALRM URG 
TSTP USR1])
[pid 31148] exit(4)                     = ?
[pid 31149] <... nanosleep resumed> {4656511788077862093, 0}) = 0
[pid 31149] tgkill(31148, 31148, SIGVTALRM <unfinished ...>
[pid 31148] --- SIGVTALRM (Virtual timer expired) @ 0 (0) ---
[pid 31149] <... tgkill resumed> )      = 0
[pid 31149] nanosleep({0, 10000000}, _exit returned!
)  = ? (mask now [FPE BUS SYS ALRM URG TSTP USR1])
{4656511788077862093, 0}) = 0
[pid 31149] tgkill(31148, 31148, SIGVTALRM <unfinished ...>
[pid 31148] --- SIGVTALRM (Virtual timer expired) @ 0 (0) ---
[pid 31149] <... tgkill resumed> )      = 0
[pid 31148] sigreturn( <unfinished ...>
[pid 31149] nanosleep({0, 10000000}, {4656511788077862093, 0}) = 0
[pid 31149] tgkill(31148, 31148, SIGVTALRM <unfinished ...>
[pid 31148] --- SIGVTALRM (Virtual timer expired) @ 0 (0) ---
[pid 31149] <... tgkill resumed> )      = 0
[pid 31148] <... sigreturn resumed> )   = ? (mask now [])
[pid 31149] nanosleep({0, 10000000}, {4656511788077862093, 0}) = 0
[pid 31149] tgkill(31148, 31148, SIGVTALRM <unfinished ...>
[pid 31148] --- SIGVTALRM (Virtual timer expired) @ 0 (0) ---
[pid 31149] <... tgkill resumed> )      = 0
[pid 31148] sigreturn( <unfinished ...>
[pid 31149] nanosleep({0, 10000000}, {4656511788077862093, 0}) = 0
[pid 31149] tgkill(31148, 31148, SIGVTALRM <unfinished ...>
[pid 31148] --- SIGVTALRM (Virtual timer expired) @ 0 (0) ---
[pid 31149] <... tgkill resumed> )      = 0
[pid 31148] <... sigreturn resumed> )   = ? (mask now [])
[pid 31149] nanosleep({0, 10000000}, {4656511788077862093, 0}) = 0
[pid 31149] tgkill(31148, 31148, SIGVTALRM <unfinished ...>
[pid 31148] --- SIGVTALRM (Virtual timer expired) @ 0 (0) ---
[pid 31149] <... tgkill resumed> )      = 0
[pid 31148] sigreturn( <unfinished ...>

(loops until timer expires from here on)

-- System Information:
Debian Release: 5.0.2
  APT prefers stable
  APT policy: (500, 'stable')
Architecture: alpha

Kernel: Linux 2.6.26-2-alpha-generic
Locale: LANG=en_US.utf8, LC_CTYPE=en_US.utf8 (charmap=UTF-8) (ignored: LC_ALL 
set to en_US.utf8)
Shell: /bin/sh linked to /bin/bash

Versions of packages ruby1.8 depends on:
ii  libc6.1                       2.7-18     GNU C Library: Shared libraries
ii  libruby1.8                    1.8.7.72-3 Libraries necessary to run Ruby 1.

ruby1.8 recommends no packages.

Versions of packages ruby1.8 suggests:
pn  rdoc1.8                       <none>     (no description available)
pn  ri1.8                         <none>     (no description available)
pn  ruby1.8-examples              <none>     (no description available)

-- no debconf information



-- 
To UNSUBSCRIBE, email to debian-bugs-dist-requ...@lists.debian.org
with a subject of "unsubscribe". Trouble? Contact listmas...@lists.debian.org

Reply via email to