Package: ruby1.8 Version: 1.8.7.72-3 Severity: important X-Debbugs-cc: debian-ad...@lists.debian.org
Hi there, I've just managed to capture a reasonable strace of a problem that we've been seeing with puppet on several platforms. This is on alpha, but I'm fairly sure we've also seen it on armel and possibly hppa as well. Basically, sleep doesn't sleep. An eventloop with a wakeup timer just ends up in a tight loop, which means our slower machines can't do anything except run a single ruby thread that is supposed to be sleeping. I am assuming this is a bug in the thread/select implementation in ruby, but feel free to reassign if it turns out to be libc/kernel/whatever related. Any bugs in the test code are presumably mine - my ruby is fairly mediocre at best, and my goal was not to write a perfect event loop but to reproduce as faithfully as possible the event loop that puppet uses without actually doing anything useful besides the event loop. Cheers, Test code (I have puppet installed, but a smaller set of libraries is probably sufficient): class SleepTest require 'sync' require "puppet/external/event-loop" include SignalObserver @@syncresources = {} def timer unless defined? @timer @timer = EventLoop::Timer.new( :interval => 3600, :tolerance => 1, :start? => true ) EventLoop.current.monitor_timer @timer end @timer end # Return the sync object associated with a given resource. def sync(resource) @@syncresources[resource] ||= Sync.new return @@syncresources[resource] end def threadlock(resource, type = Sync::EX) self.sync(resource).synchronize(type) do yield end end def newtimer(hash, &block) self.timer() timer = nil threadlock(:timers) do @timers ||= [] timer = EventLoop::Timer.new(hash) @timers << timer if block_given? observe_signal(timer, :alarm, &block) end end # In case they need it for something else. timer end def newthread(&block) @threads ||= [] @threads << Thread.new do yield end end end sleepy = SleepTest.new sleepy.newthread do begin timer = sleepy.timer() EventLoop.monitor_timer timer FileTest.exist?('/bin/sh') end end EventLoop.run Strace (begins with Thread.new call): clone(Process 31149 attached child_stack=0x20000d70b90, flags=CLONE_VM|CLONE_FS|CLONE_FILES|CLONE_SIGHAND|CLONE_THREAD|CLONE_SYSVSEM|CLONE_SETTLS|CLONE_PARENT_SETTID|CLONE_CHILD_CLEARTID, parent_tidptr=0x20000d712f0, tls=0x20000d71910, child_tidptr=0x20000d712f0) = 31149 [pid 31148] osf_sigprocmask(0x1, 0, 0) = 0 [pid 31148] osf_sigprocmask(0x1, 0, 0) = 0 [pid 31148] osf_sigprocmask(0x1, 0, 0) = 0 [pid 31148] osf_sigprocmask(0x1, 0, 0x2000054a2c0) = 0 [pid 31148] osf_sigprocmask(0x1, 0, 0) = 0 [pid 31148] osf_sigprocmask(0x1, 0, 0x2ac1) = 0 [pid 31148] osf_sigprocmask(0x1, 0, 0xb91 <unfinished ...> [pid 31149] rt_sigprocmask(SIG_BLOCK, ~[RT_0 RT_1], NULL, 8) = 0 [pid 31149] nanosleep({0, 10000000}, <unfinished ...> [pid 31148] <... osf_sigprocmask resumed> ) = 0 [pid 31148] osf_sigprocmask(0x1, 0, 0x14e9) = 0 [pid 31148] osf_sigprocmask(0x1, 0, 0x1c11) = 0 [pid 31148] osf_sigprocmask(0x1, 0, 0xb91) = 0 [pid 31148] osf_sigprocmask(0x1, 0, 0) = 0 [pid 31148] osf_sigprocmask(0x1, 0, 0) = 0 [pid 31148] osf_sigprocmask(0x1, 0, 0) = 0 [pid 31148] osf_sigprocmask(0x1, 0, 0) = 0 [pid 31148] osf_sigprocmask(0x1, 0, 0) = 0 [pid 31148] osf_sigprocmask(0x1, 0, 0x20000549ed8) = 0 [pid 31148] osf_sigprocmask(0x1, 0, 0) = 0 [pid 31148] osf_sigprocmask(0x1, 0, 0) = 0 [pid 31148] osf_sigprocmask(0x1, 0, 0x20000549e88) = 0 [pid 31148] osf_sigprocmask(0x1, 0, 0) = 0 [pid 31149] <... nanosleep resumed> {4656511788077862093, 0}) = 0 [pid 31149] nanosleep({0, 10000000}, <unfinished ...> [pid 31148] osf_sigprocmask(0x1, 0, 0) = 0 [pid 31148] osf_sigprocmask(0x1, 0, 0x20000549e38) = 0 [pid 31148] brk(0x1200e2000) = 0x1200e2000 [pid 31148] brk(0x1200de000) = 0x1200de000 [pid 31148] osf_sigprocmask(0x1, 0, 0x12007aaf0) = 0 [pid 31148] sigreturn() = ? (mask now []) [pid 31148] osf_syscall(0x1, 0, 0xfd9, 0, 0x1, 0x11fa0a720) = -1 EADDRINUSE (Address already in use) [pid 31148] osf_syscall(0x1, 0, 0x14e9, 0, 0, 0) = -1 EADDRINUSE (Address already in use) [pid 31148] osf_syscall(0x1, 0, 0x1c11, 0, 0, 0) = -1 EADDRINUSE (Address already in use) [pid 31148] osf_syscall(0x1, 0, 0xb91, 0, 0, 0) = 48 [pid 31148] osf_syscall(0x1, 0, 0, 0, 0, 0) = -1 EADDRINUSE (Address already in use) [pid 31148] osf_syscall(0x1, 0, 0, 0, 0, 0) = 48 [pid 31149] <... nanosleep resumed> {4656511788077862093, 0}) = 0 [pid 31149] nanosleep({0, 10000000}, <unfinished ...> [pid 31148] osf_syscall(0x1, 0, 0, 0, 0, 0) = 48 [pid 31148] osf_syscall(0x1, 0, 0, 0, 0, 0) = 48 [pid 31148] osf_syscall(0x1, 0, 0, 0, 0x2, 0) = 48 [pid 31148] osf_syscall(0x1, 0, 0x20000549b18, 0, 0x2, 0) = 48 [pid 31148] osf_syscall(0x1, 0, 0x12009e270, 0, 0x2, 0) = 103 [pid 31148] pipe([512, 530563912]) = 3 [pid 31148] osf_sigprocmask(0x1, 0, 0) = 0 [pid 31148] fcntl(3, F_GETFL) = 0 (flags O_RDONLY) [pid 31148] fstat64(3, {st_mode=S_IFIFO|0600, st_size=0, ...}) = 0 [pid 31148] mmap(NULL, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x20000026000 [pid 31148] lseek(3, 0, SEEK_CUR) = -1 ESPIPE (Illegal seek) [pid 31148] osf_sigprocmask(0x1, 0, 0) = 0 [pid 31148] fcntl(4, F_GETFL) = 0x1 (flags O_WRONLY) [pid 31148] fstat64(4, {st_mode=S_IFIFO|0600, st_size=0, ...}) = 0 [pid 31148] mmap(NULL, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x20000028000 [pid 31148] lseek(4, 0, SEEK_CUR) = -1 ESPIPE (Illegal seek) [pid 31148] fcntl(3, F_SETFD, FD_CLOEXEC) = 0 [pid 31148] fcntl(4, F_SETFD, FD_CLOEXEC) = 0 [pid 31148] osf_sigprocmask(0x1, 0, 0x2dc4) = 0 [pid 31148] osf_sigprocmask(0x1, 0, 0x1) = 0 [pid 31148] stat64("/usr/local/lib/site_ruby/1.8/fcntl.rb", <unfinished ...> [pid 31149] <... nanosleep resumed> {4656511788077862093, 0}) = 0 [pid 31149] nanosleep({0, 10000000}, <unfinished ...> [pid 31148] <... stat64 resumed> 0x11f9fbfb0) = -1 ENOENT (No such file or directory) [pid 31148] stat64("/usr/local/lib/site_ruby/1.8/fcntl.so", 0x11f9fbfb0) = -1 ENOENT (No such file or directory) [pid 31148] stat64("/usr/local/lib/ = -1 ERRNO_425 (Unknown error 425) [pid 31148] fork() = -1 ERRNO_425 (Unknown error 425) [pid 31148] fork() = -1 ERRNO_425 (Unknown error 425) [pid 31148] fork() = -1 ERRNO_425 (Unknown error 425) [pid 31148] fork() = -1 ERRNO_425 (Unknown error 425) [pid 31148] fork() = -1 ERRNO_425 (Unknown error 425) [pid 31148] fork() = -1 ERRNO_425 (Unknown error 425) [pid 31148] fork() = -1 ERRNO_425 (Unknown error 425) [pid 31148] fork() = -1 ERRNO_425 (Unknown error 425) [pid 31148] fork() = -1 ERRNO_425 (Unknown error 425) [pid 31148] fork() = -1 ERRNO_425 (Unknown error 425) [pid 31148] fork() = -1 ERRNO_425 (Unknown error 425) [pid 31148] fork() = -1 ERRNO_425 (Unknown error 425) [pid 31148] fork() = -1 ERRNO_425 (Unknown error 425) [pid 31148] osf_syscall(0x2000014a280, 0x11f9fbf90, 0x11f9fbf90, 0, 0x11f9fca50, 0x11f9fca50) = -1 EOPNOTSUPP (Operation not supported) [pid 31149] <... nanosleep resumed> {4656511788077862093, 0}) = 0 [pid 31148] osf_wait4(2199024607872, <unfinished ...> [pid 31149] nanosleep({0, 10000000}, <unfinished ...> [pid 31148] <... osf_wait4 resumed> NULL, WEXITED|WSTOPPED|0x1b0, NULL) = 6 [pid 31148] osf_syscall(0x7, 0, 0, 0, 0x8, 0x1) = 48 [pid 31148] osf_syscall(0x1, 0, 0, 0, 0x8, 0x1) = 103 [pid 31148] fcntl(6, F_GETFL) = 0x1 (flags O_WRONLY) [pid 31148] fcntl(6, F_SETFL, O_WRONLY|O_NONBLOCK) = 0 [pid 31148] osf_sigprocmask(0x1, 0, 0x2dc9) = 0 [pid 31148] osf_sigprocmask(0x1, 0, 0) = 0 [pid 31148] osf_sigprocmask(0x1, 0, 0) = 0 [pid 31148] osf_sigprocmask(0x1, 0, 0) = 0 [pid 31148] osf_sigprocmask(0x1, 0, 0x2e97) = 0 [pid 31148] osf_sigprocmask(0x1, 0, 0x2e89) = 0 [pid 31148] osf_sigprocmask(0x1, 0, 0x2df7) = 0 [pid 31148] osf_sigprocmask(0x1, 0, 0x2dff) = 0 [pid 31148] select(8, [], [6], [], {0, 0}) = 1 (out [6], left {0, 0}) [pid 31148] osf_sigprocmask(0x1, 0, 0x120072b20) = 0 [pid 31148] sigreturn() = ? (mask now [FPE BUS SYS ALRM URG TSTP USR1]) [pid 31148] exit(4) = ? [pid 31149] <... nanosleep resumed> {4656511788077862093, 0}) = 0 [pid 31149] tgkill(31148, 31148, SIGVTALRM <unfinished ...> [pid 31148] --- SIGVTALRM (Virtual timer expired) @ 0 (0) --- [pid 31149] <... tgkill resumed> ) = 0 [pid 31149] nanosleep({0, 10000000}, _exit returned! ) = ? (mask now [FPE BUS SYS ALRM URG TSTP USR1]) {4656511788077862093, 0}) = 0 [pid 31149] tgkill(31148, 31148, SIGVTALRM <unfinished ...> [pid 31148] --- SIGVTALRM (Virtual timer expired) @ 0 (0) --- [pid 31149] <... tgkill resumed> ) = 0 [pid 31148] sigreturn( <unfinished ...> [pid 31149] nanosleep({0, 10000000}, {4656511788077862093, 0}) = 0 [pid 31149] tgkill(31148, 31148, SIGVTALRM <unfinished ...> [pid 31148] --- SIGVTALRM (Virtual timer expired) @ 0 (0) --- [pid 31149] <... tgkill resumed> ) = 0 [pid 31148] <... sigreturn resumed> ) = ? (mask now []) [pid 31149] nanosleep({0, 10000000}, {4656511788077862093, 0}) = 0 [pid 31149] tgkill(31148, 31148, SIGVTALRM <unfinished ...> [pid 31148] --- SIGVTALRM (Virtual timer expired) @ 0 (0) --- [pid 31149] <... tgkill resumed> ) = 0 [pid 31148] sigreturn( <unfinished ...> [pid 31149] nanosleep({0, 10000000}, {4656511788077862093, 0}) = 0 [pid 31149] tgkill(31148, 31148, SIGVTALRM <unfinished ...> [pid 31148] --- SIGVTALRM (Virtual timer expired) @ 0 (0) --- [pid 31149] <... tgkill resumed> ) = 0 [pid 31148] <... sigreturn resumed> ) = ? (mask now []) [pid 31149] nanosleep({0, 10000000}, {4656511788077862093, 0}) = 0 [pid 31149] tgkill(31148, 31148, SIGVTALRM <unfinished ...> [pid 31148] --- SIGVTALRM (Virtual timer expired) @ 0 (0) --- [pid 31149] <... tgkill resumed> ) = 0 [pid 31148] sigreturn( <unfinished ...> (loops until timer expires from here on) -- System Information: Debian Release: 5.0.2 APT prefers stable APT policy: (500, 'stable') Architecture: alpha Kernel: Linux 2.6.26-2-alpha-generic Locale: LANG=en_US.utf8, LC_CTYPE=en_US.utf8 (charmap=UTF-8) (ignored: LC_ALL set to en_US.utf8) Shell: /bin/sh linked to /bin/bash Versions of packages ruby1.8 depends on: ii libc6.1 2.7-18 GNU C Library: Shared libraries ii libruby1.8 1.8.7.72-3 Libraries necessary to run Ruby 1. ruby1.8 recommends no packages. Versions of packages ruby1.8 suggests: pn rdoc1.8 <none> (no description available) pn ri1.8 <none> (no description available) pn ruby1.8-examples <none> (no description available) -- no debconf information -- To UNSUBSCRIBE, email to debian-bugs-dist-requ...@lists.debian.org with a subject of "unsubscribe". Trouble? Contact listmas...@lists.debian.org