On 18/05/18 20:04 +0000, Shobe, Casey wrote:
> On a couple clusters that have been running for a little while
> (without fencing), I'm seeing runaway server.rb processes using 100%
> of a single CPU core each.
> 
> When I look at ps, I can see that these have something to do with
> pcsd:
> 
> USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
> root      6103  0.0  0.3 1076744 59200 ?       Ssl  Apr06  59:09 /usr/bin/ruby -C/var/lib/pcsd -I/usr/share/pcsd -- /usr/share/pcsd/ssl.rb & > /dev/null &
> root     17548 99.3  0.2 873648 46308 ?        Rl   Apr18 43356:57  \_ /usr/bin/ruby -C/var/lib/pcsd -I/usr/share/pcsd -- /usr/share/pcsd/ssl.rb & > /dev/null &
> root     16688 98.9  0.3 941160 49472 ?        Rl   May01 24300:52  \_ /usr/bin/ruby -C/var/lib/pcsd -I/usr/share/pcsd -- /usr/share/pcsd/ssl.rb & > /dev/null &
> root      6009 98.8  0.3 942188 49688 ?        R    May02 22607:08  \_ /usr/bin/ruby -C/var/lib/pcsd -I/usr/share/pcsd -- /usr/share/pcsd/ssl.rb & > /dev/null &
> root     15556 98.8  0.3 1076344 51836 ?       R    May03 21410:12  \_ /usr/bin/ruby -C/var/lib/pcsd -I/usr/share/pcsd -- /usr/share/pcsd/ssl.rb & > /dev/null &
> 
> Running strace on one of the processes shows that they are looping
> on sched_yield().

Can you share some hardware specs with us, starting with the
architecture: x86_64 (amd64), ARM (which generation/mode?), or
something else?

My suspicion is that only the first of these (x86_64) is reasonably
free of code porting glitches, meaning glitches at the Ruby
interpreter level or lower.
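
If it helps, the details asked for above could be gathered with
something like the following. This is only a rough sketch, not a pcsd
tool: the helper name collect_arch_info is made up for illustration,
and it assumes a "ruby" binary on PATH is the same interpreter that
pcsd runs (on the affected nodes that appears to be /usr/bin/ruby).

```python
import platform
import shutil
import subprocess


def collect_arch_info():
    """Return basic architecture details suitable for a bug report."""
    info = {
        "machine": platform.machine(),  # e.g. x86_64, aarch64, ppc64le
        "system": platform.system(),    # e.g. Linux
        "kernel": platform.release(),   # kernel release string
    }
    # Assumption: the ruby found on PATH is the one pcsd uses.
    ruby = shutil.which("ruby")
    if ruby:
        # "ruby -v" prints the interpreter version and its build platform.
        result = subprocess.run([ruby, "-v"], capture_output=True, text=True)
        info["ruby"] = result.stdout.strip()
    else:
        info["ruby"] = "not found on PATH"
    return info


if __name__ == "__main__":
    for key, value in collect_arch_info().items():
        print(f"{key}: {value}")
```

Pasting the four lines it prints into a reply should be enough to
start with.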

-- 
Jan (Poki)

_______________________________________________
Users mailing list: [email protected]
https://lists.clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org
