Hi misc@!

If anyone has tips on how to debug two hanging machines in our test lab,
I am eager to learn.

Both machines run 6.5, amd64, and are patched up to 005_libssl using
M:Tier's openup.  Other than that they are rather different: one is a small
Zotac ZBox-AD02 with an AMD E-350 at 1.6 GHz, the other a rack-mounted Dell
PowerEdge R230 with an Intel Xeon E3-1220.

The overall symptoms are that it is possible to switch screens using
Alt+Ctrl+F1..Fn, but when I log in as root the greeting is printed and no
prompt ever appears.  Alt+Ctrl+Del does not work.  The power button does not
work either; I have to long-press it to force a power off.

This happens during our nightly tests, which are quite resource intensive.

In /var/log/messages I find suspicious entries "/bsd: proc: table is full",
possibly from just before the machines become unresponsive, but these
entries also appear many times well before that point.  After this "table
is full" message there are many more syslog entries; on one machine smartd
constantly complains about an unreadable (pending) sector and an
atascsi_passthru_done timeout, and on the other the kernel complains about
a probed monitor with no or invalid EDID.
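
To see whether the "table is full" entries cluster around the hang or are
spread over the whole night, they could be tallied per hour.  The following
is only a rough sketch: it assumes the classic syslog timestamp layout
("Mon DD HH:MM:SS host ...") at the start of each line, and the function
name table_full_per_hour is made up for illustration.

```python
import re
from collections import Counter

# Assumption: each line starts with a classic syslog timestamp,
# i.e. "Mon DD HH:MM:SS " followed by the host name and the message.
STAMP = re.compile(r"^(\w{3}\s+\d+\s+\d{2}):\d{2}:\d{2}\s")


def table_full_per_hour(lines, needle="proc: table is full"):
    """Count lines containing `needle`, grouped by 'Mon DD HH'."""
    counts = Counter()
    for line in lines:
        if needle not in line:
            continue
        match = STAMP.match(line)
        # Collapse the padding in "Jun  3 02" so the keys are uniform.
        key = " ".join(match.group(1).split()) if match else "unknown"
        counts[key] += 1
    return counts
```

Passing an open file object for /var/log/messages as `lines` returns a
Counter keyed by hour, in the order the hours first appear in the log.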

So it seems the machine runs out of some resource and fails to spawn a login
shell.  Any clues as to how I can find more details and a remedy?  I suspect
a full process table, but wonder how to detect and/or avoid that.
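
One possible way to detect a filling process table is to log the current
process count against the limit at regular intervals.  The sketch below
assumes that the sysctl variables kern.nprocs and kern.maxproc are available
and that "sysctl -n NAME" prints a bare integer; the function names, the
10 second interval and the 90% warning threshold are arbitrary choices made
for illustration.

```python
import subprocess
import time


def read_sysctl(name):
    """Return an integer sysctl value by running `sysctl -n NAME`."""
    result = subprocess.run(["sysctl", "-n", name],
                            capture_output=True, text=True, check=True)
    return int(result.stdout.strip())


def proc_usage(read=read_sysctl):
    """Return (nprocs, maxproc, fraction of the process table in use)."""
    nprocs = read("kern.nprocs")    # assumed: current number of processes
    maxproc = read("kern.maxproc")  # assumed: system-wide process limit
    return nprocs, maxproc, nprocs / maxproc


def monitor(interval=10, warn_at=0.9, rounds=None, read=read_sysctl,
            sleep=time.sleep, out=print):
    """Emit one status line per round; loop forever if rounds is None."""
    done = 0
    while rounds is None or done < rounds:
        stamp = time.strftime("%Y-%m-%d %H:%M:%S")
        try:
            nprocs, maxproc, frac = proc_usage(read)
            tag = "WARN" if frac >= warn_at else "ok"
            out(f"{stamp} {tag} procs {nprocs}/{maxproc} ({frac:.0%})")
        except (OSError, subprocess.CalledProcessError, ValueError) as exc:
            # Spawning sysctl needs a free process slot itself, so a
            # failure here is a strong hint that the table is full.
            out(f"{stamp} FAIL could not read sysctl: {exc}")
        done += 1
        sleep(interval)
```

Calling monitor() with its defaults loops until interrupted.  Started before
the nightly tests with its output redirected to a file, the last lines
written before a hang would show how close the machine was to the limit.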

I have considered having systat running on a console screen but do not know
which systat display might tell me anything useful.

Best regards
-- 

/ Raimo Niskanen, Erlang/OTP, Ericsson AB
