You can ask your admin about making you a Torque manager; who gets
manager rights is configurable (check the docs). Then you can do
something like 'pbsnodes -o nodexxx'. This marks node "nodexxx" offline
so that no new jobs are scheduled on it until its status is cleared with
'pbsnodes -c nodexxx' (presumably after fixing the node in question).
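As a rough sketch of that workflow, the commands can be wrapped in a few small shell functions. The function names and the PBSNODES override variable are my own inventions for illustration, not part of Torque; the only real commands used are 'pbsnodes -o', 'pbsnodes -c' and 'pbsnodes -l' (which lists nodes that are down, offline or unknown). All of this assumes you have already been granted the needed rights.

```shell
#!/bin/sh
# Sketch only: thin wrappers around pbsnodes.
# PBSNODES is a hypothetical override so the functions can be dry-run,
# e.g. PBSNODES="echo pbsnodes" prints the command instead of running it.
PBSNODES="${PBSNODES:-pbsnodes}"

# Mark a node offline so no new jobs are scheduled on it.
offline_node() {
    $PBSNODES -o "$1"
}

# Clear the offline state once the node has been fixed.
clear_node() {
    $PBSNODES -c "$1"
}

# List nodes currently marked down, offline, or unknown.
list_bad_nodes() {
    $PBSNODES -l
}
```

For example, 'offline_node node015' before the weekend, then 'clear_node node015' after the reboot.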
Also, you can request nodes by name in the queue submission script
like so:
#PBS -l nodes=node010:ppn=2+node002:ppn=1
This would request two processors on node "node010" and one on "node002".
Cumbersome, but useful in a bind. Offhand, I don't know of a way to
request any node _except_ a particular one.
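One possible workaround, if you know the node names, is to build the request yourself and skip the bad node. The sketch below is not a Torque feature: the function name, its arguments and the candidate list are hypothetical, and it simply prints a "nodes=" string for the first candidate that is not the bad node.

```shell
#!/bin/sh
# Sketch only: print a "nodes=NAME:ppn=N" request for the first candidate
# node that is not the bad one. Hypothetical helper, not part of Torque.
# Usage: node_spec_avoiding BAD_NODE PPN CANDIDATE...
node_spec_avoiding() {
    bad="$1"
    ppn="$2"
    shift 2
    for n in "$@"; do
        if [ "$n" != "$bad" ]; then
            printf 'nodes=%s:ppn=%s\n' "$n" "$ppn"
            return 0
        fi
    done
    # Every candidate was the bad node (or no candidates were given).
    echo "no usable node among candidates" >&2
    return 1
}
```

The output can be passed on the command line instead of a #PBS line, e.g. qsub -l "$(node_spec_avoiding node015 2 node014 node015 node016)" job.sh, where job.sh stands in for your own script. The drawback is that each job is pinned to a named node, so the scheduler can no longer pick freely among the good ones.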
- John
On Sun, 26 Mar 2006, James Rustad wrote:
Guys
This is a strange question, but is there any way to disable a bad node
in PBS without being the system administrator?
I am lining up about 50 jobs in the queue and they fail sequentially when
they hit the bad node. This often seems to happen on the weekends when
nobody is around to reboot the node.
Can I specify within PBS "don't use node015" or something like that?
Thanks
Jim Rustad
P.S. I may be using TORQUE rather than PBS, by the way.
_______________________________________________
Beowulf mailing list, Beowulf@beowulf.org