Re: [Beowulf] first cluster

2010-07-19 Thread Mark Hahn
It's a very neat idea, but it has the disadvantage - unless I'm misunderstanding - that if the job fails and leaves droppings in, say, /tmp on the cluster node, the user can't log in to diagnose things or clean up after themselves. My organization has ~4k users (~300-500 active at any time), and
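
One common mitigation (not mentioned in the message above, purely illustrative) is to have the batch system's epilogue clean node-local scratch when a job ends, so leftover droppings don't require an interactive login at all. A minimal sketch of a Torque epilogue, assuming the usual argument convention in which $2 is the job owner's username and that node-local scratch lives in /tmp; the "other jobs" check and paths would need adapting for a real site:

    #!/bin/sh
    # Hypothetical Torque epilogue sketch (illustrative, not from this thread).
    # By the usual prologue/epilogue argument convention, $2 is the username
    # of the job owner.
    JOBUSER="$2"

    # Rough guard: skip cleanup if the user still has processes on this node,
    # e.g. another of their jobs is running here.
    if pgrep -u "$JOBUSER" >/dev/null 2>&1; then
        exit 0
    fi

    # Remove the user's leftover files under /tmp so a failed job doesn't
    # require an interactive login just to clean up.
    find /tmp -mindepth 1 -maxdepth 1 -user "$JOBUSER" -exec rm -rf {} + 2>/dev/null
    exit 0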

Re: [Beowulf] first cluster

2010-07-19 Thread Reuti
On 19.07.2010 at 10:54, Tim Cutts wrote:
> On 16 Jul 2010, at 6:11 pm, Douglas Guptill wrote:
>> On Fri, Jul 16, 2010 at 12:51:49PM -0400, Steve Crusan wrote:
>>> We use a PAM module (pam_torque) to stop this behavior. Basically, if
>>> your job isn't currently running on a node, you cannot SSH into a node.

Re: [Beowulf] first cluster

2010-07-19 Thread Tim Cutts
On 16 Jul 2010, at 6:11 pm, Douglas Guptill wrote:
> On Fri, Jul 16, 2010 at 12:51:49PM -0400, Steve Crusan wrote:
>> We use a PAM module (pam_torque) to stop this behavior. Basically, if
>> your job isn't currently running on a node, you cannot SSH into a node.
>>
>> http://www.rpmfind.
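
For context, the restriction quoted above is normally wired in through the sshd PAM stack. A minimal sketch of the relevant line, assuming the pam_torque package installs the module as pam_pbssimpleauth.so (the name used in the Torque contrib tree); the actual module name and file layout depend on the packaging:

    # /etc/pam.d/sshd (excerpt) -- illustrative only
    # Fail the "account" phase unless the connecting user currently has a
    # PBS/Torque job running on this node (root logins are still allowed).
    account    required     pam_pbssimpleauth.so

With something like this in place, an SSH attempt by a user with no job on the node is rejected at the account stage, which is exactly the behaviour - and the cleanup drawback - discussed earlier in the thread.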