Just 3 more things to think about re where ssh should be enabled or disabled.
1. Many people's job scripts use ssh, either directly (say, to clean up /tmp) or indirectly via mpirun. (Well-behaved mpirun implementations launch the binaries through the batch engine's per-node daemon rather than ssh.) Hence simply disabling ssh will break many batch scripts.

2. Some 'batch jobs' are actually interactive sessions, e.g. salloc in SLURM for interactive debugging sessions.

3. If a user has a set of, say, 32 nodes allocated to them and in use for one of their batch jobs, it is reasonable to allow them interactive access to those compute nodes, e.g. for profiling, debugging or computational steering.

Daniel

Daniel Kidger
Bull Information Systems, UK

On 25 July 2013 06:08, Christopher Samuel <sam...@unimelb.edu.au> wrote:
> On 25/07/13 14:40, Mark Hahn wrote:
>
> > do you really find users who decide to choose their own nodes?
>
> In the past yes, they've come from places who either haven't had a
> queuing system or who haven't used HPC before and haven't read the docs
> or been to the courses.
>
> > limiting ssh access, done right, can permit (c) and prevent (a).
>
> That's what we do. Users can login to nodes their jobs are on. I'm
> hoping that the aims of the Slurm PAM module to be able to move users
> SSHing into the node into the cgroup for their jobs will get
> implemented. That way if they do login and run stuff that impacts
> they'll only hurt their own jobs.
>
> > we don't really see (a) enough to worry about it (we're pretty big
> > on at least basic user inculcation...) and most of (b) I see is
> > actually not helped, since the rogue jobs are usually escapees,
> > rather than mis-aimed.
>
> Yeah, we see rogue jobs and have health check scripts that can fix
> them up for the simple cases (and alert us and take the node offline
> for others). That helps with having to deal with the emails from
> users asking why their jobs are running slower than usual.
>
> > I suppose you could charge by utime+stime rather than real time.
>
> That would mean a lot of extra hacking around as we're using Gold
> (with Torque and Moab) at the moment and will be moving to Slurm in
> the very near future (as it's what we run on our BG/Q), so we bend to
> their whim on charging.
>
> cheers!
> Chris
> --
> Christopher Samuel Senior Systems Administrator
> VLSCI - Victorian Life Sciences Computation Initiative
> Email: sam...@unimelb.edu.au Phone: +61 (0)3 903 55545
> http://www.vlsci.org.au/ http://twitter.com/vlsci
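
To make the policy discussed in this thread concrete (a user may ssh into a compute node only while one of their jobs is allocated there), here is a minimal Python sketch. It is only an illustration of the decision rule: the allocations table, the ADMIN_USERS set and both function names are hypothetical, and it is not how the Slurm PAM module or any real scheduler implements this.

```python
# Hypothetical sketch: allow ssh to a compute node only for users who have a
# job currently allocated there. The data structures are illustrative and are
# not taken from SLURM, Torque or any other real scheduler API.

ADMIN_USERS = {"root"}  # assumption: administrators are always let in


def nodes_for_user(allocations, user):
    """Return the set of nodes on which `user` has at least one running job.

    `allocations` maps a job id to a (user, list-of-nodes) pair.
    """
    nodes = set()
    for job_user, job_nodes in allocations.values():
        if job_user == user:
            nodes.update(job_nodes)
    return nodes


def ssh_allowed(allocations, user, node):
    """Decide whether `user` may ssh into `node` under this policy."""
    if user in ADMIN_USERS:
        return True
    return node in nodes_for_user(allocations, user)


if __name__ == "__main__":
    # Made-up example: alice holds two nodes for job 101, bob one for job 102.
    allocations = {
        101: ("alice", ["node01", "node02"]),
        102: ("bob", ["node03"]),
    }
    print(ssh_allowed(allocations, "alice", "node02"))  # True
    print(ssh_allowed(allocations, "alice", "node03"))  # False
    print(ssh_allowed(allocations, "bob", "node03"))    # True
```

A real deployment would get the allocation data from the batch system and enforce the decision at login time (for example through PAM). Note that a rule like this covers point 3 above, but job scripts that ssh to nodes outside their own allocation would still break.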
_______________________________________________
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit
http://www.beowulf.org/mailman/listinfo/beowulf