Hi,

On 21.10.2011 at 15:10, Prentice Bisbal wrote:
> Beowulfers,
>
> I have a question that isn't directly related to clusters, but I suspect
> it's an issue many of you are dealing with or have dealt with: users using
> the screen command to stay logged in on systems and running long jobs
> that they forget about. Have any of you experienced this, and how did
> you deal with it?
>
> Here's my scenario:
>
> In addition to my cluster, we have a bunch of "computer servers" where
> users can run the programs. These are "large" boxes with more cores
> (24-32 cores) and more RAM (128 - 256 GB, ECC) than they'd have on a
> desktop.
>
> Periodically, when I have to shutdown/reboot a system for maintenance,
> I find a LOT of shells being run through the screen command for users
> who aren't logged in. The majority are idle shells, but many are running
> jobs that seem to be forgotten about. For example, I recently found
> some jobs running since July or August that were running under the
> account of someone who hasn't even been here for months!
>
> My opinion is these are shared resources, and if you aren't
> interactively using them, you should log out to free up resources for
> others. If you have a job that can be run non-interactively, you should
> submit it to the cluster.
>
> Has anyone else here dealt with the problem?
>
> I would like to remove screen from my environment entirely to prevent
> this. My fellow sysadmins here agree. I'm expecting massive backlash
> from the users.

I disallow rsh to the machines and limit ssh to admin staff. Users who want to run something on a machine have to go through the queuing system to get access to a node granted by GridEngine. For the startup method you can use either the -builtin- one or, in case you need X11 forwarding, ssh with a different sshd_config (GridEngine will start one daemon per task). One additional step is necessary for a tight integration of ssh.
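As a rough sketch of the ssh variant described above, the relevant entries of the GridEngine cluster configuration (edited with `qconf -mconf`) might look like the fragment below. The binary paths and the file name `sshd_config.sge` are assumptions chosen for illustration, and the `#` lines are annotations for the reader, not part of the configuration. With the -builtin- method the same entries would simply carry the value `builtin`.

```
# ssh as startup method for qrsh, with X11 forwarding (-X)
# "sshd -i" runs one daemon per task; "-f" points it at a separate config file
rsh_command      /usr/bin/ssh -X
rsh_daemon       /usr/sbin/sshd -i -f /etc/ssh/sshd_config.sge
rlogin_command   /usr/bin/ssh -X
rlogin_daemon    /usr/sbin/sshd -i -f /etc/ssh/sshd_config.sge
```

This fragment does not cover the additional step needed for a tight integration of ssh.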
For users who just want to check their jobs on a node I have a dedicated queue: they can always log in there, but h_cpu is limited to 60 seconds, so they can't abuse it.

-- Reuti

_______________________________________________
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf