For our login nodes (smallish, diskless VMs) we try to limit abuse from users through a layered approach, as enumerated below.

1. User education

Users of our cluster are required to attend a training session run by our group.  In these sessions we go over what we do and don't allow on the login nodes, and we stress that we will kill long-running processes when we see them, and that repeated abuse can get a user banned for some duration of time.

2. Set the noexec mount option for any user controlled mountpoint (home, scratch, group/lab/project spaces)

This isn't a perfect solution, since noexec can be worked around by a user who understands what it means.  For example, a user couldn't run "./foo.py", but they could still run "python foo.py".  We also understand that some users have a legitimate reason to run a script on the login node; setting noexec doesn't really prevent the use of scripts, it just makes it a little harder for a user to abuse the login node.
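As a sketch, the noexec mounts might look like this in /etc/fstab on a login node (the NFS server name and export paths here are illustrative, not our actual layout):

```shell
# /etc/fstab on a login node -- mount all user-writable areas noexec
# (nfs-server and export paths are illustrative)
nfs-server:/export/home     /home     nfs  rw,nosuid,nodev,noexec  0 0
nfs-server:/export/scratch  /scratch  nfs  rw,nosuid,nodev,noexec  0 0
nfs-server:/export/projects /projects nfs  rw,nosuid,nodev,noexec  0 0
```

Adding nosuid and nodev alongside noexec is a common hardening choice on user-controlled filesystems.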

3. A small partition with shared nodes with low maxtime

For tasks that typically run longer (compression/decompression, compilation), beyond user education we also have a partition of 4 nodes that limits the number of jobs per user (2 running at a time) and enforces a MaxTime of 4 hours.  For most of our users, this covers the cases of compilation, testing, and compression/decompression.  These nodes are also set up as shared, so users must request the number of cores and the amount of memory they need, via either a batch job or an interactive job, to perform longer-running tasks.
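A rough sketch of what such a partition could look like in slurm.conf (node and partition names are illustrative; the per-user running-job cap is typically enforced with a QOS limit rather than in the partition definition itself):

```shell
# slurm.conf sketch -- small shared partition for short utility work
# (node/partition/QOS names and sizes are illustrative)
NodeName=util[01-04] CPUs=32 RealMemory=128000
PartitionName=short Nodes=util[01-04] MaxTime=04:00:00 OverSubscribe=YES QOS=short

# Create the QOS that caps each user at 2 concurrently running jobs:
#   sacctmgr add qos short
#   sacctmgr modify qos short set MaxJobsPerUser=2
```

With OverSubscribe=YES the nodes are shared, so users request cores and memory explicitly (e.g. `srun -p short -c 4 --mem=8G --pty bash`).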

4. For our software modules, we make sure to only expose the module files so the module commands work, but do not expose the path to where the compiled software resides.

This prevents users from loading a module, such as a compiler, and using it to compile code on our login nodes.  If a user can't perform the abusive action to begin with, there's no problem to deal with.  Users do sometimes ask us why the software loaded by a module does not work on the login node, at which point we re-educate them.
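One way to realize this split (a hypothetical sketch, not necessarily how it is done at the site above) is at the NFS export level: login nodes mount only the modulefile tree, while compute nodes mount both it and the software tree.

```shell
# /etc/exports on the software NFS server (host and path names illustrative)
# Login nodes see the modulefiles, so "module avail" and "module load"
# work, but the binaries those modules point at are simply absent.
/apps/modulefiles  login[01-02](ro,root_squash)  compute[001-100](ro,root_squash)
/apps/software     compute[001-100](ro,root_squash)
```

On a login node, `module load gcc` then succeeds, but the PATH entries it prepends point at directories that don't exist there.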

5. Make sure we don't install development tools (GNU compilers, JDK) on the login nodes

As we need to allow the use of scp and other transfer tools, we can't prevent the execution of everything in /bin.  Instead, we simply try to minimize the software a user could potentially use to abuse the login node.


A layered approach of education and reducing the potential ways a user can abuse our login nodes has been working for us for the past couple of years.  If we do begin to see more login-node abuse, we would probably layer on cgroups to limit memory and CPU usage.
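On a systemd-based login node, per-user cgroup caps can be applied with a drop-in for the user slice template; a minimal sketch (the values here are illustrative, not a recommendation):

```shell
# Sketch: cap every user session's CPU and memory on a login node
# via systemd cgroups (values are illustrative). Run as root.
mkdir -p /etc/systemd/system/user-.slice.d
cat > /etc/systemd/system/user-.slice.d/50-limits.conf <<'EOF'
[Slice]
# At most 2 CPUs' worth of time and 8 GiB of RAM per user
CPUQuota=200%
MemoryMax=8G
EOF
systemctl daemon-reload
```

Because the limit sits on `user-.slice`, it applies per user across all of that user's sessions, which is usually what you want on a shared login node.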


Thanks,
David

Date: Wed, 19 May 2021 19:00:38 +0300
From: Alan Orth <alan.o...@gmail.com>
To: Ole Holm Nielsen <ole.h.niel...@fysik.dtu.dk>, Slurm User
Community List <slurm-users@lists.schedmd.com>
Subject: Re: [slurm-users] What is an easy way to prevent users run
programs on the master/login node.
Message-ID:
<CAKKdN4U460M0mNtS=b_8qsbbpwzkzp+bqnoqdvkih0z_b1z...@mail.gmail.com>
Content-Type: text/plain; charset="utf-8"

Regarding setting limits for users on the head node. We had this for years:

# CPU time in minutes
* - cpu 30
root - cpu unlimited

But we eventually found that this was even causing long-running jobs like
rsync/scp to fail when users were copying data to the cluster. For a while
I blamed our network people, but then I did some tests and found that it
was the limits that were responsible. I have removed this and other limits
for now but I ruthlessly kill heavy processes that my users run on there. I
will look into using cgroups on the head node.

Cheers,

On Sat, Apr 24, 2021 at 11:05 AM Ole Holm Nielsen <
ole.h.niel...@fysik.dtu.dk> wrote:

On 24-04-2021 04:37, Cristóbal Navarro wrote:
Hi Community,
I have a set of users still not so familiar with Slurm, and yesterday
they bypassed srun/sbatch and just ran their CPU program directly on the
head/login node, thinking it would still run on the compute node. I am
aware that I will need to teach them some basic usage, but in the
meanwhile, how have you solved this type of user-behavior problem? Is
there a preferred way to restrict the master/login resources, or
actions, to the regular users?
We restrict user limits in /etc/security/limits.conf so users can't run
very long or very big tasks on the login nodes:

# Normal user limits
* hard cpu 20
* hard rss 50000000
* hard data 50000000
* soft stack 40000000
* hard stack 50000000
* hard nproc 250

/Ole
