Hi Chris,

You mentioned “But trials using this do not seem to be fruitful so far.” 
Why not?

In our job_submit.lua there is:

    if job_desc.shared == 0 then
      slurm.user_msg("exclusive access is not permitted with GPU jobs.")
      slurm.user_msg("Remove '--exclusive' from your job submission script")
      return ESLURM_NOT_SUPPORTED
    end

and testing:

$ srun --exclusive --time 00:10:00 --gres gpu:1 --pty /bin/bash -i
srun: error: exclusive access is not permitted with GPU jobs.
srun: error: Remove '--exclusive' from your job submission script
srun: error: Unable to allocate resources: Requested operation is presently disabled

In slurm.h the job_descriptor struct has:

        uint16_t shared;        /* 2 if the job can only share nodes with other
                                 *   jobs owned by that user,
                                 * 1 if job can share nodes with other jobs,
                                 * 0 if job needs exclusive access to the node,
                                 * or NO_VAL to accept the system default.
                                 * SHARED_FORCE to eliminate user control. */

If there’s a case where using “.shared” isn’t working please let us know.
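
For the use case in the original question (steering "--exclusive" jobs away 
from certain nodes rather than rejecting them), the same ".shared" test could 
set the job's exclude list instead. This is only a rough sketch, not something 
we run: the node names are made up, and it assumes your Slurm version lets the 
Lua plugin read and write job_desc.exc_nodes. The small "slurm" stub is there 
only so the snippet can be tried with a plain Lua interpreter; slurmctld 
provides the real table.

```lua
-- Sketch only. Inside slurmctld the global `slurm` table already exists;
-- this stub is used only when running the file with a standalone Lua.
slurm = slurm or { SUCCESS = 0 }

-- Hypothetical node list: replace with your own GPU/highmem node names.
local PROTECTED_NODES = "gpu[01-04],highmem[01-02]"

function slurm_job_submit(job_desc, part_list, submit_uid)
  -- shared == 0 means the job asked for exclusive node access
  -- (see the slurm.h comment quoted above).
  if job_desc.shared == 0 then
    local existing = job_desc.exc_nodes
    if existing == nil or existing == "" then
      job_desc.exc_nodes = PROTECTED_NODES
    else
      -- Keep whatever the user already passed to --exclude.
      job_desc.exc_nodes = existing .. "," .. PROTECTED_NODES
    end
  end
  return slurm.SUCCESS
end

function slurm_job_modify(job_desc, job_rec, part_list, modify_uid)
  return slurm.SUCCESS
end
```

As written this would also keep an exclusive job that genuinely asks for GPUs 
off the GPU nodes, so you would probably want an extra check on the job's 
GRES request before excluding anything.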

   -Greg


From: slurm-users <slurm-users-boun...@lists.schedmd.com> on behalf of 
Christopher Benjamin Coffey <chris.cof...@nau.edu>
Date: Saturday, 19 February 2022 at 3:17 am
To: slurm-users <slurm-users@lists.schedmd.com>
Subject: [EXTERNAL] [slurm-users] Can job submit plugin detect "--exclusive" ?

Hello!

The job_submit plugin doesn't appear to have a way to detect whether a user 
requested "--exclusive". Can someone confirm this? Going through the code: 
src/plugins/job_submit/lua/job_submit_lua.c I don't see anything related. 
Potentially "shared" could be possible in some way. But trials using this do 
not seem to be fruitful so far.

If a user requests --exclusive, I'd like to append "--exclude=<nodes>" on to 
their job request to keep them off of certain nodes. For instance, we have our 
gpu nodes in a default partition with a high priority so that jobs don't land 
on them until last. And this is the same for our highmem nodes. Normally this 
works fine, but if someone asks for "--exclusive" their job will land on these 
nodes quite often, unfortunately.

Any ideas? Of course, I could take these nodes out of the partition, yet I'd 
like to see if something like this would be possible.

Thanks! :)

Best,
Chris

--
Christopher Coffey
High-Performance Computing
Northern Arizona University
928-523-1167

