[slurm-users] Re: FairShare if there's only one account?

2024-08-10 Thread Ryan Cox via slurm-users
fairshare=parent makes the user association effectively compete at the account level, so this is behaving as intended.  It effectively ignores each user's usage when competing with others inside the same account.  That is not what you want.  Give them all the same numeric value, not parent.

[slurm-users] Re: Node (anti?) Feature / attribute

2024-06-14 Thread Ryan Cox via slurm-users
We did something like this in the past but from C.  However, modifying the features was painful if the user did any interesting syntax. What we are doing now is using --extra for that purpose.  The nodes boot up with SLURMD_OPTIONS="--extra {\\\"os\\\":\\\"rhel9\\\"}" or similar.  Users can re

[slurm-users] Re: memory high water mark reporting

2024-05-20 Thread Ryan Cox via slurm-users
causes a lot of overhead and in any case this seems to not be a sensible way to get values that should just be determined right at the end by an event rather than using polling. Many thanks, Emyr

[slurm-users] Re: Trouble Running Slurm C Extension Plugin

2024-04-09 Thread Ryan Cox via slurm-users
      fprintf(fp,"Hello!");         fclose(fp);         return SLURM_SUCCESS; } int job_modify(job_desc_msg_t *job_desc, job_record_t *job_ptr,                uint32_t submit_uid, char **err_msg) {         return SLURM_SUCCESS; } -- Ryan Cox Director Office of Research Computing Brig

Re: [slurm-users] Multifactor fair-share with single account

2024-01-04 Thread Ryan Cox
On 1/4/24 02:41, Kamil Wilczek wrote: On 4.01.2024 at 07:56, Loris Bennett wrote: Hi Kamil, Kamil Wilczek writes: Dear All, I have a question regarding the fair-share factor of the multifactor priority algorithm. My current understanding is that the fair-share makes sure that differe

Re: [slurm-users] Transport from SLC to Provo?

2023-08-14 Thread Ryan Cox
but apparently both bus and train have stopped running by that time on Sundays. Does anyone know about any alternative way to get to Provo on a Sunday night? -- Ryan Cox Director Office of Research Computing Brigham Young University

Re: [slurm-users] sbatch - accept jobs above limits

2022-02-09 Thread Ryan Cox
guration is not available These jobs should be accepted, if a suitable node will be active soon. For example, these jobs could be in PartitionConfig. Is that configurable? Many thanks, Mike -- Ryan Cox Director Office of Research Computing Brigham Young University

Re: [slurm-users] How to limit # of execution slots for a given node

2022-01-07 Thread Ryan Cox
David, There are several possible answers depending on what you hope to accomplish.  What exactly is the issue that you're trying to solve? Do you mean that you have users who need, say, 8 GB of RAM per core but you only have 4 GB of RAM per core on the system and you want a way to account fo

Re: [slurm-users] job_container.conf:: how to adopt a autofs base mount point

2021-12-02 Thread Ryan Cox
t, at any point before the actual job, if a apply a --make-rshared on /cvmfs, autofs when will mount something within will reset this attribute. is there a way to tell slurmstepd to somehow adopt and keep this mountpoint no matter what is mounted within? Thank you! Adrian -- Ryan Cox Dir

Re: [slurm-users] How to avoid a feature?

2021-07-01 Thread Ryan Cox
ideas how to do that? Submit LUA perhaps? Brian Andrus -- Ryan Cox Director Office of Research Computing Brigham Young University

Re: [slurm-users] Exposing only requested CPUs to a job on a given node.

2021-05-14 Thread Ryan Cox
processing.cpu_count()))     for i in range(multiprocessing.cpu_count()):         p = multiprocessing.Process(target=worker, name=i).start() Thanks, -- Luis R. Torres -- Ryan Cox Director Office of Research Computing Brigham Young University
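As background for the question quoted above: multiprocessing.cpu_count() reports the CPUs in the whole machine, not the CPUs the current process is allowed to run on. The following is a minimal sketch of one alternative, assuming a Linux node where the job's CPU restriction shows up in the process affinity mask (whether it does depends on how the site has configured Slurm task binding or cgroups). The helper name usable_cpu_count is hypothetical and is not taken from the thread.

```python
import multiprocessing
import os


def usable_cpu_count():
    """Return the number of CPUs this process may actually run on.

    os.sched_getaffinity is only available on some platforms (e.g. Linux),
    so fall back to the machine-wide count elsewhere.
    """
    try:
        # Set of CPU ids the current process (pid 0 = self) is bound to.
        return len(os.sched_getaffinity(0))
    except AttributeError:
        # Assumption: no affinity API here, so use the whole-machine count.
        return multiprocessing.cpu_count()


if __name__ == "__main__":
    print("machine-wide CPUs:", multiprocessing.cpu_count())
    print("CPUs usable by this process:", usable_cpu_count())
```

The result could then be used to size a worker pool, for example multiprocessing.Pool(processes=usable_cpu_count()), instead of looping over multiprocessing.cpu_count().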

Re: [slurm-users] FairShare

2020-12-02 Thread Ryan Cox
                  xping            1    0.050000           0    0.00      0.00 1.9833e+24 0.090909 -- Ryan Cox Director Office of Research Computing Brigham Young University

Re: [slurm-users] FairShare

2020-12-02 Thread Ryan Cox
the rightmost side. *From:* slurm-users on behalf of Ryan Cox *Sent:* Wednesday, December 2, 2020 10:31 AM *To:* Slurm User Community List ; Micheal Krombopulous *Subject:* Re: [slurm-users] FairShare It's really s

Re: [slurm-users] FairShare

2020-12-02 Thread Ryan Cox
It's really similar to a binary search tree.  Within each account, the Level FS is calculated as Shares / Usage.  https://slurm.schedmd.com/SUG14/fair_tree.pdf has more details, starting at page 34 or so.  It even has an "animation". Ryan On 12/2/20 10:22 AM, Micheal Krombopulous wrote:
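The Shares / Usage calculation mentioned above can be sketched roughly as follows. This is an illustration only, not Slurm's implementation: it assumes that shares and usage are each normalized among the sibling associations before dividing, and that an association with zero usage gets an infinite Level FS. The function name level_fs and the sample users and numbers are made up for the example.

```python
import math


def level_fs(shares, usage):
    """Return {name: Level FS} for one set of sibling associations.

    shares and usage are dicts keyed by association name. Both are
    normalized among the siblings (an assumption of this sketch).
    """
    total_shares = sum(shares.values())
    total_usage = sum(usage.values())
    result = {}
    for name in shares:
        s = shares[name] / total_shares
        u = usage[name] / total_usage if total_usage else 0.0
        # Assumption: no usage at all sorts ahead of everyone else.
        result[name] = math.inf if u == 0 else s / u
    return result


# Hypothetical siblings inside a single account.
sibling_shares = {"alice": 1, "bob": 1, "carol": 2}
sibling_usage = {"alice": 100.0, "bob": 300.0, "carol": 400.0}

fs = level_fs(sibling_shares, sibling_usage)
# Higher Level FS means the sibling has used less than its share.
ranking = sorted(fs, key=fs.get, reverse=True)
print(fs)
print(ranking)
```

In this sample, alice holds a quarter of the shares but only an eighth of the usage, so she sorts first; bob has the same shares but three times the usage, so he sorts last.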

Re: [slurm-users] FairShare

2020-12-02 Thread Ryan Cox
Micheal, Details are at https://slurm.schedmd.com/fair_tree.html . If they have the same shares and usage as each other, they will have the same fair share value.  One thing to keep in mind is that sshare rounds or truncates the values, so 0.00 do
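The point about rounded display can be illustrated with a short sketch. It shows fixed-precision formatting in general and does not reproduce sshare's actual formatting code; the helper name displayed and the sample values are hypothetical.

```python
def displayed(value, places=2):
    """Format a value with a fixed number of decimal places."""
    return f"{value:.{places}f}"


# Two different, non-zero values (made up for this example).
small_a = 0.004
small_b = 0.0001

print(displayed(small_a))  # shown with two decimals
print(displayed(small_b))  # shown with two decimals
print(small_a == small_b)  # the underlying values still differ
```

Both values print as 0.00 at two decimal places even though neither is zero and they are not equal, so a displayed 0.00 says little about the exact value behind it.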

Re: [slurm-users] Nodes going into drain because of "Kill task failed"

2020-07-22 Thread Ryan Cox
troubleshooting. Regards, Angelos (Sent from mobile, please pardon me for typos and cursoriness.) On 2020/07/23 0:41, Ryan Cox wrote:  Ivan, Are you having I/O slowness? That is the most common cause for us. If it's not that, you'll want to look through all the reasons that it takes a l

Re: [slurm-users] Nodes going into drain because of "Kill task failed"

2020-07-22 Thread Ryan Cox
Ivan, Are you having I/O slowness? That is the most common cause for us. If it's not that, you'll want to look through all the reasons that it takes a long time for a process to actually die after a SIGKILL because one of those is the likely cause. Typically it's because the process is waiting

Re: [slurm-users] Is that possible to submit jobs to a Slurm cluster right from a developer's PC

2019-12-12 Thread Ryan Cox
Be careful with this approach.  You also need the same munge key installed everywhere.  If the developers have root on their own system, they can submit jobs and run Slurm commands as any user. ssh sounds significantly safer.  A quick and easy way to make sure that users don't abuse the system

Re: [slurm-users] can't get fairshare to be calculated per partition

2019-10-30 Thread Ryan Cox
Fairshare is calculated based on an "association".  If you look in the manpage for sacctmgr under ENTITIES, you will see:    association   The entity used to group information consisting of four parameters: account, cluster, partition (optional), and user. Users can have en

Re: [slurm-users] SGE to Slurm functionality not supported

2018-08-30 Thread Ryan Cox
sbatch --wrap="command --args" is similar to what you're looking for. Ryan On 08/30/2018 09:12 AM, Anson Abraham wrote: In Sun Grid Engine, there's an option (parameter) of -b "Gives the user the possibility to indicate explicitly whether command should be treated as binary or script. If the v

Re: [slurm-users] What's the best way to suppress core dump files from jobs?

2018-03-21 Thread Ryan Cox
guess this counts now as being documented in a public place! Obviously, UsePAM and the /etc/pam.d/slurm rules ought to be documented clearly somewhere, but I'm not aware of any good description. /Ole -- Ryan Cox Operations Director Fulton Supercomputing Lab Brigham Young University

Re: [slurm-users] Disable Account Limits Per Partition?

2018-02-22 Thread Ryan Cox
unning as well. This doesn't work well with how we grant hours and track things, but certainly will do this if it's my only option. Any advice? -- Thanks. John -- Ryan Cox Operations Director Fulton Supercomputing Lab Brigham Young University

Re: [slurm-users] How to deal with user running stuff in frontend node?

2018-02-15 Thread Ryan Cox
k so elegant as other users can perform the same abuse in the future, and he should also be able to run low cpu-consuming jobs for a longer period. However I am not an experienced sysadmin so I am completely open to suggestions or different ways of facing this issue. Any thoughts? chee