[slurm-users] Insert separating characters into sacct formatted output

2021-02-09 Thread SJTU
Hi, I am using SLURM 19.05.7. Is it possible to insert user-defined separating characters like "|" or "," into sacct's formatted output? That would make it easier to parse fields. Thank you! Jianwen
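For context, sacct provides `--parsable2` (`-P`), which separates fields with "|" and omits the trailing delimiter, and `--delimiter` to choose a different separator. The sketch below shows one way such output could be parsed in Python. The sample text and the job data in it are made up for illustration, and `parse_sacct` is a hypothetical helper name, not part of SLURM.

```python
import csv
import io

# Hypothetical sample of what
#   sacct --parsable2 --format=JobID,JobName,State,Elapsed
# might print. The job data is invented for illustration.
SAMPLE = """JobID|JobName|State|Elapsed
1001|train|COMPLETED|00:10:05
1001.batch|batch|COMPLETED|00:10:05
1002|align|FAILED|00:00:42
"""


def parse_sacct(text, delimiter="|"):
    """Parse delimiter-separated sacct output into a list of dicts keyed by header."""
    reader = csv.DictReader(
        io.StringIO(text), delimiter=delimiter, quoting=csv.QUOTE_NONE
    )
    return [dict(row) for row in reader]


rows = parse_sacct(SAMPLE)

# Collect the IDs of all records whose State field is FAILED.
failed = [row["JobID"] for row in rows if row["State"] == "FAILED"]
print(failed)
```

With real data, the text passed to `parse_sacct` would come from running sacct, for example through `subprocess.run`, and the `delimiter` argument would have to match whatever was passed to `--delimiter`.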

[slurm-users] Probing CPU and memory usage via seff on running jobs

2020-11-29 Thread SJTU
Hi, Is it possible to probe CPU and memory usage via seff on running jobs? Thank you! Jianwen

[slurm-users] Set a random offset when starting node health check in SLURM

2020-11-26 Thread SJTU
Hi, We use HealthCheckProgram = /usr/sbin/nhc in slurm to check node health every 600 seconds. However, some NHC checks point to the same central resource, so starting these checks simultaneously may lead to false alarms of service degradation. Is it possible to set a random offset to when
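One possible way to spread the checks out, sketched here as an assumption and not as a confirmed SLURM or NHC feature, is to point HealthCheckProgram at a small wrapper that waits for a per-node delay before calling nhc. The function below derives a stable delay from the hostname, so each node always gets the same offset. The name `health_check_offset`, the 60-second window, and the node names are all hypothetical.

```python
import hashlib


def health_check_offset(hostname, max_offset_seconds=60):
    """Map a hostname to a stable delay in the range [0, max_offset_seconds)."""
    digest = hashlib.sha256(hostname.encode("utf-8")).digest()
    # Use the first 8 bytes of the hash as an integer, then fold it into the window.
    return int.from_bytes(digest[:8], "big") % max_offset_seconds


# Hypothetical node names, only to show that each node gets its own offset.
for node in ("node001", "node002", "node003"):
    print(node, health_check_offset(node))
```

A wrapper would sleep for this many seconds and then run /usr/sbin/nhc. Whether the added delay is acceptable depends on how long slurmd allows the health check program to run, which should be checked against the local configuration.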

[slurm-users] Raise the priority of a certain kind of jobs

2020-11-12 Thread SJTU
Hello, We want to raise the priority of a certain kind of slurm job. We considered doing it in Prolog, but Prolog seems to run only at job start time, so it may not be useful for queued jobs. Is there any way to do this? Thank you, and I look forward to your reply. Best, Jianwen

[slurm-users] Limit usage outside reservation

2020-10-20 Thread SJTU
Hi, We reserved compute node resources on SLURM for specific users and hope they will make good use of them. But in some cases users forget the '--reservation' parameter in job scripts and compete with other users outside the reserved nodes. Is there a recommended way to limit users' usage *OUTSIDE

[slurm-users] How does SLURM calculate StartTime for pending jobs

2020-10-10 Thread SJTU
Hi, `scontrol show jobid xxx` shows SLURM's estimate of StartTime for a pending job. I wonder where I can find the code that computes StartTime. Thank you! Jianwen

[slurm-users] How to set association factor in Multifactor Priority

2020-09-23 Thread SJTU
Hi, I found that a new "Association Factor" is introduced in 19.05 to be part of Job_priority calculation. Can I set it for each SLURM account so job priority can be differentiated based on job accounts? https://groups.google.com/g/slurm-users/c/nzF8jOPZI_w/m/vj2wkUryBgAJ

[slurm-users] Mocking SLURM to debug job_submit.lua

2020-09-23 Thread SJTU
Hi, Modifying and testing job_submit.lua on a production SLURM system may lead to temporary failure of job submission, which keeps new scheduling strategies from being applied. Is it possible to mock a SLURM system to debug job_submit.lua so that it can be updated to the production system confident

Re: [slurm-users] [Support] sprio prints an incomplete list of pending jobs

2020-09-16 Thread SJTU
The pending jobs missing from the `sprio` output had their priority set manually beforehand. I think that explains why they disappear. Best, Jianwen > On Sep 16, 2020, at 4:00 PM, SJTU wrote: > > Hi, > >I’m using sprio of SLURM 19.05 to inspect job queuing on my cluster. I >

Re: [slurm-users] [Support] SLURM launching jobs onto nodes with suspended jobs may lead to resource contention

2020-09-16 Thread SJTU
d. > And issue the SIGSTOP or SIGCONT. > > Frankly I wish suspend didn't work like this. It should work where it > suspends the job and does not release the cpus but keeps them reserved. > That's the natural understanding of suspend, but that's not the way suspend

[slurm-users] SLURM launching jobs onto nodes with suspended jobs may lead to resource contention

2020-09-16 Thread SJTU
Hi, I am using SLURM 19.05 and found that SLURM may launch jobs onto nodes with suspended jobs, which leads to resource contention once the suspended jobs are resumed. Steps to reproduce this issue are: 1. Launch 40 one-core jobs on a 40-core compute node. 2. Suspend all 40 jobs on that comp

[slurm-users] sprio prints an incomplete list of pending jobs

2020-09-16 Thread SJTU
Hi, I’m using sprio of SLURM 19.05 to inspect job queuing on my cluster. I found that sprio prints an incomplete list of pending jobs, far fewer than those from `squeue --state=pending`. No extra options seem to be available for sprio. I would appreciate any suggestions. Thank you! Jianwen [root@
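To pin down which pending jobs are absent from the sprio listing, the two job ID lists can be compared as sets. This is a minimal sketch, assuming each command's output has already been reduced to one job ID per line (for example with the `-h` and `-o "%i"` options of squeue and sprio). The job IDs below are invented and `parse_job_ids` is a hypothetical helper.

```python
def parse_job_ids(text):
    """Return the set of job IDs from one-ID-per-line text, skipping blank lines."""
    return {line.strip() for line in text.splitlines() if line.strip()}


# Hypothetical job IDs standing in for the output of
# `squeue --state=pending` and `sprio`, reduced to the ID column.
SQUEUE_PENDING = """2001
2002
2003
2004
"""

SPRIO_LISTED = """2001
2003
"""

# Jobs that squeue reports as pending but sprio does not list.
missing = sorted(parse_job_ids(SQUEUE_PENDING) - parse_job_ids(SPRIO_LISTED))
print(missing)
```

The resulting IDs can then be inspected one at a time with `scontrol show jobid` to look for something they have in common.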