from:"Michael DiDomenico"

[slurm-users] Re: Job running slower when using Slurm

2025-04-23 Thread Michael DiDomenico via slurm-users

the program probably says 32 threads because it's just looking at the box, not at what slurm cgroups allow for cpu (assuming you're using them). i think for an openmp program (not openmpi) you definitely want the first command with --cpus-per-task=32. are you measuring the runtime inside the program or
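A minimal sketch of the difference described above, not something from the thread itself: it compares the CPU count of the whole box with the set of CPUs the current process may run on. It assumes Linux and assumes the Slurm limit is enforced through CPU affinity or a cpuset; the helper name `cpu_counts` is made up for illustration.

```python
import os


def cpu_counts():
    """Return (cpus on the box, cpus this process may actually run on)."""
    # os.cpu_count() reports the machine-wide count, which is what a program
    # that "just looks at the box" would see.
    total = os.cpu_count() or 1
    # sched_getaffinity is Linux-only; fall back to the box-wide count elsewhere.
    if hasattr(os, "sched_getaffinity"):
        allowed = len(os.sched_getaffinity(0))
    else:
        allowed = total
    return total, allowed


if __name__ == "__main__":
    total, allowed = cpu_counts()
    print(f"cpus on the box: {total}")
    print(f"cpus this process is allowed to use: {allowed}")
```

Run inside a job step, the two numbers would be expected to differ whenever the job is confined to fewer cores than the node has.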

[slurm-users] Re: Job running slower when using Slurm

2025-04-23 Thread Michael DiDomenico via slurm-users

without knowing anything about your environment, it's reasonable to suspect that maybe your openmp program is multi-threaded, but slurm is constraining your job to a single core. evidence of this should show up when running top on the node, watching the cpu% used for the program On Wed, Apr 23, 20

[slurm-users] Re: Job not starting

2024-12-10 Thread Michael DiDomenico via slurm-users

you don't need to be a subscriber to search bugs.schedmd.com On Tue, Dec 10, 2024 at 9:44 AM Davide DelVento via slurm-users wrote: > > Good sleuthing. > > It would be nice if Slurm would say something like > Reason=Priority_Lower_Than_Job_ so people will immediately find the > culprit in s

[slurm-users] sbatch and --nodes

2024-05-31 Thread Michael DiDomenico via slurm-users

it's friday and i'm either doing something silly or have a misconfig somewhere, i can't figure out which. when i run sbatch --nodes=1 --cpus-per-task=1 --array=1-100 --output test_%A_%a.txt --wrap 'uname -n' sbatch doesn't seem to be adhering to the --nodes param. when i look at my output files i

Re: [slurm-users] sacct runtime performance varies on job status codes

2023-09-01 Thread Michael DiDomenico

i can't directly answer your question, but i suspect there's a missing index somewhere. what i would do is turn on the mysql query log and look at the sql and explain plan associated. it's also possible that since you're a few revs behind it's already been fixed in a later version, so you coul

Re: [slurm-users] stopping job array after N failed jobs in row

2023-08-02 Thread Michael DiDomenico

On Tue, Aug 1, 2023 at 3:27 PM Daniel Letai wrote: > The other OTHER approach might be to use some epilog (or possibly > epilogslurmctld) to log exit codes for first 20 tasks in each array, and > cancel the array if non-zero. This is a global approach which will affect all > job arrays, so migh
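A rough sketch of the decision logic behind the thread's subject (stop an array after N failed tasks in a row). This is not the epilog script from the thread: it only covers the check itself, and the function name `should_cancel` and its parameters are hypothetical. How exit codes get logged, and how the array is then cancelled (for example with scancel), is left out.

```python
def should_cancel(exit_codes, max_failures_in_row):
    """Return True once max_failures_in_row consecutive non-zero exit codes appear.

    exit_codes: exit codes of finished array tasks, in completion order.
    """
    run = 0  # length of the current streak of failed tasks
    for code in exit_codes:
        # A zero exit code resets the streak; anything else extends it.
        run = run + 1 if code != 0 else 0
        if run >= max_failures_in_row:
            return True
    return False


if __name__ == "__main__":
    print(should_cancel([0, 1, 1, 1], 3))
    print(should_cancel([1, 0, 1, 0, 1], 2))
```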

Re: [slurm-users] slurm sinfo format memory

2023-07-21 Thread Michael DiDomenico

another option besides those mentioned would be to frontend sinfo with jq/python and parse the data through json/yaml On Thu, Jul 20, 2023 at 12:28 PM Arsene Marian Alain wrote: > > > > Dear slurm users, > > > > I would like to see the following information of my nodes "hostname, total > mem, fr
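A small sketch of the "parse it through json" idea, under loud assumptions: the JSON layout below (a top-level `nodes` list with `hostname`, `real_memory` and `free_memory` keys) is hypothetical sample data, not the documented output of any particular sinfo version, and the real structure may differ between Slurm releases. The function name `node_memory_rows` is also made up.

```python
import json

# Hypothetical sample, standing in for whatever JSON the site's sinfo emits.
SAMPLE = (
    '{"nodes": ['
    '{"hostname": "node01", "real_memory": 192000, "free_memory": 150000}'
    "]}"
)


def node_memory_rows(sinfo_json_text):
    """Turn JSON text into (hostname, total memory, free memory) tuples."""
    data = json.loads(sinfo_json_text)
    rows = []
    for node in data.get("nodes", []):
        # Key names are assumptions; adjust them to the actual output.
        rows.append(
            (node.get("hostname"), node.get("real_memory"), node.get("free_memory"))
        )
    return rows


if __name__ == "__main__":
    for host, total_mem, free_mem in node_memory_rows(SAMPLE):
        print(host, total_mem, free_mem)
```

Keeping the parsing in a function that takes text makes it easy to try against a saved output file before wiring it to the live command.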

7 matches
