Hello all,
I've used the "scontrol write batch_script" command to output the job
submission script from completed jobs in the past, but for some reason, no
matter which job I specify, it tells me it is invalid. Any way to
troubleshoot this? Alternatively, is there another way - even if a manual
da...
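In case it helps others searching the archives, the two retrieval paths I
know of look roughly like this (12345 is a placeholder job ID):

    # Works while slurmctld still knows about the job:
    scontrol write batch_script 12345 job12345.sh

    # For completed jobs, pull the script from the accounting database
    # instead; requires AccountingStoreFlags=job_script in slurm.conf:
    sacct -j 12345 --batch-script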
Hello Daniel,
In my experience, if you have a high-speed interconnect such as IB, you
would do IPoIB. You would likely still have a "regular" Ethernet connection
for management purposes, and yes that means both an IB switch and an
Ethernet switch, but that switch doesn't have to be anything special...
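As a rough sketch (assuming NetworkManager; the interface name and
addresses are made up), the IPoIB side can be brought up independently of
the management Ethernet:

    # Hypothetical IPoIB interface; ib0 and 10.10.0.0/24 are placeholders:
    nmcli connection add type infiniband con-name ib0 ifname ib0 \
        ipv4.method manual ipv4.addresses 10.10.0.5/24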
Hello Thomas,
I know I'm a few days late to this, so I'm wondering whether you've made
any progress. We experience this, too, but in a different way.
First, though: you may already be aware, but you should use salloc rather
than srun --pty for an interactive session. That's been the preferred
method for...
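By way of illustration (the resource values are placeholders):

    # Older style, no longer recommended:
    srun --pty bash -i

    # Preferred: salloc makes the allocation and starts the interactive
    # step on the allocated node:
    salloc --nodes=1 --time=01:00:00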
Hello Matthew,
You may be aware of this already, but most sites would make these kinds of
checks/validations using job_submit.lua. I'm not an expert in that - though
plenty of others on this list are - but I'm positive you could implement
this type of validation logic. I'd like to say that I've co...
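I'm not in a position to share exact rules, but a minimal skeleton looks
something like this sketch (the time-limit check is a made-up example
policy, not a recommendation):

    -- /etc/slurm/job_submit.lua; requires JobSubmitPlugins=lua in slurm.conf
    function slurm_job_submit(job_desc, part_list, submit_uid)
        -- Hypothetical policy: reject jobs that don't set a time limit
        if job_desc.time_limit == slurm.NO_VAL then
            slurm.log_user("Please request a time limit with --time")
            return slurm.ERROR
        end
        return slurm.SUCCESS
    end

    function slurm_job_modify(job_desc, job_rec, part_list, modify_uid)
        return slurm.SUCCESS
    end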
As a related point, for this reason I mount /var/log separately from /. Ask
me how I learned that lesson...
Jason
On Tue, Apr 16, 2024 at 8:43 AM Jeffrey T Frey via slurm-users <
slurm-users@lists.schedmd.com> wrote:
> AFAIK, the fs.file-max limit is a node-wide limit, whereas "ulimit -n"
> is per-process...
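For anyone comparing the two on their own nodes:

    # Node-wide ceiling on open file handles:
    sysctl fs.file-max

    # Per-process soft and hard limits in the current shell:
    ulimit -Sn
    ulimit -Hn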
Hello all,
Each week, I generate an automated report of the top users by CPU hours.
This week, for whatever reason, the user root accounted for a massive
number of hours:
Login    Proper Name    Used    A...
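For reference, the report comes from something along these lines (the dates
are placeholders for the week in question):

    # Top 10 users by CPU hours for the week:
    sreport user topusage start=2024-04-22 end=2024-04-29 TopCount=10 -t hours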
...r user root in place?
>
> sreport accounts resources reserved for a user as well (even if not
> used by jobs) while sacct reports job accounting only.
>
> Best regards
> Jürgen
>
>
> * Jason Simms via slurm-users [240429 10:47]:
> > Hello all,
> >
> >
Hello all,
The Slurm docs have me a bit confused... I'm wanting to enable job
preemption on certain partitions but not others. I *presume* I would
set PreemptType=preempt/partition_prio globally, but then on the partitions
where I don't want jobs to be preemptible, I would set PreemptMode...
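Concretely, I'm picturing something like this sketch (partition names are
hypothetical, and whether per-partition PreemptMode=OFF is the right way to
exempt a partition is exactly what I'm unsure about):

    PreemptType=preempt/partition_prio
    PreemptMode=REQUEUE
    PartitionName=scavenge PriorityTier=1  PreemptMode=REQUEUE Nodes=node[01-10]
    PartitionName=owners   PriorityTier=10 PreemptMode=OFF     Nodes=node[01-10]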
On the one hand, you say you want "to *allocate a whole node* for a single
multi-threaded process," but on the other you say you want to allow it
to "*share
nodes* with other running jobs." Those seem like mutually exclusive
requirements.
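For contrast, the two requests would look roughly like this (the CPU counts
are placeholders):

    # A whole node reserved for one multi-threaded process:
    sbatch --exclusive --ntasks=1 --cpus-per-task=32 job.sh

    # Willing to share the node with other jobs:
    sbatch --ntasks=1 --cpus-per-task=8 job.sh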
Jason
On Thu, Aug 1, 2024 at 1:32 PM Henrique Almeida via slurm-users <
slurm-users@lists.schedmd.com> wrote:
I know this doesn't particularly help you, but for me on 23.11.6 it works
as expected and immediately drops me onto the allocated node. In answer to
your question, yes, as I understand it the default/expected behavior is to
return the shell directly.
Jason
On Thu, Sep 5, 2024 at 8:18 AM Loris Bennett via slurm-users <
slurm-users@lists.schedmd.com> wrote:
Ours works fine, however, without the InteractiveStepOptions parameter.
JLS
On Thu, Sep 5, 2024 at 9:53 AM Carsten Beyer via slurm-users <
slurm-users@lists.schedmd.com> wrote:
> Hi Loris,
>
> we use Slurm 23.02.7 (production) and 23.11.1 (test system). Our config
> contains a second parameter, InteractiveStepOptions...
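For the archive, the pair of settings in question looks like this in
slurm.conf (the InteractiveStepOptions value shown is, as far as I know,
the built-in default):

    LaunchParameters=use_interactive_step
    InteractiveStepOptions="--interactive --preserve-env --pty $SHELL"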
Hello Patrick,
Yeah, I'd recommend upgrading, and I imagine most others will, too. I have
found that with Slurm upgrades are nearly mandatory, at least annually or
so, mostly because Slurm only supports upgrading across roughly two major
releases at a time; from anything much older you have to bootstrap through
intermediate versions. Not sure about the minus sign; ...
Hello all,
Apologies for the basic question, but is there a straightforward,
best-accepted method for using Slurm to report on which GPUs are currently
in use? I've done some searching and people recommend all sorts of methods,
including parsing the output of nvidia-smi (seems inefficient, especially...
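For context, the closest I've gotten with plain Slurm tooling is along
these lines (the node name is a placeholder):

    # GRES configured on each node:
    sinfo -N -o "%N %G"

    # GPUs currently allocated on one node (AllocTRES includes gres/gpu):
    scontrol show node gpu01 | grep -iE 'gres|alloctres'

    # Running jobs and their per-node GPU requests:
    squeue -t RUNNING -O jobid,username,tres-per-node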