Re: [slurm-users] Job Step Output Delay

2021-02-10 Thread Maria Semple
The larger cluster is using NFS. I can see how that could be related to the difference in behaviour between the clusters. The buffering behaviour is the same if I tail the file from the node running the job. The only thing that seems to change the behaviour is whether I use srun to create a job

Re: [slurm-users] Job Step Output Delay

2021-02-10 Thread Aaron Jackson
Is it being written to NFS? You say your local dev cluster is a single node. Is it the login node as well as the compute node? In that case I guess there is no NFS. The larger cluster will be using some sort of shared storage, so whichever shared file system you are using likely has caching. If you a

Re: [slurm-users] Job Step Output Delay

2021-02-10 Thread Maria Semple
Hi Sean, Thanks for your suggestion! Adding the -u flag does not seem to have an impact on whether data is buffered. I also tried adding stdbuf -o0 before the call to srun, to no avail. Best, Maria On Wed, Feb 10, 2021 at 4:30 AM Sean Maxwell wrote: > Hi Maria, > > Have you tried adding the -

[slurm-users] Job Step Output Delay

2021-02-10 Thread Tilman Schneider
Hi Maria, this seems related to srun's behavior around -u; from the official docs: *-u*, *--unbuffered* By default the connection between slurmstepd and the user launched application is over a pipe. The stdio output written by the application is buffered by the glibc until it is flushed or the output is
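
The pipe buffering described in the quoted documentation can be illustrated with a small sketch. This is not Slurm code and it does not use glibc: it relies on Python's own I/O layer, which behaves in a similar way (output to a non-terminal is block buffered). The names readable and demo are made up for this illustration, and the sketch assumes a Unix-like system, since select() on pipes is not available on Windows.

```python
import os
import select


def readable(fd, timeout=0.1):
    """Return True if there is data waiting to be read on fd."""
    ready, _, _ = select.select([fd], [], [], timeout)
    return bool(ready)


def demo():
    # A pipe stands in for the connection between the application
    # and whatever collects its output (assumption for illustration).
    read_fd, write_fd = os.pipe()

    # A text file object over a pipe is block buffered, because the
    # pipe is not a terminal.
    writer = os.fdopen(write_fd, "w")
    writer.write("step 1 done\n")

    # The line is still in the writer's user-space buffer, so the
    # reading side sees nothing yet.
    before = readable(read_fd)

    # An explicit flush pushes the buffered bytes into the pipe.
    writer.flush()
    after = readable(read_fd)

    data = os.read(read_fd, 1024).decode()
    writer.close()
    os.close(read_fd)
    return before, after, data


if __name__ == "__main__":
    print(demo())
```

In this sketch the reader only sees the line after the explicit flush. An application launched in a job step can behave the same way until it flushes, fills its buffer, or exits, which would be consistent with output appearing late.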

Re: [slurm-users] Job Step Output Delay

2021-02-10 Thread Sean Maxwell
Hi Maria, Have you tried adding the -u flag (specifies unbuffered) to your srun command? https://slurm.schedmd.com/srun.html#OPT_unbuffered Your description sounds like buffering, so this might help. Thanks, -Sean On Tue, Feb 9, 2021 at 6:49 PM Maria Semple wrote: > Hello all, > > I've noti

Re: [slurm-users] Job flexibility with cons_tres

2021-02-10 Thread Aaron Jackson
Similar problem in the cluster I look after. I have a job_submit script which adds certain nodes to the job's excluded nodes list based on each node's number of CPUs per GPU. This basically solved the problem with fragmentation entirely. The problem is that cons_tres seems to think (for example) tha

Re: [slurm-users] Job flexibility with cons_tres

2021-02-10 Thread Ansgar Esztermann-Kirchner
Hi Yair, thank you very much for your reply. I'll keep the points you make in mind while we're evolving our configuration toward something that can be called production-ready. A. -- Ansgar Esztermann Sysadmin Dep. Theoretical and Computational Biophysics http://www.mpibpc.mpg.de/grubmueller/esz