Could be this quote from the srun man page:
-u, --unbuffered
By default the connection between slurmstepd and the user launched application
is over a pipe. The stdio output written by the application is buffered by the
glibc until it is flushed or the output is set as unbuffered. See setbuf(3). I
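
To see the kind of pipe buffering the man page describes, here is a minimal sketch in Python. It is an analogy only: Python's own I/O layer does the buffering here, not glibc, and the child program and the arrival_spread helper are made up for illustration. The child prints three lines 0.3 seconds apart into a pipe, and the parent measures how far apart the lines arrive.

```python
import os
import subprocess
import sys
import time

# Hypothetical child program: prints three lines, 0.3 s apart.
CHILD = """
import time
for i in range(3):
    print("line", i, flush=FLUSH)
    time.sleep(0.3)
"""


def arrival_spread(flush):
    """Seconds between the first and last line reaching the reader."""
    # Drop PYTHONUNBUFFERED so the child uses its default pipe buffering.
    env = {k: v for k, v in os.environ.items() if k != "PYTHONUNBUFFERED"}
    code = CHILD.replace("FLUSH", str(flush))
    proc = subprocess.Popen([sys.executable, "-c", code],
                            stdout=subprocess.PIPE, env=env)
    stamps = []
    for _line in proc.stdout:
        # Record when each line becomes visible on the reading end.
        stamps.append(time.monotonic())
    proc.wait()
    return stamps[-1] - stamps[0]


if __name__ == "__main__":
    print("flushed after each line:", round(arrival_spread(True), 2), "s")
    print("default pipe buffering: ", round(arrival_spread(False), 2), "s")
```

With an explicit flush the lines should arrive roughly as they are printed. Without it they tend to arrive together when the child exits, which is the same symptom as output showing up in chunks.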
The larger cluster is using NFS. I can see how that could be related to the
difference of behaviours between the clusters.
The buffering behaviour is the same if I tail the file from the node
running the job. The only thing that seems to change the behaviour is
whether I use srun to create a job
Is it being written to NFS? You say your local dev cluster is a
single node. Is it the login node as well as the compute node? In that
case I guess there is no NFS. The larger cluster will be using some sort
of shared storage, so whichever shared file system you are using likely
has caching.
If you a
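
One way to take the writer side out of the picture is to flush and fsync after each write. This is a rough sketch, assuming the job script is Python and can be edited; write_durably is a hypothetical helper, not anything from Slurm. It will not defeat caching on the client that reads the file, so a tail on another node may still lag.

```python
import os
import tempfile


def write_durably(path, text):
    """Append text, then push it through both buffering layers."""
    with open(path, "a") as fh:
        fh.write(text)
        fh.flush()             # Python's buffer -> kernel
        os.fsync(fh.fileno())  # kernel -> storage (the server, on NFS)


if __name__ == "__main__":
    # Illustrative only: a temp directory stands in for the shared path.
    with tempfile.TemporaryDirectory() as tmp:
        log = os.path.join(tmp, "job.out")
        write_durably(log, "step 1 done\n")
        with open(log) as fh:
            print(fh.read(), end="")
```

If the delay persists with this in place, the buffering is more likely happening somewhere other than the application's own writes.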
Hi Sean,
Thanks for your suggestion!
Adding the -u flag does not seem to have an impact on whether data is
buffered. I also tried adding stdbuf -o0 before the call to srun, to no
avail.
Best,
Maria
On Wed, Feb 10, 2021 at 4:30 AM Sean Maxwell wrote:
> Hi Maria,
>
> Have you tried adding the -
15:47:12 -0800
> From: Maria Semple
> To: Slurm User Community List
> Subject: [slurm-users] Job Step Output Delay
>
> Hello all,
>
> I
Hi Maria,
Have you tried adding the -u flag (specifies unbuffered) to your srun
command?
https://slurm.schedmd.com/srun.html#OPT_unbuffered
Your description sounds like buffering, so this might help.
Thanks,
-Sean
On Tue, Feb 9, 2021 at 6:49 PM Maria Semple wrote:
> Hello all,
>
> I've noti
Hello all,
I've noticed an odd behaviour with job steps in some Slurm environments.
When a script is launched directly as a job, the output is written to file
immediately. When the script is launched as a step in a job, output is
written in ~30 second chunks. This doesn't happen in all Slurm
envir