Hi Maria,

This seems related to srun's behavior around -u. From the official documentation:

*-u*, *--unbuffered*
By default the connection between slurmstepd and the user launched application is over a pipe. The stdio output written by the application is buffered by the glibc until it is flushed or the output is set as unbuffered. See setbuf(3) <https://slurm.schedmd.com/setbuf.html>. If this option is specified the tasks are executed with a pseudo terminal so that the application output is unbuffered. This option applies to step allocations.
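
As a minimal sketch of how that option could be tried against the example from the original report: the snippet below recreates loop.sh and withsteps.sh (the file names from Maria's message) and adds --unbuffered to the srun line. It has not been run on a real cluster, and the submit commands are left as comments, so treat it as a starting point rather than a confirmed fix.

```shell
#!/bin/bash
# Sketch only: recreate the two scripts from the original report,
# with --unbuffered (-u) added to the srun call in withsteps.sh.

# loop.sh: prints one number per second (unchanged from the report).
cat > loop.sh <<'EOF'
#!/bin/bash
for i in {1..100}
do
    echo $i
    sleep 1
done
EOF

# withsteps.sh: launches loop.sh as a job step, now with -u.
cat > withsteps.sh <<'EOF'
#!/bin/bash
srun --unbuffered ./loop.sh
EOF

chmod +x loop.sh withsteps.sh

# On the cluster (not executed by this sketch):
#   sbatch withsteps.sh
#   tail -f slurm-<job #>.out
```

If the output then arrives line by line, buffering on the step's output path is the likely cause; if the ~30 second chunks remain, something else is involved.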

Hth
Tilman

Message: 2
> Date: Tue, 9 Feb 2021 15:47:12 -0800
> From: Maria Semple <ma...@rstudio.com>
> To: Slurm User Community List <slurm-users@lists.schedmd.com>
> Subject: [slurm-users] Job Step Output Delay
> Message-ID:
> <CAJON5fi+V6ok3TstSxJr=
> wj6+2rd2yr736-+mpvaby+pugy...@mail.gmail.com>
> Content-Type: text/plain; charset="utf-8"
>
> Hello all,
>
> I've noticed an odd behaviour with job steps in some Slurm environments.
> When a script is launched directly as a job, the output is written to file
> immediately. When the script is launched as a step in a job, output is
> written in ~30 second chunks. This doesn't happen in all Slurm
> environments, but if it happens in one, it seems to always happen. For
> example, on my local development cluster, which is a single node on Ubuntu
> 18, I don't experience this. On a large Centos 7 based cluster, I do.
>
> Below is a simple reproducible example:
>
> loop.sh:
> #!/bin/bash
> for i in {1..100}
> do
>     echo $i
>     sleep 1
> done
>
> withsteps.sh:
> #!/bin/bash
> srun ./loop.sh
>
> Then from the command line running sbatch loop.sh followed by tail -f
> slurm-<job #>.out prints the job output in smaller chunks, which appears to
> be related to file system buffering or the time it takes for the tail
> process to notice that the file has updated. Running cat on the file every
> second shows that the output is in the file immediately after it is emitted
> by the script.
>
> If you run sbatch withsteps.sh instead, tail-ing or repeatedly cat-ing the
> output file will show that the job output is written in a chunk of 30 - 35
> lines.
>
> I'm hoping this is something that is possible to work around, potentially
> related to an OS setting, the way Slurm was compiled, or a Slurm setting.
>
> --
> Thanks,
> Maria