Hello all.
I'm seeing (again) this weird issue.
The same executable, launched with 32 processes, crashes immediately,
while it runs flawlessly with only 30 processes.
The reported error is:
[str957-bl0-03:05271] *** Process received signal ***
[str957-bl0-03:05271] Signal: Segmentation fault (11)
[
You need to provide some hints! What we know so far:
1. What we see here looks like a backtrace from Open MPI/PMIx.
2. Your decision to address this to the Slurm mailing list suggests that you
think that Slurm might be involved.
3. You have something (a job? a program?) t