*From:* "slurm-users"
*Sent:* 16 July 2019 12:32
*To:* "Slurm User Community List"
*Subject:* Re: [slurm-users] Running pyMPI on several nodes
srun: error: Application launch failed: Invalid node name specified
Hearns Law. All batch system problems are DNS problems.
Seriously though - check out your name resolution both on
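
A minimal sketch of the kind of name-resolution check suggested here, assuming Python 3 is available on the machines involved. The function name `check_names` is hypothetical and not part of Slurm; it only asks the local resolver whether each node name maps to an address.

```python
import socket


def check_names(names, resolver=socket.gethostbyname):
    """Map each host name to its resolved IPv4 address, or None if lookup fails."""
    results = {}
    for name in names:
        try:
            results[name] = resolver(name)
        except OSError:
            # socket.gaierror and socket.herror are both subclasses of OSError.
            results[name] = None
    return results
```

Calling `check_names(["lxclient10", "lxclient11"])` (the node names that appear elsewhere in this thread) on the controller and on each compute node, then comparing the results, would show whether every machine resolves every node name, and to the same address.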
... implemented the proposed changes but still no luck.
Best regards,
Palle
> Hi,
>
> Thank you so much for your quick responses!
> It is much appreciated.
> I don't have access to the cluster until next week, but I’ll be sure to
> follow up on all of your suggestions and get back to you next week.
>
> Have a nice weekend!
> Best regards
> Pall
-----
*From:* "slurm-users"
*Sent:* 12 July 2019 17:37
*To:* "Slurm User Community List"
*Subject:* Re: [slurm-users] Running pyMPI on several nodes
Pär, by 'poking around' Chris means to use tools such as netstat and lsof.
Also I would look at ps -eaf --forest to make sure there are no 'orphaned'
jobs sitting on that compute node.
Having said that though, I have a dim memory of a classic PBSPro error
message which says something about a netw
On 12/7/19 7:39 am, Pär Lundö wrote:
Presumably, the first 8 tasks originate from the first node (in this
case lxclient11), and the other node (lxclient10) responds as
predicted.
That looks right, it seems the other node has two processes fighting
over the same socket and that's breakin
urmd/lxclient10_83.0´: No such file or directory
[2019-07-12T14:57:56.019][83.0] done with job
"
Best regards
Palle
From: "slurm-users"
Sent: 12 July 2019 08:46
To: "Slurm User Community List"
Subject: Re: [slurm-users] Running pyMPI
Have you tried
srun -N# -n# mpirun python3
Perhaps you have no MPI environment being set up for the processes? There was
no "--mpi" flag in your "srun" command and we don't know if you have a default
value for that or not.
> On Jul 12, 2019, at 10:28 AM, Chris Samuel wrote:
On 11/7/19 11:04 pm, Pär Lundö wrote:
It works fine running on a single node (with "-N1" instead of "-N2"), but
it is aborted or stopped when running on two nodes.
What is the error you get?
Does the same srun command but with "hostname" instead of Python work?
--
Chris Samuel : http://www
I am having trouble using or running a python-mpi program involving more than one
node. The python-mpi program is very simple,
do you think there's something unique about the python program?
(also, you mean mpi4py, right?)
Since authentication with Slurm is used via munge, do I need a passwordless
S
Please try something very simple such as a hello world program or
srun -N2 -n8 hostname
What is the error message which you have?
On Fri, 12 Jul 2019 at 07:07, Pär Lundö wrote:
>
> Hi there Slurm-experts!
> I am having trouble using or running a python-mpi program involving more than
> one node. The
My apologies. You do say that the Python program simply prints the rank, so
it is a hello world program.
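
For reference, a rank-printing hello world of the kind discussed here could look like the sketch below. The original program is not shown in this thread, so this assumes mpi4py as the MPI binding; the helper name `describe` and the file name `hello_rank.py` used in the note are hypothetical.

```python
import socket


def describe(rank, size, host):
    """Build the one-line report that each MPI task prints."""
    return f"rank {rank} of {size} on {host}"


def main():
    # Assumption: mpi4py is the Python MPI binding installed on the nodes.
    try:
        from mpi4py import MPI
    except ImportError:
        print("mpi4py is not installed; cannot run the MPI test")
        return
    comm = MPI.COMM_WORLD
    print(describe(comm.Get_rank(), comm.Get_size(), socket.gethostname()))


if __name__ == "__main__":
    main()
```

If Slurm and MPI are wired together correctly, a launch along the lines of `srun -N2 -n8 python3 hello_rank.py` would be expected to print eight distinct ranks spread over both nodes. Eight lines all reading "rank 0 of 1" would instead suggest that the tasks are not sharing an MPI environment.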
On Fri, 12 Jul 2019 at 07:45, John Hearns wrote:
> Please try something very simple such as a hello world program or
> srun -N2 -n8 hostname
>
> What is the error message which you have?
>
> On