Re: [slurm-users] Job dispatching policy

2019-04-30 Thread Mark Hahn
Also why aren't you using the Slurm commands to run things? Which command? srun or sbatch

Re: [slurm-users] Job dispatching policy

2019-04-30 Thread Mahmood Naderan
>Also why aren't you using the Slurm commands to run things? Which command? Regards, Mahmood

Re: [slurm-users] Job dispatching policy

2019-04-29 Thread Chris Samuel
On Monday, 29 April 2019 5:18:56 AM PDT Mahmood Naderan wrote: > [mahmood@rocks7 ~]$ rocks run host compute-0-1 "file > /state/partition1/ans190/v190/Framework/bin/Linux64/runwb2" Given that file says it's a shell script, try and run it with this to see what doesn't work: rocks run host compute

Re: [slurm-users] Job dispatching policy

2019-04-29 Thread Prentice Bisbal
I see two separate, unrelated problems here: Problem 1: Warning: untrusted X11 forwarding setup failed: xauth key data not generated What have you done to investigate this xauth problem further? I know there have been discussions about this problem in the past on this mailing list. Did you

Re: [slurm-users] Job dispatching policy

2019-04-29 Thread Mahmood Naderan
[mahmood@rocks7 ~]$ rocks run host compute-0-1 "file /state/partition1/ans190/v190/Framework/bin/Linux64/runwb2" Warning: untrusted X11 forwarding setup failed: xauth key data not generated /state/partition1/ans190/v190/Framework/bin/Linux64/runwb2: POSIX shell script, ASCII text executable [mahmoo

Re: [slurm-users] Job dispatching policy

2019-04-27 Thread Chris Samuel
On 27/4/19 2:20 am, Mahmood Naderan wrote: ./workbench.sh: line 4: /state/partition1/ans190/v190/Framework/bin/Linux64/runwb2: No such file or directory That doesn't look like it's related to Slurm to me. If the file itself exists, then my suspicion is that it's a script and the interpreter i
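
A script that exists can still fail with "No such file or directory" when the interpreter named on its `#!` line is absent, because the kernel reports the missing interpreter with that same error. A minimal Python sketch of a check for this, assuming it is run on the compute node itself; `shebang_interpreter` and `interpreter_missing` are hypothetical helper names, not part of Slurm or Rocks:

```python
import os


def shebang_interpreter(path):
    """Return the interpreter named on a script's #! line, or None."""
    with open(path, "rb") as fh:
        first = fh.readline().strip()
    if not first.startswith(b"#!"):
        return None  # not a script with an interpreter line
    parts = first[2:].split()
    return parts[0].decode() if parts else None


def interpreter_missing(path):
    """True when the script names an interpreter that does not exist here."""
    interp = shebang_interpreter(path)
    return interp is not None and not os.path.exists(interp)
```

Calling `interpreter_missing()` on the script path from the error message, on the node where the job ran, would show whether this explanation applies. Note that a line such as `#!/usr/bin/env bash` only checks `/usr/bin/env`, not the program that `env` goes on to look up.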

Re: [slurm-users] Job dispatching policy

2019-04-27 Thread Mahmood Naderan
>More constructively - maybe the list can help you get the X11 applications to run using Slurm. >Could you give some details please? For example, I cannot run this GUI program with salloc [mahmood@rocks7 ~]$ cat workbench.sh #!/bin/bash unset SLURM_GTIDS /state/partition1/ans190/v190/Framewor

Re: [slurm-users] Job dispatching policy

2019-04-24 Thread John Hearns
If those applications really cannot run under Slurm, I would suggest reserving a set of nodes for interactive use and disabling the Slurm daemon on them. Direct users to those nodes. More constructively, maybe the list can help you get the X11 applications to run using Slurm. Could yo

Re: [slurm-users] Job dispatching policy

2019-04-23 Thread Mahmood Naderan
Thanks for the info. The thing is that I don't want to mark the whole node as unhealthy. Assume the following scenario: compute-0-0 is running Slurm jobs and its system load is 15 (32 cores); compute-0-1 is running non-Slurm jobs and its system load is 25 (32 cores). Then a new Slurm job should be dispatched to com
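
The scenario above amounts to ranking nodes by system load relative to core count. A minimal Python sketch of that ranking, for illustration only: `Node`, `load_per_core` and `least_loaded` are hypothetical names rather than Slurm APIs, and the figures are the ones quoted in the message.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    load: float  # system load average, e.g. as reported by uptime
    cores: int


def load_per_core(node):
    """Normalise the load average by the number of cores."""
    return node.load / node.cores


def least_loaded(nodes):
    """Pick the node with the lowest load per core."""
    return min(nodes, key=load_per_core)


nodes = [
    Node("compute-0-0", load=15, cores=32),  # running Slurm jobs
    Node("compute-0-1", load=25, cores=32),  # running non-Slurm jobs
]
print(least_loaded(nodes).name)  # compute-0-0
```

This only expresses the desired ordering. It does not show how Slurm itself selects nodes, which depends on the scheduler configuration.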

Re: [slurm-users] Job dispatching policy

2019-04-23 Thread Prentice Bisbal
On 4/23/19 2:47 AM, Mahmood Naderan wrote: Hi, How can I change the job distribution policy? Since some nodes are running non-slurm jobs, it seems that the dispatcher isn't aware of system load. Therefore, it assumes that the node is free. I want to change the policy based on the system load

Re: [slurm-users] Job dispatching policy

2019-04-23 Thread Richard Randriatoamanana
Hi Mahmood, Try the LBNL Node Health Check tool. Nodes which are determined to be "unhealthy" can be marked as down or offline so as to prevent jobs from being scheduled or run on them. https://github.com/mej/nhc/blob/master/README.md#lbnl-node-health-check-nhc Regards, Richard @cnscfr -- Sent

[slurm-users] Job dispatching policy

2019-04-22 Thread Mahmood Naderan
Hi, How can I change the job distribution policy? Since some nodes are running non-slurm jobs, it seems that the dispatcher isn't aware of system load. Therefore, it assumes that the node is free. I want to change the policy based on the system load. Regards, Mahmood