Hi All,
I am trying to hold a job with scontrol, but the job is not held.
I do not understand why. Can anyone please explain the concepts of hold
and release, and suspend and resume?
Below are the steps I have tried.
[root@master ~]# cat test.sh
#!/bin/bash
#SBATCH -N 1
#SBATC
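
On the hold/release versus suspend/resume question above, a minimal sketch may help. In Slurm, `scontrol hold` applies to a pending job and keeps it in the queue until `scontrol release` is issued; it does not stop a job that is already running. `scontrol suspend` pauses a running job and `scontrol resume` continues it, and these normally need root or SlurmUser privileges. The helper `job_ctl`, the `DRY_RUN` switch, and job ID 1234 are assumptions made for illustration and are not part of Slurm.

```shell
#!/bin/bash
# Sketch only: job_ctl is a hypothetical wrapper, not a Slurm command.
# With DRY_RUN=1 (the default here) it prints the scontrol command
# instead of running it, so it can be tried without a cluster.
DRY_RUN="${DRY_RUN:-1}"

job_ctl() {
    local action="$1" jobid="$2"
    case "$action" in
        hold|release|suspend|resume) ;;
        *) echo "unknown action: $action" >&2; return 1 ;;
    esac
    if [ "$DRY_RUN" = "1" ]; then
        echo "scontrol $action $jobid"
    else
        scontrol "$action" "$jobid"
    fi
}

# 1234 is a placeholder job ID.
job_ctl hold 1234      # pending job stays queued and is not scheduled
job_ctl release 1234   # job becomes eligible for scheduling again
job_ctl suspend 1234   # running job is paused
job_ctl resume 1234    # suspended job continues
```

To confirm the effect on a real cluster, `squeue -j <jobid>` shows the job state and reason after each step.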
Hi All,
I have created accounts and users on a cluster, logged in as one of
my users, and submitted a job. The job completed, but no output file was
created. Can anyone help me with this?
[zain@smaster ~]$ sacctmgr list associations cluster=scluster
format=Account,Cluster,User,Fairshare tree
Hi all,
Please help us with the details of the scenarios below.
We have two nodes (smaster, snode) with 16 cores each, 32 cores in total, and
we have 3 departments with 4 students each.
1. How to create a cluster (e.g. hpcc)
2. How to create an account (D1) and add it to a specific cluster (hpcc).
3. How
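
For the first two questions above, a minimal sketch using `sacctmgr` could look like the following. It assumes slurmdbd is already running and that the cluster name matches `ClusterName` in slurm.conf. The `run` and `setup_accounting` helpers, the `DRY_RUN` switch, the account names D2 and D3, and the student login names are hypothetical; only D1 and hpcc come from the question.

```shell
#!/bin/bash
# Sketch only: helper functions and student names are placeholders.
# With DRY_RUN=1 (the default here) commands are printed, not executed.
DRY_RUN="${DRY_RUN:-1}"
CLUSTER="hpcc"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

setup_accounting() {
    local dept student
    # -i commits each change immediately instead of prompting.
    run sacctmgr -i add cluster "$CLUSTER"
    for dept in D1 D2 D3; do
        run sacctmgr -i add account "$dept" cluster="$CLUSTER"
        for student in 1 2 3 4; do
            # Names like D1_s1 are placeholders; use real login names.
            run sacctmgr -i add user "${dept}_s${student}" \
                account="$dept" cluster="$CLUSTER"
        done
    done
}

setup_accounting
```

The resulting hierarchy can then be inspected with `sacctmgr list associations cluster=hpcc`.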
Thanks for your input, *Loris Bennett*.
Now I am able to submit jobs on both nodes.
--
*Regards*
*Zain*
On Thu, Feb 4, 2021 at 5:04 PM Zainul Abiddin wrote:
> Hi All,
>
> Please help me to submit a simple job on master and compute nodes.
> Here are my commands:
>
> [root
Hi All,
Please help me to submit a simple job on master and compute nodes.
Here are my commands:
[root@smaster ~]# sinfo -Nl
Thu Feb 04 16:54:58 2021
NODELIST  NODES PARTITION STATE CPUS S:C:T MEMORY TMP_DISK WEIGHT AVAIL_FE REASON
smaster       1      hpc*  idle    4 4:1:1
PARTITION AVAIL TIMELIMIT NODES STATE NODELIST
debug     up    infinite      1 idle  snode
hpc*      up    infinite      1 idle  smaster
[root@smaster ~]#
Regards,
Zain
On Tue, Feb 2, 2021 at 6:35 PM Zainul Abiddin wrote:
> Hi All,
> I have done slurmdbd configuration and while i am try
Hi All,
I have configured slurmdbd, but when I try to run *sacct* I get the
error below.
[root@smaster ~]# sacct
sacct: error: slurm_persist_conn_open_without_init: failed to open
persistent connection to host:localhost:6819: Connection refused
sacct: error: S
)
MAC: sha1 (3)
ZIP: none (0)
UID: root (0)
GID: root (0)
LENGTH: 0
[root@smaster ~]#
Regards,
Zain
On Tue, Feb 2, 2021 at 6:05 PM Zainul Abiddin wrote:
> Hi,
> [root@smaster ~]# munge -n | unmunge
> STATUS: Success (0)
Hi All,
Please help me resolve this issue.
My compute node (snode) is in state UNKNOWN with Reason=NO NETWORK ADDRESS
FOUND.
Master node (smaster) :
[root@smaster ~]# cat /etc/slurm/slurm.conf
# slurm.conf file generated by configurator easy.html.
# Put this file on all nodes of your cluster.
# S
#
On Tue, Feb 2, 2021 at 6:00 PM Zainul Abiddin wrote:
> Hi All,
> I am new to Slurm and am trying to set up Slurm 20.11.2 on CentOS 7.
> My environment is a master node (smaster) plus a compute node (snode),
> and I am following
> https://www.slothparadise.com/how-to-install-slurm-on-centos-7-clus
Hi All,
I am new to Slurm and am trying to set up Slurm 20.11.2 on CentOS 7.
My environment is a master node (smaster) plus a compute node (snode),
and I am following
https://www.slothparadise.com/how-to-install-slurm-on-centos-7-cluster/
to set up Slurm on the master and compute nodes.
I have tried installing Mung