Dear All,

Preamble
----------
I want to form a simple cluster with three laptops:

abhi-Latitude-E6430            // This serves as the controller
abhi-Lenovo-ideapad-330-15IKB  // Compute node
abhi-HP-EliteBook-840-G2       // Compute node

Aim
-------------
I want to make use of the CPU, GPU, and RAM on all the machines when I execute Java or Python programs.

Implementation
------------------------
Now let us look at the slurm.conf on machine abhi-Latitude-E6430:

ClusterName=linux
ControlMachine=abhi-Latitude-E6430
SlurmUser=abhi
SlurmctldPort=6817
SlurmdPort=6818
AuthType=auth/munge
SwitchType=switch/none
StateSaveLocation=/tmp
MpiDefault=none
ProctrackType=proctrack/pgid
NodeName=abhi-Lenovo-ideapad-330-15IKB RealMemory=12000 CPUs=2
NodeName=abhi-HP-EliteBook-840-G2 RealMemory=14000 CPUs=2
PartitionName=debug Nodes=ALL Default=YES MaxTime=INFINITE State=UP

The same slurm.conf is copied to all the machines.

Observations
--------------------------------------
Now when I check the daemons, slurmd is running on both compute nodes and slurmctld is running on the controller:

abhi@abhi-HP-EliteBook-840-G2:~$ service slurmd status
● slurmd.service - Slurm node daemon
   Loaded: loaded (/lib/systemd/system/slurmd.service; enabled; vendor preset: enabled)
   Active: active (running) since Wed 2020-05-13 18:50:01 IST; 1min 49s ago
     Docs: man:slurmd(8)
  Process: 98235 ExecStart=/usr/sbin/slurmd $SLURMD_OPTIONS (code=exited, status=0/SUCCESS)
 Main PID: 98253 (slurmd)
    Tasks: 2
   Memory: 2.2M
   CGroup: /system.slice/slurmd.service
           └─98253 /usr/sbin/slurmd

abhi@abhi-Lenovo-ideapad-330-15IKB:~$ service slurmd status
● slurmd.service - Slurm node daemon
   Loaded: loaded (/lib/systemd/system/slurmd.service; enabled; vendor preset: enabled)
   Active: active (running) since Wed 2020-05-13 18:50:20 IST; 8s ago
     Docs: man:slurmd(8)
  Process: 71709 ExecStart=/usr/sbin/slurmd $SLURMD_OPTIONS (code=exited, status=0/SUCCESS)
 Main PID: 71734 (slurmd)
    Tasks: 2
   Memory: 2.0M
   CGroup: /system.slice/slurmd.service
           └─71734 /usr/sbin/slurmd

abhi@abhi-Latitude-E6430:~$ service slurmctld status
● slurmctld.service - Slurm controller daemon
   Loaded: loaded (/lib/systemd/system/slurmctld.service; enabled; vendor preset: enabled)
   Active: active (running) since Wed 2020-05-13 18:48:58 IST; 4min 56s ago
     Docs: man:slurmctld(8)
  Process: 97114 ExecStart=/usr/sbin/slurmctld $SLURMCTLD_OPTIONS (code=exited, status=0/SUCCESS)
 Main PID: 97116 (slurmctld)
    Tasks: 7
   Memory: 2.6M
   CGroup: /system.slice/slurmctld.service
           └─97116 /usr/sbin/slurmctld

However, sinfo on the controller lists only one node, and it is marked down:

abhi@abhi-Latitude-E6430:~$ sinfo
PARTITION AVAIL  TIMELIMIT  NODES  STATE NODELIST
debug*       up   infinite      1  down* abhi-Lenovo-ideapad-330-15IKB

Advice needed
------------------------
Please let me know why I am seeing only one node. Further, how is the total memory calculated? Can Slurm make use of GPU processing power as well?

Please let me know if I have missed something in the configuration or the explanation.

Thank you all.

Best Regards,
Abhinandan H. Patil
+919886406214
https://www.AbhinandanHPatil.info