Yeah, you'd think after all this time it would, but it remains a bit of arcane knowledge that's mostly passed on in oral history....
There are some things that the slurmd processes need to be restarted for, as well. I have a vague memory that changing the debug level is one...

On Mon, Jul 26, 2021 at 1:32 PM Jason Simms <sim...@lafayette.edu> wrote:

> Dear Samuel,
>
> Restarting slurmctld did the trick. Thanks! I should have thought to do
> that, but typically scontrol reconfigure picks up most changes.
>
> Warmest regards,
> Jason
>
> On Mon, Jul 26, 2021 at 12:55 PM Fulcomer, Samuel <
> samuel_fulco...@brown.edu> wrote:
>
>> ...and... you need to restart slurmctld when you change a NodeName line.
>> "scontrol reconfigure" doesn't do the trick.
>>
>> On Mon, Jul 26, 2021 at 12:49 PM Fulcomer, Samuel <
>> samuel_fulco...@brown.edu> wrote:
>>
>>> If you have a dual-root PCIe system you may need to specify the CPU/core
>>> affinity in gres.conf.
>>>
>>> On Mon, Jul 26, 2021 at 12:07 PM Jason Simms <sim...@lafayette.edu>
>>> wrote:
>>>
>>>> Hello all,
>>>>
>>>> I have a GPU node with 3 identical GPUs (we started with two and
>>>> recently added the third). Running nvidia-smi correctly shows that all
>>>> three are recognized. My gres.conf file has only this line:
>>>>
>>>> NodeName=gpu01 File=/dev/nvidia[0-2] Type=quadro_8000 Name=gpu Count=3
>>>>
>>>> And the relevant lines in slurm.conf are:
>>>>
>>>> NodeName=gpu01 Sockets=2 CoresPerSocket=16 ThreadsPerCore=1
>>>> RealMemory=189900 State=UNKNOWN Gres=gpu:quadro_8000:3
>>>>
>>>> As far as I can tell, all of this is fine (and we had no issues when we
>>>> only had the initial two GPUs in the system). However, now when I run sinfo
>>>> -o %G (which as I understand will report the total number of gres
>>>> resources available), this is the output:
>>>>
>>>> GRES
>>>> (null)
>>>> gpu:quadro_8000:2
>>>>
>>>> Is this saying that it doesn't recognize the third card? Any
>>>> suggestions? As always, thank you for your help!
>>>>
>>>> Warmest regards,
>>>> Jason
>>>>
>>>> --
>>>> *Jason L. Simms, Ph.D., M.P.H.*
>>>> Manager of Research and High-Performance Computing
>>>> XSEDE Campus Champion
>>>> Lafayette College
>>>> Information Technology Services
>>>> 710 Sullivan Rd | Easton, PA 18042
>>>> Office: 112 Skillman Library
>>>> p: (610) 330-5632
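
For anyone who finds this thread later: before restarting daemons, it can help to rule out a plain mismatch between gres.conf and slurm.conf. The sketch below is only an illustration and not anything Slurm ships. The helper names (parse_kv, count_devices, gres_counts) are made up for this example, and the parsing is deliberately simplified: it assumes a single GRES entry per node and a single bracketed range like [0-2] in File=. It compares the device count implied by File=, the Count= value, and the count in the node's Gres= string, using the two lines quoted in the original message.

```python
import re


def parse_kv(line):
    """Split a 'Key=Value Key=Value' config line into a dict with lowercase keys."""
    return {k.lower(): v for k, v in (tok.split("=", 1) for tok in line.split())}


def count_devices(file_spec):
    """Count device files named by a spec such as /dev/nvidia[0-2] or /dev/nvidia0.

    Simplification: only one 'low-high' range is handled, not comma lists.
    """
    match = re.search(r"\[(\d+)-(\d+)\]", file_spec)
    if match is None:
        return 1
    low, high = int(match.group(1)), int(match.group(2))
    return high - low + 1


def gres_counts(gres_line, node_line):
    """Return (devices from File=, Count= value, count from the node's Gres=)."""
    gres = parse_kv(gres_line)
    node = parse_kv(node_line)
    from_file = count_devices(gres["file"])
    # Fall back to the File= count when Count= is not given on the line.
    from_count = int(gres.get("count", from_file))
    # Gres=gpu:quadro_8000:3 -> the last colon-separated field is the count.
    from_node = int(node["gres"].rsplit(":", 1)[1])
    return from_file, from_count, from_node


# The two configuration lines quoted in the original message.
GRES_LINE = "NodeName=gpu01 File=/dev/nvidia[0-2] Type=quadro_8000 Name=gpu Count=3"
NODE_LINE = (
    "NodeName=gpu01 Sockets=2 CoresPerSocket=16 ThreadsPerCore=1 "
    "RealMemory=189900 State=UNKNOWN Gres=gpu:quadro_8000:3"
)

counts = gres_counts(GRES_LINE, NODE_LINE)
print(counts, "consistent" if len(set(counts)) == 1 else "MISMATCH")
```

If all three numbers agree, as they do for the lines above, but sinfo -o %G still reports the old count, then the files themselves are probably not the problem. That is consistent with what fixed it in this thread: restarting slurmctld after changing the NodeName line.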