So we figured out the problem with "slurmd -C": we had run rpmbuild on the POWER9 node, but did not have the hwloc package installed. The build process looks for it and, if it is not found, will apparently not use hwloc/lstopo even if it is installed post-build.
Now Slurm reports the expected topology for SMT4:

NodeName=enki13 CPUs=160 Boards=1 *SocketsPerBoard=2* *CoresPerSocket=20* *ThreadsPerCore=4* RealMemory=583992

Best,
Keith

> > 1.) Slurm seems to be incapable of recognizing sockets/cores/threads on
> > these systems.
> [...]
> > Anyone know if there is a way to get Slurm to recognize the true topology
> > for POWER nodes?
>
> IIRC Slurm uses hwloc for discovering topology, so "lstopo-no-graphics" might
> give you some insights into whether it's showing you the right config.
>
> I'd be curious to see what "lscpu" and "slurmd -C" say as well.

The biggest problem, as I see it, is that with two 20-core sockets and SMT2 set, the node looks like 80 single-core, single-thread sockets to Slurm (see the slurmd -C output below). With SMT4 set, it thinks there are 160 sockets.

NodeName=enki13 CPUs=80 Boards=1 SocketsPerBoard=80 CoresPerSocket=1 ThreadsPerCore=1 RealMemory=583992
UpTime=0-23:20:16
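
For anyone who wants to spot this symptom across several nodes, here is a minimal sketch that parses a "slurmd -C" line and flags the flat layout described above, where every CPU shows up as its own single-core, single-thread socket. The function names and the "flat" heuristic are my own assumptions for illustration, not anything Slurm provides, and the sample lines are the ones from this thread.

```python
import re


def parse_slurmd_c(line):
    """Parse a 'slurmd -C' style line of Key=Value pairs into a dict.

    Asterisks (plain-text emphasis, as in the mail above) are ignored.
    Purely numeric values are converted to int; everything else stays a string.
    """
    fields = dict(re.findall(r"(\w+)=([^\s*]+)", line))
    return {k: int(v) if v.isdigit() else v for k, v in fields.items()}


def cpu_product(node):
    """Boards * sockets * cores * threads, which should equal CPUs."""
    return (node["Boards"] * node["SocketsPerBoard"]
            * node["CoresPerSocket"] * node["ThreadsPerCore"])


def looks_flat(node):
    """Heuristic (an assumption): each CPU is reported as its own
    single-core, single-thread socket, the symptom seen before the rebuild."""
    return (node["CPUs"] > 1
            and node["SocketsPerBoard"] == node["CPUs"]
            and node["CoresPerSocket"] == 1
            and node["ThreadsPerCore"] == 1)


if __name__ == "__main__":
    samples = [
        "NodeName=enki13 CPUs=160 Boards=1 SocketsPerBoard=2 "
        "CoresPerSocket=20 ThreadsPerCore=4 RealMemory=583992",
        "NodeName=enki13 CPUs=80 Boards=1 SocketsPerBoard=80 "
        "CoresPerSocket=1 ThreadsPerCore=1 RealMemory=583992",
    ]
    for line in samples:
        node = parse_slurmd_c(line)
        print(node["NodeName"], "flat topology:", looks_flat(node),
              "product matches CPUs:", cpu_product(node) == node["CPUs"])
```

Note that the product check alone would not catch the problem (80 x 1 x 1 is still 80), which is why the sketch looks at the shape of the topology as well.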