In this case, I would run LINPACK on each generation of node (either the
full node or just one core), and then normalize the results. I would
recommend using the performance of a single core of the slowest node as
your basis for normalization, so that it has a multiplier of 1 and the
newer systems have multipliers greater than 1. You can then multiply
that per-core multiplier by the number of cores in your different
systems to get a final multiplier for a whole node, if needed.
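For example, the arithmetic I have in mind, with made-up HPL numbers and
core counts (substitute your own measurements):

    # Made-up per-core HPL (LINPACK) results in GFLOPS; real values would
    # come from your own runs on each node generation.
    gflops_per_core = {"sandy": 18.0, "haswell": 30.0, "skylake": 45.0}
    cores_per_node  = {"sandy": 16,   "haswell": 24,   "skylake": 36}

    base = gflops_per_core["sandy"]   # slowest core is the basis (multiplier 1.0)

    # Per-core multiplier relative to the slowest core.
    core_mult = {arch: g / base for arch, g in gflops_per_core.items()}

    # Whole-node multiplier, if you need one.
    node_mult = {arch: core_mult[arch] * cores_per_node[arch] for arch in core_mult}

    print(core_mult)   # sandy 1.0, haswell ~1.67, skylake 2.5
    print(node_mult)   # sandy 16.0, haswell 40.0, skylake 90.0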
Prentice
On 6/19/19 3:30 PM, Fulcomer, Samuel wrote:
(...and yes, the name is inspired by a certain OEM's software
licensing schemes...)
At Brown we run a ~400-node cluster containing nodes of multiple
architectures (Sandy/Ivy, Haswell/Broadwell, and Sky/Cascade),
purchased in some cases with University funds and in others with
investigator funding (~50:50). They all appear in the default SLURM
partition. We have three classes of SLURM users:
1. Exploratory - no-charge access to up to 16 cores
2. Priority - $750/quarter for access to up to 192 cores (and with a
GrpTRESRunMins=cpu limit). Each user has their own QoS
3. Condo - an investigator group who paid for nodes added to the
cluster. The group has its own QoS and SLURM Account. The QoS
allows use of the number of cores purchased and has a much higher
priority than the QoSes of the "priority" users.
The first problem with this scheme is that condo users who have
purchased the older hardware now have access to the newest without
penalty. In addition, we're encountering resistance to the idea of
turning off their hardware and terminating their condos (despite MOUs
stating a 5-year life). The pushback is the stated belief that the
hardware should run until it dies.
What I propose is a new TRES called a Processor Performance Unit (PPU)
that would be specified on the Node line in slurm.conf, and used such
that usage counted against a GrpTRES=ppu=N limit was calculated as the
number of allocated cores multiplied by their associated PPU values.
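As a rough sketch of the accounting I have in mind (the "ppu" TRES
doesn't exist in SLURM today, and the per-core weights below are made
up):

    # Hypothetical per-core PPU weights per architecture.
    ppu_per_core = {"sandy": 1.0, "haswell": 1.6, "skylake": 2.2}

    def job_ppu(alloc):
        """alloc maps architecture -> cores allocated on that architecture."""
        return sum(cores * ppu_per_core[arch] for arch, cores in alloc.items())

    # A 64-core job split across generations would count as:
    print(job_ppu({"sandy": 32, "skylake": 32}))   # 32*1.0 + 32*2.2 = 102.4 ppu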
We could then assign a base PPU to the oldest hardware, say "1" for
Sandy/Ivy, and increase it for later architectures based on their
performance improvement. We'd set the condo QoS to
GrpTRES=ppu=N*X+M*Y,..., where N is the number of cores of the oldest
architecture, X is the configured PPU per core for that architecture,
and M*Y (and so on) repeats the calculation for any newer nodes/cores
the investigator has purchased since.
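For a hypothetical condo that bought 128 Sandy cores and later added 64
Skylake cores (illustrative counts and weights only):

    # N Sandy cores at PPU X, plus a later purchase of M Skylake cores at PPU Y.
    N, X = 128, 1.0
    M, Y = 64, 2.2
    grptres_ppu = N * X + M * Y
    print(grptres_ppu)   # 128*1.0 + 64*2.2 = 268.8

    # 268.8 ppu covers all 192 purchased cores when running on the original
    # mix, but only ~122 cores (268.8 / 2.2) if the group runs entirely on
    # the newest Skylake nodes.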
The result is that the investigator group gets to run on an
approximation of the performance that they've purchased, rather than on
the raw purchased core count.
Thoughts?