Hey Graziano,

To make your decision more "data-driven", you can pipe your Slurm accounting logs into a tool like Open XDMoD, which will produce pie charts of usage broken down by user, group, job, GRES, etc.:
https://open.xdmod.org/8.0/index.html
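Incidentally, if you want raw numbers before standing up XDMoD, the data is already in the accounting database: "sreport cluster AccountUtilizationByUser start=2019-01-01 end=2019-03-21 -t hours" gives per-user core-hours in one shot. For a rough per-user CPU- and GPU-hours breakdown you can also parse sacct output with a few lines of Python. This is only a sketch, assuming slurmdbd accounting is enabled and GPUs are tracked as the gres/gpu TRES; the date range is just an example, and typed GRES (e.g. gres/gpu:v100) would need extra handling:

import subprocess
from collections import defaultdict

# -a: all users, -X: whole allocations only (no job steps),
# -P: parsable, pipe-delimited output with no trailing delimiter
out = subprocess.run(
    ["sacct", "-a", "-X", "-P", "--noheader",
     "-S", "2019-01-01", "-E", "2019-03-21",
     "--format=User,ElapsedRaw,AllocCPUS,AllocTRES"],
    check=True, capture_output=True, text=True,
).stdout

cpu_hours = defaultdict(float)
gpu_hours = defaultdict(float)

for line in out.splitlines():
    user, elapsed_raw, alloc_cpus, alloc_tres = line.split("|")
    if not user:
        continue  # skip records without a user
    hours = int(elapsed_raw or 0) / 3600.0  # ElapsedRaw is in seconds
    cpu_hours[user] += hours * int(alloc_cpus or 0)
    # AllocTRES looks like "billing=4,cpu=4,mem=16G,gres/gpu=2"
    for tres in alloc_tres.split(","):
        if tres.startswith("gres/gpu="):
            gpu_hours[user] += hours * int(tres.split("=", 1)[1])

for user in sorted(cpu_hours, key=cpu_hours.get, reverse=True):
    print(f"{user:<12} {cpu_hours[user]:10.1f} CPU-h {gpu_hours[user]:8.1f} GPU-h")

Cross-checking those numbers against what users actually requested (the ReqCPUS and ReqMem fields in sacct) is also a handy way to spot habitual over-asking before you budget for it.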
You may also consider assigning this task to one of your "machine learning" researchers and asking them to "predict" the resources needed. :)

Regards,
Alex

On Thu, Mar 21, 2019 at 8:48 AM Graziano D'Innocenzo <graziano.dinnoce...@adaptcentre.ie> wrote:
> Dear Slurm users,
>
> My team manages an HPC cluster (running Slurm) for a research
> centre. We are planning to expand the cluster over the next couple of
> years and are facing a problem. We would like to put a figure on
> how many resources will be needed on average per user (in terms
> of CPU cores, RAM, and GPUs), but we have almost one hundred researchers
> using the cluster for all sorts of different use cases, so there isn't
> a typical workload that we could take as a model. Most of the work is,
> however, in the field of machine learning and deep learning. Users range
> all the way from first-year PhD students with limited skills to
> researchers and professors with many years of experience.
> In principle we could use a mix of approaches: looking at current
> usage patterns, user surveys, etc.
>
> I was just wondering whether anyone here, working in a similar
> setting, has guidelines that they have been using for
> budgeting hardware purchases and would be willing to share?
>
> Many thanks and regards
>
>
>
> --
> Graziano D'Innocenzo (PGP key: 9213BE46)
> Systems Administrator - ADAPT Centre
>
>