And now, a few hours later - with no changes made - everyone has the same fairshare?
$ sshare -l -a
Account User RawShares NormShares RawUsage NormUsage EffectvUsage FairShare GrpTRESMins TRESRunMins
-------------------- ---------- ---------- ----------- ----------- ----------- ------------- ---------- ------------------------------ ------------------------------
root 0.000000 63235972 0.000000 1.000000 cpu=188835,mem=1546941371,ene+
root root 1 0.008264 0 0.000000 0.000000 1.000000 cpu=0,mem=0,energy=0,node=0,b+
mic 120 0.991736 63235972 1.000000 1.000000 0.497120 cpu=188835,mem=1546941371,ene+
mic aamedina parent 0.991736 2351906 0.037193 1.000000 0.497120 cpu=0,mem=0,energy=0,node=0,b+
mic aaruldass parent 0.991736 0 0.000000 1.000000 0.497120 cpu=0,mem=0,energy=0,node=0,b+
mic acataldo parent 0.991736 14637614 0.231476 1.000000 0.497120 cpu=188031,mem=1540350361,ene+
mic achowdhury parent 0.991736 0 0.000000 1.000000 0.497120 cpu=0,mem=0,energy=0,node=0,b+
mic ajajoo parent 0.991736 2053441 0.032473 1.000000 0.497120 cpu=0,mem=0,energy=0,node=0,b+
mic ajanes parent 0.991736 0 0.000000 1.000000 0.497120 cpu=0,mem=0,energy=0,node=0,b+
mic amandacao parent 0.991736 200 0.000003 1.000000 0.497120 cpu=0,mem=0,energy=0,node=0,b+
mic aromer parent 0.991736 0 0.000000 1.000000 0.497120 cpu=0,mem=0,energy=0,node=0,b+
mic aweerasek+ parent 0.991736 1048 0.000017 1.000000 0.497120 cpu=0,mem=0,energy=0,node=0,b+
mic batwood parent 0.991736 0 0.000000 1.000000 0.497120 cpu=0,mem=0,energy=0,node=0,b+
mic bleng parent 0.991736 3 0.000000 1.000000 0.497120 cpu=0,mem=0,energy=0,node=0,b+
mic cdemirlek parent 0.991736 6110 0.000097 1.000000 0.497120 cpu=0,mem=0,energy=0,node=0,b+
mic chun parent 0.991736 0 0.000000 1.000000 0.497120 cpu=0,mem=0,energy=0,node=0,b+

I am so confused.

On Aug 10, 2024, at 8:11 AM, Drucker, Daniel <ddruc...@mclean.harvard.edu> wrote:

Hmm, no. That solved the problem of everyone having the same FairShare, but even after restarting slurmd and doing reconfigure, if I submit a job as someone with a huge usage and someone with zero usage, they both end up with the same Priority.

On Aug 10, 2024, at 8:05 AM, Daniel M. Drucker <ddruc...@mclean.harvard.edu> wrote:

I just set PriorityFlags=NO_FAIR_TREE and this seems to have solved the problem!

On Aug 10, 2024, at 7:45 AM, Drucker, Daniel <ddruc...@mclean.harvard.edu> wrote:

According to https://docs.rc.fas.harvard.edu/kb/fairshare/ and https://slurm.schedmd.com/SUG14/fair_tree.pdf :

"The Fairshare score is calculated using the following formula. f = 2^(-EffectvUsage/NormShares)"

This is clearly not happening on my system:

Account User RawShares NormShares RawUsage NormUsage EffectvUsage FairShare LevelFS GrpTRESMins TRESRunMins
-------------------- ---------- ---------- ----------- ----------- ----------- ------------- ---------- ---------- ------------------------------ ------------------------------
...
mic acataldo parent 0.991736 13066208 0.210193 0.210193 0.983871 cpu=169648,mem=1389757781,ene+
mic achowdhury parent 0.991736 0 0.000000 0.000000 0.983871 cpu=0,mem=0,energy=0,node=0,b+
...

Every user has 0.991736 NormShares.
Acataldo has EffectvUsage = 0.210193
Achowdhury has EffectvUsage = 0
But both users have the same FairShare. The correct values according to the above formula would be 0.863 and 1.0 respectively.

So what's going on?

On Aug 10, 2024, at 7:36 AM, Daniel M. Drucker <ddruc...@mclean.harvard.edu> wrote:

Here is what is confusing me I guess. Look at the below. You can see that some people have no usage and some people have a lot of usage. But their FairShare value is all identical.

https://lists.schedmd.com/mailman3/hyperkitty/list/slurm-users@lists.schedmd.com/thread/I53OEJSNBT2BMXYVFEFHQQKKAHIUYA53/ seems to say that fairshare=parent should work just fine, but what I am seeing is that it is NOT altering people's FairShare?
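
For anyone who wants to check the arithmetic, here is a minimal Python sketch that evaluates the formula quoted in this thread, f = 2^(-EffectvUsage/NormShares), for the values in the sshare output. The function name `classic_fairshare` is made up for illustration, and the sketch only evaluates the quoted formula; it is not Slurm's own code and does not model the Fair Tree algorithm.

```python
def classic_fairshare(effective_usage: float, norm_shares: float) -> float:
    """Evaluate f = 2^(-EffectvUsage/NormShares) as quoted in the thread.

    Hypothetical helper for illustration only; not part of Slurm.
    """
    if norm_shares <= 0:
        raise ValueError("norm_shares must be positive")
    return 2.0 ** (-effective_usage / norm_shares)


# NormShares reported for every user under the "mic" account
NORM_SHARES = 0.991736

# (label, EffectvUsage) pairs taken from the sshare output in this thread
cases = [
    ("acataldo", 0.210193),
    ("achowdhury", 0.0),
    ("any user with EffectvUsage = 1.0", 1.0),
]

for label, usage in cases:
    print(f"{label}: {classic_fairshare(usage, NORM_SHARES):.6f}")
```

The first two cases give roughly 0.863 and 1.0, as stated above. The third gives roughly 0.497120, which is the FairShare value every user shows in the later `sshare -l -a` output, where EffectvUsage is 1.000000 for all of them. That suggests the later output is consistent with the formula, and that the identical FairShare values there follow from the identical EffectvUsage.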
-- slurm-users mailing list -- slurm-users@lists.schedmd.com To unsubscribe send an email to slurm-users-le...@lists.schedmd.com