Hello,

A few of our users have asked about running longer jobs on our cluster.
Currently our main/default compute partition has a time limit of 2.5 days.
A handful of users may need jobs to run for up to 5 days. Rather than
allow all users/jobs a run time limit of 5 days, I wondered if the
following scheme makes sense:


Increase the max run time on the default partition to 5 days, but limit
most users to a max of 2.5 days using the default "normal" QOS.


Create a QOS called "long" with a max time limit of 5 days. Limit the users who
can use "long". For authorized users, assign the "long" QOS to their jobs on the
basis of the run time requested.
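
To make the intended routing concrete, here is a minimal sketch of the decision
in plain Python. It is only a model of the policy described above, not Slurm
code: the function name choose_qos, the long_users set and the example user
names are hypothetical, and the real logic would live in whatever submit-time
mechanism the site uses.

```python
from datetime import timedelta

# Limits taken from the scheme above.
NORMAL_LIMIT = timedelta(days=2, hours=12)  # 2.5 days, "normal" QOS
LONG_LIMIT = timedelta(days=5)              # 5 days, "long" QOS


def choose_qos(user, requested, long_users):
    """Return the QOS name for a job based on its requested run time.

    user        -- submitting user name (hypothetical example values)
    requested   -- requested run time as a timedelta
    long_users  -- set of users authorized for the "long" QOS
    Raises ValueError when the request exceeds what the user may run.
    """
    if requested <= NORMAL_LIMIT:
        # Short enough for everyone, authorized or not.
        return "normal"
    if requested <= LONG_LIMIT and user in long_users:
        # Longer than 2.5 days: only authorized users are routed to "long".
        return "long"
    raise ValueError(f"{user} may not request a run time of {requested}")


if __name__ == "__main__":
    authorized = {"alice"}  # hypothetical authorized user
    print(choose_qos("alice", timedelta(days=4), authorized))
    print(choose_qos("bob", timedelta(days=1), authorized))
```

Under this model an authorized user still gets "normal" for short jobs, so
"long" is only consumed when the requested run time actually needs it.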


Does the above make sense, or is it too complicated? If the above works, could
users limited to the normal QOS have the time limit of a running job
increased to 5 days in exceptional circumstances?


I would be interested in your thoughts, please.


Best regards,

David
