On 25 Apr 2007, at 8:42 am, Toon Knapen wrote:

> Interesting. However, this approach requires that the I/O profile of the application is known.

Absolutely.

> Additionally, it requires the users of the application (who are generally not IT guys) to know and understand this info and pass it on to the scheduler when they launch their app.

Absolutely.

> In your experience, do you manage to convince real-life users to provide this info?

Not easily.  :-)

And this is the problem with getting scheduling right, and exactly what we were saying at the beginning of this discussion. You can't hope to schedule optimally if the scheduler doesn't know the profile of the application; the more information it has, the better a job it will do. But if your users, like mine, can't or won't supply this information, then you're very limited in what you can achieve. Your system will also be vulnerable to denial of service: strange mixes of jobs starting on the machines will cause them to run out of various resources, and there is basically nothing you can do about it.

The compromise we ended up with is this set of LSF queues on our system (a cluster with about 1500 job slots):

QUEUE_NAME     PRIO STATUS       MAX  JL/U JL/P JL/H NJOBS  PEND   RUN  SUSP
yesterday       500 Open:Active  200    10    -    -     1     0     1     0
normal           30 Open:Active    -     -    -    -   281   110   171     0
hugemem          30 Open:Active    -     -    -    -     3     0     3     0
long              3 Open:Active    -     -    -    -  4022  2987  1035     0
basement          1 Open:Active  300   200    -    -   127     0   127     0

yesterday:

a special-purpose, high-priority queue for the "I need it yesterday" crowd. No run-time limits, but very limited in terms of how many slots each user can occupy.

normal:

queue intended for shortish jobs (around 1 hour). Absolute wall clock limit of 8 hours, after which jobs are killed.

long:

queue for longer jobs with an absolute wall clock limit of 24 hours.

hugemem:

special purpose queue for the two large memory SGI Altix nodes. Users submitting jobs to this queue *must* supply memory requirements; the submission is rejected if they do not (see the example submission just after this list).

basement:

queue for long running or low priority jobs. No time limits, but can't use more than a small fraction of the total cluster.
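
To make the hugemem requirement concrete, a valid submission has to look
something like this (hypothetical job name and sizes; remember that -M
takes KB while the select/rusage values take MB):

   bsub -q hugemem -M 16000000 -R'select[mem>16000] rusage[mem=16000]' ./my_big_job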

All the queues except hugemem also have a default memory limit of 1.9 GB; any job exceeding this limit is killed. Users who want to raise this limit can, up to 7.9 GB, but the same mechanism as on the hugemem queue then forces them to supply proper memory resource requirements.
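
For anyone wanting to set up something similar, here's a minimal sketch of
what queue definitions along these lines might look like in lsb.queues.
These are illustrative values, not our actual configuration; RUNLIMIT is
hours:minutes and MEMLIMIT is in KB:

   Begin Queue
   QUEUE_NAME   = normal
   PRIORITY     = 30
   RUNLIMIT     = 8:00        # jobs killed after 8 hours wall clock
   MEMLIMIT     = 1900000     # default memory limit, ~1.9 GB (KB units)
   DESCRIPTION  = shortish jobs, around an hour
   End Queue

   Begin Queue
   QUEUE_NAME   = yesterday
   PRIORITY     = 500
   QJOB_LIMIT   = 200         # MAX: total slots in the queue
   UJOB_LIMIT   = 10          # JL/U: slots per user
   MEMLIMIT     = 1900000
   DESCRIPTION  = high priority, strictly rationed per user
   End Queue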

Here's an example of what happens when a user sets their own memory limit but doesn't supply the resource requirements:

--- EXAMPLE ---
14:07:31 [EMAIL PROTECTED]:~$ bsub -M 6000000 uname -a
Job submission rejected.


You are specifying your own memory limit, so you must also supply
select[mem] and rusage[mem] resource requirement parameters.  For
example:

   -M2000000 -R'select[mem>2000] rusage[mem=2000]'

Remember that memory limits are set in KB, resource memory in MB.
Sorry about that.  Blame Platform.

If you do not understand what this means, read the lsfintro manpage and
the following web page:

http://www.wtgc.org/IT/ISG/lsf/lsf_intro.shtml#resources

If you still don't understand after that, contact ssg-isg(at) sanger.ac.uk

Request aborted by esub. Job not submitted.
--- EXAMPLE ---
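
The rejection above is done by an esub, LSF's job-submission filter hook.
Here's a minimal sketch of how such a check might be written, under some
assumptions on my part: that the submission parameters arrive as LSB_SUB_*
assignments in the file named by $LSB_SUB_PARM_FILE, that -M appears there
as LSB_SUB_RLIMIT_RSS, and that rejection is signalled by exiting with
$LSB_SUB_ABORT_VALUE (standard esub plumbing; the real script naturally
says rather more to the user):

   #!/bin/sh
   # esub sketch: refuse jobs that set -M without matching -R clauses.

   . $LSB_SUB_PARM_FILE     # pulls in LSB_SUB_RLIMIT_RSS, LSB_SUB_RES_REQ, ...

   if [ -n "$LSB_SUB_RLIMIT_RSS" ]; then    # user supplied -M
       ok=yes
       case "$LSB_SUB_RES_REQ" in *"select[mem"*)  ;; *) ok=no ;; esac
       case "$LSB_SUB_RES_REQ" in *"rusage[mem="*) ;; *) ok=no ;; esac
       if [ "$ok" = no ]; then
           # stderr is shown to the submitting user, as in the transcript above
           echo "You are specifying your own memory limit, so you must also" 1>&2
           echo "supply select[mem] and rusage[mem] resource requirements." 1>&2
           exit $LSB_SUB_ABORT_VALUE        # tells LSF to abort the submission
       fi
   fi
   exit 0

A submission that sets -M and supplies both clauses, like the -M2000000
example in the rejection message, sails straight through.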


All this is designed so that users who can't or won't supply detailed parameters to LSF can still submit work, but they are either limited in how many jobs they can run at once (in the yesterday and basement queues), or they run the risk of their jobs being killed if they go astray and use too much time or memory (in the normal and long queues).

Thus, it gives the users an incentive to understand their code and to use the cluster carefully and responsibly. Until we put the hard run limits in place, the cluster was being brought to its knees at least once a week by users simply being careless, which is why we eventually had to be somewhat more draconian. It's worked, though; the cluster has not had a similar DoS event since we put these rules in place.

Regards,

Tim
_______________________________________________
Beowulf mailing list, Beowulf@beowulf.org
To change your subscription (digest mode or unsubscribe) visit 
http://www.beowulf.org/mailman/listinfo/beowulf

Reply via email to