The problem with timeslicing is that when one job is preempted, its state needs to be stored somewhere so the next job can run. Since many HPC jobs are memory-intensive, using RAM for this is usually not an option, which leaves writing the state to disk. And since disk is many orders of magnitude slower than RAM, writing state to disk on every timeslice would ultimately reduce the throughput of the cluster. It's much more efficient to have one job "own" the nodes until it completes.
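To put rough numbers on the disk-vs-RAM gap (the memory footprint and bandwidths below are assumed round figures for illustration, not measurements):

    # Back-of-envelope cost of one "timeslice" context switch that has to
    # push the preempted job's state through disk.  All numbers are assumptions.
    state_gb = 192        # assumed per-node memory footprint of the preempted job
    disk_bw_gbps = 1.0    # assumed sustained write bandwidth to local disk, GB/s
    ram_bw_gbps = 20.0    # assumed in-memory copy bandwidth, GB/s

    print(state_gb / disk_bw_gbps)  # ~192 s to save the state to disk
    print(state_gb / ram_bw_gbps)   # ~10 s to stash it in RAM, if there were room

Add a restore of similar size on the other side of the switch, and each timeslice costs minutes of wall-clock time on every node involved, during which neither job makes progress.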

Yes, jobs do checkpointing, but I'm assuming the checkpointing isn't happening as frequently as your proposed timeslicing, and that checkpointing isn't writing the entire state to disk.

Prentice

On 1/17/20 12:35 AM, Lux, Jim (US 337K) via Beowulf wrote:

And I suppose there’s no equivalent of “timeslicing” where the cores run job A for 99% of the time and jobs B, C, D, E, F for 1% of the time.

*From: *Alex Chekholko <a...@calicolabs.com>
*Date: *Thursday, January 16, 2020 at 3:50 PM
*To: *Jim Lux <james.p....@jpl.nasa.gov>
*Cc: *"beowulf@beowulf.org" <beowulf@beowulf.org>
*Subject: *[EXTERNAL] Re: [Beowulf] Interactive vs batch, and schedulers

Hey Jim,

There is an inverse relationship between latency and throughput.  Most supercomputing centers aim to keep their overall utilization high, so the queue always needs to be full of jobs.

If you can have 1000 nodes always idle and available, then your 1000 node jobs will usually take 10 seconds.  But your overall utilization will be in the low single digit percent or worse.
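As a rough illustration, with a made-up arrival rate of six such jobs per hour:

    # Utilization of a pool of nodes held idle for fast turnaround.
    # The job size and arrival rate are made-up illustrative numbers.
    pool_nodes = 1000               # nodes kept free so short jobs start immediately
    job_node_seconds = 1000 * 10    # one 1000-node, 10-second job
    jobs_per_hour = 6               # assumed arrival rate of such jobs

    used = jobs_per_hour * job_node_seconds    # 60,000 node-seconds per hour
    available = pool_nodes * 3600              # 3,600,000 node-seconds per hour
    print(used / available)                    # ~0.017, i.e. under 2% utilization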

Regards,

Alex

On Thu, Jan 16, 2020 at 3:25 PM Lux, Jim (US 337K) via Beowulf <beowulf@beowulf.org> wrote:

    Are there any references out there that discuss the tradeoffs
    between interactive and batch scheduling (perhaps some from the
    60s and 70s?) –

    Most big HPC systems have a mix of giant jobs and smaller ones
    managed by some process like PBS or SLURM, with queues of
    various-sized jobs.

    What I’m interested in is the idea of jobs that, if spread across
    many nodes (dozens) can complete in seconds (<1 minute) providing
    essentially “interactive” access, in the context of large jobs
    taking days to complete.   It’s not clear to me that the current
    schedulers can actually do this – rather, they allocate M of N
    nodes to a particular job pulled out of a series of queues, and
    that job “owns” the nodes until it completes.  Smaller jobs get
    run on (M-1) of the N nodes, and presumably complete faster, so it
    works down through the queue quicker, but ultimately, if you have
    a job that would take, say, 10 seconds on 1000 nodes, it’s going
    to take 20 minutes on 10 nodes.

    Jim

_______________________________________________
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit 
https://beowulf.org/cgi-bin/mailman/listinfo/beowulf
