Hi - Intel Cluster Checker person chiming in. 

To answer your question, Prentice, about the runtime of Cluster Checker (CLCK):
it depends on which set of tests, or framework definition (FWD), you use and on
the number of servers. The default FWD is health_base, which should run in a
matter of seconds. It was designed to run quickly and to serve as a sanity
check before running jobs. Other FWDs are designed for cluster hand-off and
validation, so they take much longer: they run a multitude of benchmarks on
individual nodes (stream/dgemm/sgemm/...) and across the cluster
(hpcg/hpl/pairwise imb/...), looking for outliers. These can take anywhere from
90+ minutes to multiple hours, depending on the system configuration and size.
There are also in-between tests, such as health_extended_user or
mpi_prereq_user.

A couple of tips: clck -X list is a great way to see which framework
definitions exist, and clck -X <name_of_fwd> gives more detail on what is
checked for a specific FWD.
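
As a rough sketch of those two tips, something like the following could be
pasted into a shell session. The run helper and the FWD variable are
illustrative names only, not part of Cluster Checker; the helper simply prints
the command instead of running it when clck is not on the PATH.

```shell
#!/bin/sh
# Sketch only: list the available framework definitions, then show the
# details of one of them. FWD and run() are hypothetical helper names.
FWD="${FWD:-health_base}"   # health_base is the default FWD mentioned above

# Run the command if clck is installed; otherwise just print what would run.
run() {
    if command -v clck >/dev/null 2>&1; then
        "$@"
    else
        echo "would run: $*"
    fi
}

run clck -X list      # which framework definitions exist
run clck -X "$FWD"    # what the chosen FWD actually checks
```

Setting FWD=health_extended_user (or any other name reported by the list
command) before running it would show the details for that FWD instead.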

Thanks for using Cluster Checker and providing feedback. Happy to help further.

-Brady

> -----Original Message-----
> From: Beowulf <beowulf-boun...@beowulf.org> On Behalf Of Michael Di Domenico
> Sent: Thursday, April 30, 2020 10:23
> Cc: Beowulf Mailing List <beowulf@beowulf.org>
> Subject: Re: [Beowulf] Intel Cluster Checker
> 
> i played with it about a year ago since i get it as part of the intel compiler
> bundle we pay for.  it was overly complicated to install and run and didn't
> seem worthwhile.  kind of like getting a piece of ikea furniture but then
> trying to use a phillips screwdriver to build it instead of the little wrench.
> otherwise when i dug into what it was actually doing, it didn't seem to be
> doing anything magical.  it was just doing it 'the intel way', which in my
> experience is generally very strange
> 
> 
> 
> On Wed, Apr 29, 2020 at 4:07 PM Prentice Bisbal via Beowulf
> <beowulf@beowulf.org> wrote:
> >
> > Beowulfers,
> >
> > Have any of you used the Intel Cluster Checker? I've been tasked with
> > using it, and I think I have it running, but the documentation isn't
> > very good. I was wondering how long a typical run on some cluster
> > nodes should take.
> >
> > Prentice
> >
> > _______________________________________________
> > Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin
> > Computing To change your subscription (digest mode or unsubscribe)
> > visit https://beowulf.org/cgi-bin/mailman/listinfo/beowulf