HPC and desktop applications have different characteristics, so an optimization that is good for an HPC workload would probably be very bad for a desktop workload.

For example, the Compute Node Kernel (CNK) used on IBM Blue Gene systems is a minimal *single-tasking* operating system that supports only a subset of the Linux system call API (CNK is *not* Linux, but duplicates its API to make it easier to port codes to the Blue Gene). This is because a Blue Gene node should only be running a single application at a time, and the missing system calls are not really needed by HPC applications. By not providing multitasking or those non-essential system calls, IBM was able to minimize the size and overhead of CNK to optimize HPC task performance.

Now imagine a desktop that could run only one application at a time...

What makes you think that OS and hardware vendors haven't already improved communication between cores and improved thread management? When AMD released HyperTransport, that was a huge leap forward in interprocessor and processor-memory communication that benefited HPC but could also be found in a decent desktop system. Then Intel did the same with QuickPath Interconnect (QPI). Is that more along the lines of what you're getting at? Both of those were big improvements, which I believe trickled down to desktop systems. (I know HT did; I'm not as familiar with Intel desktop processors.)

Prentice

On 3/15/19 3:42 PM, Jonathan Aquilina wrote:
I think what I'm after is more: can't it be adapted to a situation such as
everyday desktops, to potentially improve communication between cores and
maybe manage the threads?

On 15/03/2019, 17:28, "Prentice Bisbal" <pbis...@pppl.gov> wrote:

     We are definitely going that way, but for everyday desktops, MPI is not
     the way to go. Since most desktops are stand-alone islands,
     multi-threading makes more sense, since it has less overhead compared to
     MPI, and most desktop apps don't need the inter-node communications
     provided by MPI.
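To make the threads-vs-MPI overhead point concrete, here is a toy Python sketch (my illustration, not from the thread, standard library only; a `Pipe` stands in for MPI-style message passing and is not actual MPI): threads can read shared data in place, while message passing must serialize and copy the data before the other side can touch it.

```python
import threading
from multiprocessing import Pipe

data = list(range(1000))  # small payload, well under the pipe buffer limit

def threaded_sum(nthreads=4):
    # Shared memory: each thread sums its slice of `data` in place; no copy.
    # (Assumes nthreads evenly divides len(data), true for this toy input.)
    results = [0] * nthreads
    chunk = len(data) // nthreads

    def work(i):
        results[i] = sum(data[i * chunk:(i + 1) * chunk])

    threads = [threading.Thread(target=work, args=(i,)) for i in range(nthreads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sum(results)

def message_passing_sum():
    # Message passing: the payload is pickled and copied through the pipe
    # before the receiving "rank" can work on it -- overhead a thread avoids.
    parent, child = Pipe()
    parent.send(data)        # serializes and copies the whole list
    received = child.recv()  # the other side gets its own private copy
    return sum(received)

print(threaded_sum(), message_passing_sum())  # both 499500
```

Both paths compute the same answer; the difference is that the message-passing version pays for serialization and copying, which is pure waste when everything lives on one desktop anyway.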
--
     Prentice
On 3/15/19 1:28 AM, Jonathan Aquilina wrote:
     > I think what I was getting at is: why not bring current HPC practices
     > to everyday desktops, since we are reaching certain limits and have to
     > write code to take advantage of more and more cores? Why not use MPI
     > and the like to help distribute the software side of things across
     > the cores?
     >
     > It could be that my entire concept of MPI is way off, or I'm
     > misunderstanding completely how it works.
     >
     > -----Original Message-----
     > From: Beowulf <beowulf-boun...@beowulf.org> On Behalf Of Prentice
     > Bisbal via Beowulf
     > Sent: 14 March 2019 20:01
     > To: beowulf@beowulf.org
     > Subject: Re: [Beowulf] Large amounts of data to store and process
     >
     >> Then given we are reaching these limitations, how come we don't
     >> integrate certain things from the HPC world into everyday computing,
     >> so to speak?
     > We have. How many cores does your smartphone have?
     >
     > But in most cases over the past 25 years, HPC has been about
     > incorporating everyday computing into HPC, not the other way around.
     > For example, the first Beowulf clusters took desktop PCs and standard
     > networking to build supercomputers in the early/mid-90s. Then 10 years
     > later, in the early/mid-00s, HPC took GPUs and started doing
     > programming gymnastics to get the vector processors on the GPUs to do
     > physics calculations.
     >
     > --
     > Prentice
     >
     > On 3/14/19 2:35 PM, Jonathan Aquilina wrote:
     >> Then given we are reaching these limitations, how come we don't
     >> integrate certain things from the HPC world into everyday computing,
     >> so to speak?
     >>
     >> On 14/03/2019, 19:14, "Douglas Eadline" <deadl...@eadline.org> wrote:
     >>
     >>
     >>       > Hi Douglas,
     >>       >
     >>       > Isn't there quantum computing being developed, in terms of
     >>       > CPUs, at this point?
     >>
     >>       QC is (theoretically) unreasonably good at some things; at
     >>       others, there may be classical algorithms that work better. As
     >>       far as I know, there has been no demonstration of "quantum
     >>       supremacy," where a quantum computer is shown to be faster than
     >>       a classical algorithm.
     >>
     >>       Getting there, not there yet.
     >>
     >>       BTW, if you want to know what is going on with QC
     >>       read Scott Aaronson's blog
     >>
     >>       https://www.scottaaronson.com/blog/
     >>
     >>       I usually get through the first few paragraphs and
     >>       then whoosh over my scientific pay grade
     >>
     >>
     >>       > Also is it really about the speed any more, rather than how
     >>       > optimized the code is to take advantage of the multiple cores
     >>       > that a system has?
     >>
     >>       That is because the clock rate increase slowed to a crawl.
     >>       Adding cores was a way to "offer" more performance, but it
     >>       introduced the "multi-core tax." That is, programming for
     >>       multi-core is harder and costlier than for a single core. It is
     >>       also much harder to optimize. In HPC we are lucky: we are used
     >>       to designing MPI codes that scale with more cores (no matter
     >>       where they live: same die, next socket, another server).
     >>
     >>       Also, more cores usually means a lower single-core frequency to
     >>       fit into a given power envelope (die shrinks help with this,
     >>       but based on everything I have read, we are about at the end of
     >>       the line). It also means lower absolute memory BW per core,
     >>       although more memory channels help a bit.
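As a back-of-the-envelope illustration (mine, using the standard Amdahl's law model rather than anything cited in this thread): if only a fraction p of a program's runtime parallelizes, n cores give a speedup of 1/((1-p) + p/n), which is why codes that actually scale, like well-designed MPI codes, matter so much as core counts climb.

```python
# Amdahl's law: speedup on n cores when a fraction p of the runtime
# parallelizes perfectly and the remaining (1 - p) stays serial.
def amdahl_speedup(p, n):
    return 1.0 / ((1.0 - p) + p / n)

# 95% parallel code on 64 cores: nowhere near a 64x speedup.
print(round(amdahl_speedup(0.95, 64), 1))   # 15.4
# Even with effectively unlimited cores it caps at 1/(1 - p) = 20x.
print(round(amdahl_speedup(0.95, 10**9)))   # 20
# A well-scaled code (99.9% parallel) keeps most of the 64 cores busy.
print(round(amdahl_speedup(0.999, 64), 1))  # 60.2
```

The gap between 15.4x and 60.2x on the same 64 cores is the multi-core tax in one number: paying it means doing the hard work of driving the serial fraction down.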
     >>
     >>       --
     >>       Doug
     >>
     >>
     >>       >
     >>       > On 13/03/2019, 22:22, "Douglas Eadline" <deadl...@eadline.org>
     >>       > wrote:
     >>       >
     >>       >
     >>       >     I realize it is bad form to reply to one's own post, and
     >>       >     I forgot to mention something.
     >>       >
     >>       >     Basically, the HW performance parade is getting harder
     >>       >     to celebrate. Clock frequencies have been increasing
     >>       >     slowly while cores are multiplying rather quickly.
     >>       >     Single-core performance boosts are mostly coming
     >>       >     from accelerators. Add to that the fact that speculation
     >>       >     technology, when managed for security, slows things down.
     >>       >
     >>       >     What this means is that the focus on software performance
     >>       >     and optimization is going to increase, because we can't
     >>       >     just buy new hardware and improve things anymore.
     >>       >
     >>       >     I believe languages like Julia can help with this situation.
     >>       >     For a while.
     >>       >
     >>       >     --
     >>       >     Doug
     >>       >
     >>       >     >> Hi All,
     >>       >     >> Basically I have sat down with my colleague and we
     >>       >     >> have opted to go down the route of Julia with JuliaDB
     >>       >     >> for this project. But here is an interesting thought I
     >>       >     >> have been pondering: if Julia is an up-and-coming fast
     >>       >     >> language for working with large amounts of data, how
     >>       >     >> will that affect HPC, the way it is currently used,
     >>       >     >> and how HPC systems are created?
     >>       >     >
     >>       >     >
     >>       >     > First, IMO good choice.
     >>       >     >
     >>       >     > Second, a short list of actual conversations:
     >>       >     >
     >>       >     > 1) "This code is written in Fortran." I have been met
     >>       >     > with puzzled looks when I say the word "Fortran." Then
     >>       >     > it comes: "... ancient language, why not port to modern
     >>       >     > ..." If you are asking that question, young Padawan,
     >>       >     > you have much to learn; maybe try web pages.
     >>       >     >
     >>       >     > 2) "I'll just use Python because it works on my laptop."
     >>       >     > Later, "It will just run faster on a cluster, right?"
     >>       >     > and "My little Python program is now kind-of big and has
     >>       >     > become slow, should I use TensorFlow?"
     >>       >     >
     >>       >     > <mccoy>
     >>       >     > "Dammit Jim, I don't want to learn/write Fortran, C,
     >>       >     > C++ and MPI. I'm a (fill in domain-specific
     >>       >     > scientific/technical position)"
     >>       >     > </mccoy>
     >>       >     >
     >>       >     > My reply: "I agree, and I wish there were a better
     >>       >     > answer to that question. The computing industry has
     >>       >     > made great strides in HW with multi-core, clusters,
     >>       >     > etc. Software tools have always lagged hardware. In the
     >>       >     > case of HPC it is a slow process, and in HPC the whole
     >>       >     > programming 'thing' is not as 'easy' as it is in other
     >>       >     > sectors; warp drives and transporters take a little
     >>       >     > extra effort."
     >>       >     >
     >>       >     > 4) Then I suggest Julia: "I invite you to try Julia. It
     >>       >     > is easy to get started with, fast, and can grow with
     >>       >     > your application." Then I might say, "In a way it is
     >>       >     > HPC BASIC; if you are old enough you will understand
     >>       >     > what I mean by that."
     >>       >     >
     >>       >     > The question with languages like Julia (or Chapel,
     >>       >     > etc.) is:
     >>       >     >
     >>       >     >   "How much performance are you willing to give up for
     >>       >     >    convenience?"
     >>       >     >
     >>       >     > The goal is to keep the programmer close to the problem
     >>       >     > at hand and away from the nuances of the underlying
     >>       >     > hardware. Obviously, the more performance needed, the
     >>       >     > closer you need to get to the hardware. This decision
     >>       >     > goes beyond software tools; there are all kinds of
     >>       >     > cost/benefit trade-offs that need to be considered.
     >>       >     > And then there is I/O ...
     >>       >     >
     >>       >     > --
     >>       >     > Doug
     >>       >     >
     >>       >     >
     >>       >     >
     >>       >     >
     >>       >     >
     >>       >     >
     >>       >     >
     >>       >     >> Regards,
     >>       >     >> Jonathan
     >>       >     >> -----Original Message-----
     >>       >     >> From: Beowulf <beowulf-boun...@beowulf.org> On Behalf
     >>       >     >> Of Michael Di Domenico
     >>       >     >> Sent: 04 March 2019 17:39
     >>       >     >> Cc: Beowulf Mailing List <beowulf@beowulf.org>
     >>       >     >> Subject: Re: [Beowulf] Large amounts of data to store
     >>       >     >> and process
     >>       >     >>
     >>       >     >> On Mon, Mar 4, 2019 at 8:18 AM Jonathan Aquilina
     >>       >     >> <jaquil...@eagleeyet.net> wrote:
     >>       >     >>> As previously mentioned, we don't really need to have
     >>       >     >>> anything indexed, so I am thinking flat files are the
     >>       >     >>> way to go; my only concern is the performance of
     >>       >     >>> large flat files.
     >>       >     >> Potentially. There are many factors in the work flow
     >>       >     >> that ultimately influence the decision, as others have
     >>       >     >> pointed out. My flat-file example is only one, where
     >>       >     >> we just repeatedly blow through the files.
     >>       >     >>> Isn't that what HDFS is for, to deal with large flat
     >>       >     >>> files?
     >>       >     >> Large is relative. A 256 GB file isn't "large"
     >>       >     >> anymore. I've pushed TB files through Hadoop and run
     >>       >     >> the terabyte sort benchmark, and yes, it can be done
     >>       >     >> in minutes (time-scale), but you need an astounding
     >>       >     >> amount of hardware to do it (in the last benchmark
     >>       >     >> paper I saw, it was something like 1000 nodes). You
     >>       >     >> can accomplish the same feat using less, and less
     >>       >     >> complicated, hardware/software. And if your devs
     >>       >     >> aren't willing to adapt to the Hadoop ecosystem,
     >>       >     >> you're sunk right off the dock.
     >>       >     >> To get a more targeted answer from the numerous smart
     >>       >     >> people on the list, you'd need to open up the app and
     >>       >     >> workflow to us. There are just too many variables.
     >>       >     >> _______________________________________________
     >>       >     >> Beowulf mailing list, Beowulf@beowulf.org sponsored by
     >>       >     >> Penguin Computing. To change your subscription (digest
     >>       >     >> mode or unsubscribe) visit
     >>       >     >> http://www.beowulf.org/mailman/listinfo/beowulf
visit
     >>       >     >> http://www.beowulf.org/mailman/listinfo/beowulf
     >>       >     >
     >>       >     >
     >>       >     > --
     >>       >     > Doug
     >>       >     >
     >>       >     >
     >>       >     >
     >>       >     >
     >>       >     >
     >>       >
     >>       >
     >>       >     --
     >>       >     Doug
     >>       >
     >>       >
     >>       >
     >>       >
     >>
     >>
     >>       --
     >>       Doug
     >>
     >>
     >>
     > --
     > Prentice Bisbal
     > Lead Software Engineer
     > Princeton Plasma Physics Laboratory
     > https://www.pppl.gov
     >

