Doug, thank you for taking the time! Your Julia comments are in line with my impression of it, hence the initial question I posed in this thread. Thank you for all your insights.
On Tue, Oct 13, 2020 at 5:03 PM Douglas Eadline <[email protected]> wrote:

> > On Tue, Oct 13, 2020 at 3:54 PM Douglas Eadline <[email protected]>
> > wrote:
> >
> >> It really depends on what you need to do with Hadoop or Spark.
> >> IMO many organizations don't have enough data to justify
> >> standing up a 16-24 node cluster system with a PB of HDFS.
> >
> > Excellent. If I understand what you are saying, there is simply no
> > demand to mix technologies, esp. in the academic world. OK. In your
> > opinion and independent of Spark/HDFS discussion, why are we still
> > only on openMPI in the world of writing distributed code on HPC
> > clusters? Why is there nothing else gaining any significant traction?
> > No innovation in exposing higher level abstractions and hiding the
> > details and making it easier to write correct code that is easier to
> > reason about and does not burden the writer with too much low-level
> > detail. Is it just the amount of investment in an existing knowledge
> > base? Is it that there is nothing out there to compel people to spend
> > the time on it to learn it? Or is there nothing there? Or maybe there
> > is and I am just blissfully unaware? :)
>
> I have been involved in HPC and parallel computing since the 1980s.
> Prior to MPI every vendor had a message passing library. Initially
> PVM (Parallel Virtual Machine) from Oak Ridge was developed so there
> would be some standard API to create parallel codes. It worked well
> but needed more. MPI was developed so parallel hardware vendors
> (not many back then) could standardize on a messaging framework
> for HPC. Since then, not a lot has pushed the needle forward.
>
> Of course there are things like OpenMP, but these are not distributed
> tools.
>
> Another issue is the difference between "concurrent code" and
> parallel execution. Not everything that is concurrent needs
> to be executed in parallel and indeed, depending on
> the hardware environment you are targeting, these decisions
> may change. And, it is not something you can figure out by
> looking at the code.
>
> Parallel computing is a hard problem and no one has
> really come up with a general purpose way to write software.
> MPI works, however I still consider it a "parallel machine code"
> that requires some careful programming.
>
> The good news is most of the popular HPC applications
> have been ported and will run using MPI (as best as their algorithm
> allows). So from an end user perspective, most everything
> works. Of course there could be more applications ported
> to MPI but it all depends. Maybe end users can get enough
> performance with a CUDA version and some GPUs or an
> OpenMP version on a 64-core server.
>
> Thus the incentive is not really there. There is no huge financial
> push behind HPC software tools like there is with data analytics.
>
> Personally, I like Julia and believe it is the best new language
> to enter technical computing. One of the issues it addresses is
> the two language problem. The first cut of something is often written
> in Python, then if it gets to production and is slow and does
> not have an easy parallel pathway (local multi-core or distributed),
> the code is rewritten in C/C++ or Fortran with MPI, CUDA, OpenMP.
>
> Julia is fast out of the box and provides a path for
> parallel growth. One version with no need to rewrite. Plus,
> it has something called "multiple dispatch" that provides
> unprecedented code flexibility and portability. (too long a
> discussion for this email) Basically it keeps the end user closer
> to their "problem" and further away from the hardware minutiae.
>
> That is enough for now. I'm sure others have opinions worth
> hearing.
>
> --
> Doug
>
> > Thanks!
>
> --
> Doug
_______________________________________________
Beowulf mailing list, [email protected] sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit https://beowulf.org/cgi-bin/mailman/listinfo/beowulf
