Hello, I used to be in HPC back when we built Beowulf clusters by hand ;) and wrote code in C/pthreads, PVM and MPI, and back when anyone could walk into fields like bioinformatics - all that was needed was a pulse, some C and Perl and a desire to do ;-). Then I left for the private sector and stumbled into "big data" some years later - I wrote a lot of code in Spark and Scala, worked on the infrastructure to support it, etc.

Then I went back (in 2017) to HPC. I was surprised to find that not much had changed - researchers and grad students still write code in MPI and C/C++, and maybe some Python or R for visualization or localized data analytics. I also noticed that it was not easy to "marry" things like big data with HPC clusters - tools like Spark/Hadoop do not really make the same underlying infrastructure assumptions as things like MPI/supercomputers. However, I find it wasteful for a university to run separate clusters to support a data science/big data load vs. traditional HPC.

I then stumbled upon languages like Julia - I like its approach: code is data, visualization is easy, decent ML/DS tooling. How does it fare on a traditional HPC cluster? Are people using it to replace their MPI loads? On the opposite side, has it caught up to Spark in terms of the quality of its DS/ML offering? In other words, can it be used, in one fell swoop, as a unifying substitute for both opposing approaches?

I realize that many people have already committed to certain tech/paradigms, but this is mostly educational debt (if MPI, or Spark on the other side, is working for me, why go to something different?) - is there anything substantial stopping new people with no such debt from starting out with a different approach (offerings like Julia)?

I do not have much experience with Julia (and hence may be barking up the wrong tree) - in that case I am wondering what people are doing to "marry" the loads of traditional HPC with "big data" as practiced by commercial/industry entities on a single underlying hardware offering. I know there are things like Twister2, but it is unclear to me (from a cursory examination) what it actually offers in the context of my questions above.

Any input, corrections, schooling me etc. are appreciated. Thank you!
_______________________________________________
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit https://beowulf.org/cgi-bin/mailman/listinfo/beowulf