Re: [Beowulf] [External] Spark, Julia, OpenMPI etc. - all in one place

Michael Di Domenico Tue, 13 Oct 2020 06:11:21 -0700

On Tue, Oct 13, 2020 at 8:39 AM Oddo Da <oddodao...@gmail.com> wrote:
>
> Michael, thank you for the insight. I think Hadoop in general is mostly 
> dying, Spark is really the derivative that took off. Basically, what you are 
> saying is that there is no demand on your infra for this kind of work. Do you 
> have any insights as to why not? Do the AI/DS/ML guys just know that they 
> cannot use your resources to run standard loads and go straight to the cloud 
> or local ethernet clusters?


Part of the reason it didn't take off is that we're just not a
bigdata shop, and doing math inside the hadoop world was hard.  some of
what we do does revolve around parsing through large swaths of data,
but once that 'grep' was done the users wanted to do some complex
math on the data, and either hadoop/java didn't have the right
libraries or people had to learn java, which they weren't willing to
do.  the abstraction languages like pig (and others whose names i
forget) made things a little easier, but overall it was just too
complicated.  and frankly i think this is exactly what you're seeing.
outside of 'industry' aka 'internet world' the 'hadoop architecture'
really doesn't have much utility, and mpi and its ilk really are
better suited.  whether julia can/should displace traditional C/mpi,
who knows.

> In your estimate, how many of your users write code in Julia vs MPI vs Python?

it varies.  mostly it depends on the person working on the project.  i
try to support everything across the entire center compute
infrastructure, but we leave it up to the user to figure out how best
to scale their program to the machines we have.  Even though we have
primarily a traditional HPC setup, there's nothing we can't run, from
AI to CUDA to C/Fortran MPI or even just simple python programs.  We
still run 'bigdata' programs; it's just that the users have found
other ways to do it that don't require hadoop.
_______________________________________________
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit 
https://beowulf.org/cgi-bin/mailman/listinfo/beowulf
