Re: [Beowulf] Clustering vs Hadoop/spark

Jonathan Aquilina via Beowulf Tue, 24 Nov 2020 00:20:06 -0800

Hi Ben,

Re-added the list.

I think where I'm confused is this: isn't that what Hadoop/Spark does? It 
distributes the data for computation, then aggregates the results back into a 
single data set.

Correct me if I am wrong here. 
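
To make the pattern I mean concrete, here is a minimal sketch in plain Python 
(not the actual Hadoop or Spark API): split the data, compute on each piece, 
then merge the partial results. The names partition, count_words, merge_counts 
and word_count are hypothetical and only for illustration.

```python
from collections import Counter
from functools import reduce


def partition(records, n_parts):
    """Split records into n_parts chunks, assigning items round-robin."""
    return [records[i::n_parts] for i in range(n_parts)]


def count_words(chunk):
    """'Map' step: compute a partial word count for one chunk."""
    counts = Counter()
    for line in chunk:
        counts.update(line.split())
    return counts


def merge_counts(a, b):
    """'Reduce' step: combine two partial results into one."""
    return a + b


def word_count(records, n_parts=4):
    chunks = partition(records, n_parts)
    # Here the chunks are processed one after another in a single process;
    # a framework would hand each chunk to a different worker or node.
    partials = map(count_words, chunks)
    return reduce(merge_counts, partials, Counter())
```

Calling word_count(["a b a", "b c", "a"], n_parts=2) gives one combined count 
for the whole input, regardless of how the lines were split up.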

Another thing I can't understand is how a Java-based platform manages to get 
such good performance when crunching large data sets for big data analytics.

Regards,
Jonathan

-----Original Message-----
From: Benjamin Redling <benjamin.ra...@uni-jena.de> 
Sent: 24 November 2020 09:03
To: Jonathan Aquilina <jaquil...@eagleeyet.net>
Subject: Re: [Beowulf] Clustering vs Hadoop/spark

Hello Jonathan,

On 24/11/2020 06.22, Jonathan Aquilina via Beowulf wrote:
> I am just wondering what advantages does setting up of a cluster have 
> in relation to big data analytics vs using something like Hadoop/spark?

Can you distribute any application without programming against a framework?

We distribute a lot of data parallel tasks with the source code unchanged via 
SLURM.
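
One common way to do this is a SLURM job array: the scheduler starts the same 
program once per array task, and each task works on its own share of the 
inputs. The sketch below is an assumption about how such a setup might look, 
not our actual scripts. It uses the SLURM_ARRAY_TASK_ID, SLURM_ARRAY_TASK_MIN 
and SLURM_ARRAY_TASK_COUNT environment variables that SLURM sets for array 
jobs, and assumes a contiguous array range with step 1. The function names and 
file names are hypothetical.

```python
import os


def my_shard(inputs, index, task_count):
    """Return every task_count-th input, starting at this task's index."""
    return inputs[index::task_count]


def shard_from_env(inputs, env=os.environ):
    """Pick this task's inputs from the SLURM job-array environment.

    Falls back to a single task (all inputs) when run outside SLURM.
    Assumes the array range is contiguous, e.g. --array=0-3 or --array=1-4.
    """
    task_id = int(env.get("SLURM_ARRAY_TASK_ID", "0"))
    task_min = int(env.get("SLURM_ARRAY_TASK_MIN", "0"))
    task_count = int(env.get("SLURM_ARRAY_TASK_COUNT", "1"))
    return my_shard(inputs, task_id - task_min, task_count)


if __name__ == "__main__":
    # Hypothetical input list; the unchanged application would be run on each.
    files = ["part-%03d.txt" % i for i in range(10)]
    for path in shard_from_env(files):
        print(path)
```

The application itself stays untouched; only this thin selection step decides 
which inputs each task sees.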

Regards,
Benjamin
-- 
FSU Jena | JULIELab.de/Staff/Redling
☎  +49 3641 9 44323
_______________________________________________
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit 
https://beowulf.org/cgi-bin/mailman/listinfo/beowulf
