One thing to remember: a cluster is often defined by its environment and goals.
1. HPC-type clusters, as discussed on this list, operate in a particular
way: cluster nodes are provisioned in a reproducible way, a job
scheduler provides dynamic resources to users, and the cluster is
optimized for processing and communication speed.

2. Things like Kafka operate differently, and I assume you are talking
about creating Kafka consumers in HPX for some kind of workflow. In
general, something like Kafka manages its own work and job distribution
(even fault tolerance) and can span multiple data centers.

I assume you are more interested in situation #2. In this case, much of
the machinery used in case #1 is not needed. However, running parallel
HPX jobs will take some resource management, and I don't know enough
about Kafka to comment on how to do this. I assume you are setting up a
static, single-application cluster environment.

Provisioning is another issue. Some of the HPC cluster provisioning
tools may help you get nodes set up easily and reproducibly (Google the
Warewulf Toolkit). In the "analytics" world, provisioning is very
different, with tools like Ambari, but that may be overkill for what
you are trying to do. And, of course, many of the distributed Apache
analytics-type tools have their own cluster install options and recipes
(e.g. setting up a standalone Spark cluster).

Hope that helps a bit.

--
Doug

> Hi all,
>
> I'm developing an application which needs to use tools and other
> applications that excel in a distributed environment:
> - HPX ( https://github.com/STEllAR-GROUP/hpx )
> - Kafka ( http://kafka.apache.org/ )
> - a blockchain tool.
> This is why I'm eager to learn how to deploy a Beowulf cluster.
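[To illustrate my point above about Apache tools shipping their own
cluster recipes: a standalone Spark cluster can be brought up with
nothing but the scripts in the Spark tarball. A rough sketch, assuming
a recent Spark 3.x release unpacked to the same path on every node;
"head01" is a made-up hostname for your head node:

```shell
# On the head node: start the standalone master.
# It logs its URL (spark://head01:7077) and serves a web UI on port 8080.
./sbin/start-master.sh

# On each compute node: start a worker and point it at the master URL.
./sbin/start-worker.sh spark://head01:7077
```

Older Spark releases (before 3.1) call the worker script
start-slave.sh instead. -- Doug]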
>
> I've read some info here:
> - https://en.wikibooks.org/wiki/Building_a_Beowulf_Cluster
> - https://www.linux.com/blog/building-beowulf-cluster-just-13-steps
> - https://www-users.cs.york.ac.uk/~mjf/pi_cluster/src/Building_a_simple_Beowulf_cluster.html
>
> And I have 2 starting questions in order to clarify how I should
> proceed to build the cluster correctly:
>
> 1) My starting point is the PC I'm working with at the moment, with
> these features:
> - Corsair RAM, DDR3, PC1600, 32 GB, CL10
> - Intel Core i7-4790K boxed CPU, socket 1150, 4.00 GHz
> - Samsung MZ-76E500B 860 EVO internal SSD, 500 GB, 2.5" SATA III,
>   black/grey
> - ASUS H97-PLUS motherboard
> - DVD-RW drive
>
> I'm using Ubuntu 18.04.01 Server Edition as the OS.
>
> On one side I read that it is better to put the same type of hardware
> in the cluster (PCs of the same type), but on the other side I read
> that heterogeneous hardware (servers or PCs) can also be deployed.
> So... which hardware should I take into consideration for the second
> node, if the features of the very first "node" are the ones above?
>
> 2) I read that some software (Rocks, OSCAR) would make the cluster
> configuration easier and smoother. But I also read that using the
> same OS, in exactly the same version, on all nodes (in my case Ubuntu
> 18.04.01 Server Edition) could be a safe start. So... is it strictly
> necessary to use Rocks or OSCAR to correctly configure the nodes'
> network?
>
> Looking forward to your kind hints and suggestions.
> Marco
> _______________________________________________
> Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
> To change your subscription (digest mode or unsubscribe) visit
> http://www.beowulf.org/mailman/listinfo/beowulf