1.  Yes and no. The application process needs to be "parallel aware", but
for some applications that could just mean running multiple instances, one on
each node, and farming the work out to them. This is called "embarrassingly
parallel" (EP). A good example is rendering animation frames: typically
each frame doesn't depend on the frames around it, so you can just parcel out
the work to the nodes at frame granularity. Other applications
are more tightly coupled, where the computation process running on node N
needs to know something about what's running on node N+1 and node N-1 very
frequently. For these, applications use some sort of standardized interprocess
communication library (e.g. MPI), or perhaps a library that performs a high-level
function (e.g. matrix inversion) and uses the interprocess communication
underneath.
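
For the tightly coupled case, here is a minimal sketch of what that
neighbor communication looks like, assuming the mpi4py Python bindings
for MPI are installed (a 1-D halo exchange; the array contents are just
placeholders):

    # Each MPI rank exchanges one boundary ("ghost") value with its
    # neighbors, ranks N-1 and N+1.
    # Run with e.g.:  mpiexec -n 4 python halo.py
    from mpi4py import MPI
    import numpy as np

    comm = MPI.COMM_WORLD
    rank = comm.Get_rank()
    size = comm.Get_size()

    # Each rank owns a chunk of data plus a ghost cell on each side.
    local = np.full(10, float(rank))
    left_ghost = np.empty(1)
    right_ghost = np.empty(1)

    # PROC_NULL at the ends turns the exchange into a no-op.
    left = rank - 1 if rank > 0 else MPI.PROC_NULL
    right = rank + 1 if rank < size - 1 else MPI.PROC_NULL

    # Swap edge values with both neighbors.
    comm.Sendrecv(sendbuf=local[0:1], dest=left,
                  recvbuf=left_ghost, source=left)
    comm.Sendrecv(sendbuf=local[-1:], dest=right,
                  recvbuf=right_ghost, source=right)

High-level libraries such as ScaLAPACK do this kind of exchange underneath
their matrix routines, so the application never touches MPI directly.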

2.  Another "it depends". If the process is EP and each node is processing a
different image, then your problem is one of sending and retrieving images,
which isn't much different from a conventional file-server model. If
multiple processors/nodes are working on the same image, then the interconnect
might be more important. It all depends on the communication requirements.
Note that even EP applications can get themselves fouled up in network traffic
(imagine booting 1000 nodes simultaneously, with all of them wanting to fetch the
boot image from one server at the same time).
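
For the EP case, the same idea fits in a few lines on a single node. This
is a sketch, not your pipeline: process_image() and the image directory
are hypothetical stand-ins, and a cluster scheduler (e.g. Slurm or Torque)
would scale the pattern across nodes by launching one such job per node:

    # Embarrassingly parallel sketch: each worker handles a different
    # image, with no communication between workers.
    from multiprocessing import Pool
    from pathlib import Path

    def process_image(path):
        # Stand-in for the real per-image work (filtering, analysis, ...).
        return path.name, len(path.read_bytes())

    if __name__ == "__main__":
        # Hypothetical image location.
        images = sorted(Path("/data/images").glob("*.tif"))
        with Pool() as pool:  # one worker per core by default
            for name, nbytes in pool.imap_unordered(process_image, images):
                print("done:", name, nbytes, "bytes")

As a back-of-envelope check on your question 2: gigabit Ethernet tops out
around 125 MB/s, so at best roughly nine 14 MB images per second can reach
any one node over the wire. Comparing that against your per-image compute
time tells you quickly whether the network or the CPU is the bottleneck.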


This is the place to ask.


From: CJ O'Reilly <supa...@gmail.com>
Date: Wednesday, October 31, 2012 11:31 PM
To: beowulf@beowulf.org
Subject: [Beowulf] Digital Image Processing via HPC/Cluster/Beowulf - Basics

Hello, I hope that this is a suitable place to ask this; if not, I would
equally appreciate some advice on where to look in lieu of answers to my
questions.
You may guess that I'm very new to this subject.

I am currently researching the feasibility and process of establishing a
relatively small HPC cluster to speed up the processing of large numbers of
digital images.

After looking at a few HPC computing software solutions listed on the Wikipedia
comparison of cluster software page
(http://en.wikipedia.org/wiki/Comparison_of_cluster_software) I still have only
a rough understanding of how the whole system works.

I have a few questions:
1. Do programs you wish to use via HPC platforms need to be written to support 
HPC, and further, to support specific middleware using parallel programming or 
something like that?
OR
Can you run any program on top of the HPC cluster and have its workload
effectively distributed? --> How can this be done?
2. For something like digital image processing, where huge numbers of
relatively large images (14 MB each) are being processed, will network speed or
processing power be more of a limiting factor? Or would a gigabit network
suffice?
3. For a relatively easy HPC platform what would you recommend?

Again, I hope this is an ok place to ask such a question; if not, please
refer me to a more suitable source.
