Thanks, informative :p I'll consider your advice. If I read correctly, it seems the answer to the question about programming was: yes, a program must be written to accommodate a cluster. Did I get you right?

On 2012-11-4 at 6:11 AM, "Mark Hahn" <h...@mcmaster.ca> wrote:
>> I am currently researching the feasibility and process of establishing a
>> relatively small HPC cluster to speed up the processing of large amounts
>> of digital images.
>
> do you mean that smallness is a goal? or that you don't have a large
> budget?
>
>> After looking at a few HPC computing software solutions listed on the
>> Wikipedia comparison of cluster software page
>> (http://en.wikipedia.org/wiki/Comparison_of_cluster_software)
>> I still have only a rough understanding of how the whole system works.
>
> there are several discrete functionalities:
> - shared filesystem (if any)
> - scheduling
> - intra-job communication (if any; eg MPI)
> - management/provisioning/monitoring of nodes
>
> IMO, anyone who claims to have "best practices" in this field is lying.
> there are particular components that have certain strengths, but none of
> them are great, and none universally appropriate. (it's also common
> to conflate or "integrate" the second and fourth items - for that matter,
> monitoring is often separated from provisioning.)
>
>> 1. Do programs you wish to use via HPC platforms need to be written to
>> support HPC, and further, to support specific middleware using parallel
>> programming or something like that?
>
> "middleware" is generally a term from the enterprise computing environment.
> it basically means "get someone else to take responsibility for hard bits",
> and is a form of the classic commercial best practice of CYA. from an HPC
> perspective, there's the application and everything else. if you really
> want, you can call the latter "middleware", but doing so is uninformative.
>
> HPC covers a lot of ground. usually, people mean jobs will execute in a
> batch environment (started from a commandline/script). OTOH HPC sometimes
> means what you might call "personal supercomputing", where an interactive
> application runs in a usually-dedicated cluster (shared clusters tend to
> have scheduling response times that make interactive use problematic.)
> (shared clusters also give rise to the single most important value of
> clusters: that they can interleave bursty demand. if everyone in your
> department shares a cluster, it can be larger than any one group can
> afford, and therefore all groups will be able to burst to higher capacity.
> this is why large, shared clusters are so successful. and, for that
> matter, why cloud services are successful.)
>
> you can do HPC with very little overhead. you will generally want a shared
> filesystem - potentially just a NAS box or existing server. you may not
> bother with scheduling at all - let users pick which machine to run on,
> for instance. that sounds crazy, but if you're the only one using it, why
> bother with a scheduler? HPC can also be done without intra-job
> communication - if your jobs are single-node serial or threaded, for
> instance. and you may not need any sort of management/provisioning,
> depending on the stability of your nodes, environment, expected lifetime,
> etc.
>
> in short: slap linux onto a few boxes, set up ssh keys or hostbased
> trust, have one or more of them NFS out some space, and you're cooking.
>
>> OR
>> Can you run any program on top of the HPC cluster and have its workload
>> effectively distributed? --> How can this be done?
>
> this is a common newbie question. a naive program (probably serial or
> perhaps multithreaded) will see no benefit from a cluster. clusters are
> just plain old machines. the benefit comes if you want throughput (jobs
> per time) or specifically program for distributed computation
> (classically with MPI). it's common to use infiniband to accelerate this
> kind of job (as well as provide the fastest possible IO.)
>
>> 2. For something like digital image processing, where a huge amount of
>> relatively large images (14MB each) are being processed, will network
>
> the main question is how much work a node will be doing per image.
>
> suppose you had an infinitely fast fileserver and gigabit connected nodes:
> transferring the image would take a bit over 100ms (14MB is about
> 112Mbit), so you would ideally spend about the same amount of time
> processing an image. but in this case, you should probably ask whether
> you can simply store images on the nodes in the first place. if you
> haven't thought about where the inputs are and how fast they can be
> gotten, then that will probably be your bottleneck.
>
>> speed, or processing power be more of a limiting factor? Or would a
>> gigabit network suffice?
>
> how long does a prospective node take to complete one work unit,
> and how long does it take to transfer the files for one?
> your speedup will be limited by whatever resource saturates first
> (possibly your fileserver.)
>
>> 3. For a relatively easy HPC platform what would you recommend?
>
> they are all crap. you should try not to spend on crap you don't need,
> but ultimately it depends on how much expertise you have and/or how much
> you value your time. any idiot can build a cluster from scratch using
> fundamental open-source components, eventually. but if said idiot has to
> learn filesystems, scheduling, provisioning, etc from scratch, it could
> take quite a while. when you buy, you are buying crap, but it's crap
> that may save you some time.
>
> don't count on commercial support being more than crappy.
>
> you should probably consider using a cloud service - this is just
> commercial outsourcing - more crap, but perhaps of value if, for
> instance, you don't want to get your hands dirty hosting machines
> (amazon), etc.
>
> anything commercial in this space tends to be expensive. the license to
> cover a crappy scheduler for a few hundred nodes, for instance, will be
> pretty close to an FTE-year. renting a node from a cloud provider for a
> year costs about as much as buying a new node each year, etc.
>
>> Again, I hope this is an ok place to ask such a question, if not please
>
> this is the place. though there are some fringe sects of HPC who tend to
> subsist on more and/or different crap (such as clusters running windows.)
> beowulf tends towards the low-crap end of things (linux, open packages.)
>
> regards, mark hahn.
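
A rough back-of-envelope sketch of the "which resource saturates first" question, in Python. The 14MB image size and the gigabit link come from the thread; everything else is an assumption made for illustration: the function names are hypothetical, the per-image compute times are made up, the link is treated as running at ideal line rate with no protocol overhead, and each node is assumed to fetch an image and then process it with no overlap between the two.

```python
import math

def transfer_seconds(size_bytes, link_bits_per_s):
    """Ideal time to move one file over a link, ignoring protocol overhead."""
    return size_bytes * 8 / link_bits_per_s

def max_useful_nodes(size_bytes, compute_s, node_link_bps, server_link_bps):
    """Estimate how many nodes one fileserver link can feed before it saturates.

    Assumes each node fetches an image, then processes it, with no overlap.
    """
    per_image_s = transfer_seconds(size_bytes, node_link_bps) + compute_s
    node_rate = 1.0 / per_image_s                     # images/s one node consumes
    server_rate = server_link_bps / (size_bytes * 8)  # images/s the server can ship
    return max(1, math.floor(server_rate / node_rate))

IMAGE_BYTES = 14e6   # 14MB per image, from the thread
GIGABIT = 1e9        # bits per second, ideal line rate (assumption)

if __name__ == "__main__":
    t = transfer_seconds(IMAGE_BYTES, GIGABIT)
    print(f"transfer per image: {t * 1000:.0f} ms")
    for compute_s in (0.1, 1.0, 10.0):   # hypothetical per-image compute times
        n = max_useful_nodes(IMAGE_BYTES, compute_s, GIGABIT, GIGABIT)
        print(f"compute {compute_s:>4} s/image -> about {n} node(s) per fileserver link")
```

Under these assumptions a 14MB image takes about 112 ms to cross a gigabit link, and with 1 second of compute per image a single gigabit fileserver link feeds about 9 nodes before it becomes the limit. Real numbers will be worse once disk speed and protocol overhead are included, so measuring one node on real images is the better guide.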
_______________________________________________
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit
http://www.beowulf.org/mailman/listinfo/beowulf