This is not a problem in your setup, since you are allocating a whole node to each job. In general, though, how can one deal with the problem of binding a particular GPU device to a job through the scheduler? (A sketch of one approach appears at the end of this thread.)
Sorry if I am asking about something that is already known and there are already ways to bind the devices within the scheduler.

Thanks,
TV

-----Original Message-----
From: beowulf-boun...@beowulf.org [mailto:beowulf-boun...@beowulf.org] On Behalf Of Michael Di Domenico
Sent: Thursday, January 28, 2010 9:54 AM
To: Beowulf Mailing List
Subject: Re: [Beowulf] GPU Beowulf Clusters

The way I do it (your mileage may vary):

We allocate two CPUs per GPU and use the Nvidia Tesla S1070 1U chassis product, so a standard quad-core, dual-socket server ends up with four GPUs attached. We've found that even though you expect the GPU to do most of the work, it really takes a CPU to drive the GPU and keep it busy. Having a second CPU to pre-stage/post-stage the memory has worked pretty well also (a sketch of that overlap pattern appears at the end of this thread).

For scheduling, we use SLURM and allocate one entire node per job, no sharing.

On Thu, Jan 28, 2010 at 12:38 PM, Jon Forrest <jlforr...@berkeley.edu> wrote:
> I'm about to spend ~$20K on a new cluster that will be a proof-of-concept
> for doing GPU-based computing in one of the research groups here.
>
> A GPU cluster is different from a traditional HPC cluster in several ways:
>
> 1) The CPU speed and number of cores are not that important, because most
> of the computing will be done inside the GPU.
>
> 2) Serious GPU boards are large enough that they don't easily fit into
> standard 1U pizza boxes. They also require more power than the standard
> power supplies in such boxes can provide. I'm not familiar with the boxes
> that should therefore be used in a GPU cluster.
>
> 3) Ideally, I'd like to put more than one GPU card in each compute node,
> but then I hit the issues in #2 even harder.
>
> 4) Assuming that a GPU can't be "time shared", I'll have to set up my
> batch engine to treat the GPU as a non-shareable resource. This means
> that I'll only be able to run as many jobs on a compute node as I have
> GPUs. It also means that it would be wasteful to put CPUs in a compute
> node with more cores than the number of GPUs in the node (assuming the
> jobs don't do anything parallel on the CPUs, only on the GPUs). Even if
> GPUs can be time shared, the expense of copying between main memory and
> GPU memory means that sharing GPUs among several processes will degrade
> performance.
>
> Are there any other issues I'm leaving out?
>
> Cordially,
> --
> Jon Forrest
> Research Computing Support
> College of Chemistry
> 173 Tan Hall
> University of California Berkeley
> Berkeley, CA 94720-1460
> 510-643-1032
> jlforr...@berkeley.edu
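[Editor's sketch] On TV's question about binding a particular GPU to a job: neither poster above describes a mechanism for this, so the following is only a sketch of one common approach, not something taken from the thread. The idea is to let the batch system export an environment variable naming the device ordinal a job may use (the variable name GPU_DEVICE_ID below is hypothetical; SLURM's generic-resources support, where available, does something similar by setting CUDA_VISIBLE_DEVICES) and have the application bind to that device with cudaSetDevice().

/* Minimal sketch (not from the thread): bind a job to the GPU the scheduler
 * assigned, published in a hypothetical GPU_DEVICE_ID environment variable. */
#include <cuda_runtime.h>
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    int device_count = 0;
    if (cudaGetDeviceCount(&device_count) != cudaSuccess || device_count == 0) {
        fprintf(stderr, "no CUDA devices visible\n");
        return 1;
    }

    int device = 0;                                   /* fall back to device 0 */
    const char *assigned = getenv("GPU_DEVICE_ID");   /* hypothetical variable name */
    if (assigned != NULL)
        device = atoi(assigned) % device_count;

    if (cudaSetDevice(device) != cudaSuccess) {
        fprintf(stderr, "cudaSetDevice(%d) failed\n", device);
        return 1;
    }
    printf("job bound to GPU %d of %d\n", device, device_count);
    /* ... launch kernels on this device ... */
    return 0;
}

If the batch system can also mark GPUs as consumable resources, that addresses Jon's point 4 as well: with one GPU "slot" per device, the scheduler starts at most as many jobs per node as there are GPUs.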
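[Editor's sketch] Michael's remark that it takes one CPU to drive the GPU and a second to pre-stage/post-stage memory is usually realized with pinned (page-locked) host buffers, CUDA streams, and asynchronous copies, so that staging the next chunk on the host overlaps with the GPU working on the current one. The sketch below is not Michael's code; the kernel scale(), the chunk size N, and NCHUNKS are made-up illustrations, and error checking is omitted.

/* Minimal sketch (not from the thread): double-buffered pre-/post-staging.
 * Pinned host buffers plus two CUDA streams let host staging of one chunk
 * overlap with the transfers/kernel work of the other.  Error checks omitted. */
#include <cuda_runtime.h>
#include <stdio.h>

#define N       (1 << 20)   /* elements per chunk (illustrative) */
#define NCHUNKS 8           /* number of chunks   (illustrative) */

__global__ void scale(float *data, int n, float factor)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        data[i] *= factor;
}

int main(void)
{
    float *h_buf[2], *d_buf[2];
    cudaStream_t stream[2];

    for (int b = 0; b < 2; ++b) {
        cudaMallocHost((void **)&h_buf[b], N * sizeof(float)); /* pinned: needed for async copies */
        cudaMalloc((void **)&d_buf[b], N * sizeof(float));
        cudaStreamCreate(&stream[b]);
    }

    for (int c = 0; c < NCHUNKS; ++c) {
        int b = c & 1;                      /* ping-pong between the two buffers */

        /* Wait until this buffer's previous chunk (c-2) has been copied back. */
        cudaStreamSynchronize(stream[b]);

        /* "Pre-stage": fill the host buffer on the CPU while the GPU is still
         * busy with chunk c-1 in the other stream.  In a real code a second
         * CPU core/thread would be doing this. */
        for (int i = 0; i < N; ++i)
            h_buf[b][i] = (float)(c + i);

        cudaMemcpyAsync(d_buf[b], h_buf[b], N * sizeof(float),
                        cudaMemcpyHostToDevice, stream[b]);
        scale<<<(N + 255) / 256, 256, 0, stream[b]>>>(d_buf[b], N, 2.0f);
        /* "Post-stage": results come back asynchronously into pinned memory. */
        cudaMemcpyAsync(h_buf[b], d_buf[b], N * sizeof(float),
                        cudaMemcpyDeviceToHost, stream[b]);
    }

    for (int b = 0; b < 2; ++b) {
        cudaStreamSynchronize(stream[b]);
        cudaStreamDestroy(stream[b]);
        cudaFreeHost(h_buf[b]);
        cudaFree(d_buf[b]);
    }
    return 0;
}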