Hi all,

I've been re-evaluating our existing InfiniBand fabric design for our HPC 
systems, since I've been tasked with planning how we'll add systems in the 
future as more and more researchers opt to add capacity to our central 
system.  We've reached the point where all available ports on our 144-port 
SilverStorm 9120 chassis are in use, and we need to expand capacity.  One 
option that's been floated -- one I'm not particularly fond of, btw -- is to 
purchase a second chassis and link the two together over 24 ports, two per 
spine.  While a good deal of our workload would be fine with 5:1 blocking 
(120 node ports per chassis sharing 24 inter-chassis links) and 6 hops (3 
across each chassis), I've concluded that, for the money, this is definitely 
not the best solution.
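
For what it's worth, here's the back-of-the-envelope arithmetic behind those 
figures as a quick Python sketch (the port and hop counts are the ones given 
above):

    # Back-of-the-envelope check on the dual-chassis option.
    # Port counts from above: 144-port 9120 chassis, 24 inter-chassis
    # links (two per spine module), 3 internal hops across each chassis.

    CHASSIS_PORTS = 144
    ISL_PORTS = 24  # ports reserved per chassis for inter-chassis links

    node_ports_per_chassis = CHASSIS_PORTS - ISL_PORTS  # 120
    blocking = node_ports_per_chassis / ISL_PORTS       # 5.0 -> 5:1
    total_nodes = 2 * node_ports_per_chassis            # 240
    worst_case_hops = 2 * 3                             # 3 hops per chassis

    print(f"{total_nodes} nodes, {blocking:.0f}:1 blocking, "
          f"{worst_case_hops} hops worst case")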

The plan that I've put together instead uses the SilverStorm as the core of a 
spine-leaf design.  We'll go ahead and purchase a batch of 24-port QDR 
switches, two for each rack, to connect our 156 existing nodes (with up to 50 
more on the way).  Each leaf will have 6 uplinks back to the spine, giving 
3:1 blocking (18 node ports against 6 uplinks) and 5 hops worst case (2 at 
the leaves, 3 through the spine).  This will let us scale the fabric out to 
432 total nodes (24 leaves at 18 node ports each) before having to purchase a 
second spine switch.  At that point, half of each leaf's six uplinks will go 
to the first spine, half to the second.  In theory, it looks like we can 
scale this design -- with future plans to migrate to a 288-port chassis -- to 
quite a large number of nodes.  Also, just to address this up front: our 
workload is very generic, a mix of MD, ab initio, CFD, FEM, BLAST, RF, etc.
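
And the same sort of sketch for the spine-leaf plan.  The one-spine numbers 
come straight from the figures above; the two-spine capacity is my own 
extrapolation, assuming the six uplinks split 3/3 and the blocking ratio 
holds at 3:1:

    # Spine-leaf arithmetic.  Leaf and uplink counts are from the plan
    # above; the two-spine capacity is my own extrapolation.

    LEAF_PORTS = 24
    UPLINKS_PER_LEAF = 6
    SPINE_PORTS = 144

    node_ports_per_leaf = LEAF_PORTS - UPLINKS_PER_LEAF           # 18
    blocking = node_ports_per_leaf / UPLINKS_PER_LEAF             # 3.0 -> 3:1
    leaves_per_spine = SPINE_PORTS // UPLINKS_PER_LEAF            # 24
    max_nodes_one_spine = leaves_per_spine * node_ports_per_leaf  # 432

    # With a second spine and the uplinks split 3/3, each spine only
    # sees 3 uplinks per leaf, so the leaf count (and node count) doubles.
    leaves_two_spines = SPINE_PORTS // (UPLINKS_PER_LEAF // 2)      # 48
    max_nodes_two_spines = leaves_two_spines * node_ports_per_leaf  # 864

    print(f"one spine:  {max_nodes_one_spine} nodes at 3:1 blocking")
    print(f"two spines: {max_nodes_two_spines} nodes at 3:1 blocking")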

If the good folks on this list would be kind enough to give me their input 
on these options, or possibly propose a third (or fourth) option, I'd very 
much appreciate it.

Thanks in advance,

Brian Smith
Sr. HPC Systems Administrator
IT Research Computing, University of South Florida
4202 E. Fowler Ave. ENB308 
Office Phone: +1 813 974-1467
Organization URL: http://rc.usf.edu

