Re: [Beowulf] How to configure a cluster network

2008-07-24 Thread Mark Hahn
to generate a Universal FNN. FNNs don't really shine until you have 3 or 4 NICs/HCAs per compute node. depends on costs. for instance, the marginal cost of a second IB port on a nic seems to usually be fairly small. for instance, if you have 36 nodes, 3x24pt switches is pretty neat for 1 ho

Re: [Beowulf] How to configure a cluster network

2008-07-24 Thread Mark Hahn
This reminds me to ask about all the Xen questions Virtual machines (sans dynamic migration) seem to address the inverse of the problem that MPI and other computational clustering solutions address. Virtual machines assume that the hardware is vastly more worthy than the OS and application w

Re: [Beowulf] How to configure a cluster network

2008-07-24 Thread Tim Mattox
Cool, FNNs are still being mentioned on the Beowulf mailing list... For those not familiar with the Flat Neighborhood Network (FNN) idea, check out this URL: http://aggregate.org/FNN/ For those who haven't played with our FNN generator CGI script, do try it out. Hank (my Ph.D. advisor) enhanced

Re: [Beowulf] How to configure a cluster network

2008-07-24 Thread Nifty niftyompi Mitch
On Thu, Jul 24, 2008 at 06:39:00PM -0400, Mark Hahn wrote: ... > >> Your point about "most people don't need" is important! With large >> multi core, multiple socket systems external and internal bandwidth >> can be interesting to ponder. > > that makes it sound like inter-node networks in ge

Re: [Beowulf] Infiniband modular switches

2008-07-24 Thread Greg Lindahl
On Mon, Jul 14, 2008 at 01:42:07PM -0400, Patrick Geoffray wrote: > AlltoAll of large messages is not a useless synthetic benchmark IMHO. AlltoAll is a real thing used by real codes, but do keep in mind that there are many algorithms for AlltoAll with various message sizes and network topologies,

Re: [Beowulf] How to configure a cluster network

2008-07-24 Thread Greg Lindahl
On Thu, Jul 24, 2008 at 08:14:43PM +0200, Jan Heichler wrote: > 1) most applications are latency driven - not bandwidth driven. As a guy who's a big fan of low latency, I had to say that this is not a good generalization. Some apps become latency or message-rate sensitive if you scale to enough n

Re: [Beowulf] How to configure a cluster network

2008-07-24 Thread Patrick Geoffray
Hi Mark, Mark Hahn wrote: With any network you need to avoid like the plague any kind of loop, they can cause weird problems and are pretty much unnecessary. for well, I don't think that's true - the most I'd say is that given It is kind of true for wormhole switches, you can deadlock if you

Re: [Beowulf] Infrastructure planning for small HPC 40/100 gigabit Ethernet or InfiniBand?

2008-07-24 Thread Greg Lindahl
On Thu, Jul 24, 2008 at 09:03:34AM -0400, Kyle Spaans wrote: > On Thu, Jul 24, 2008 at 07:33:02AM -0500, Gerry Creager wrote: > > My next home will have multiple fiber pairs to high-use rooms, plus > > convenience wireless. I don't intend to pull copper through the walls. > > Sorry, but won't yo

Re: [Beowulf] How to configure a cluster network

2008-07-24 Thread Patrick Geoffray
Hi Jan, Jan Heichler wrote: 1) most applications are latency driven - not bandwidth driven. That means that half bisectional bandwidth is not cutting your application performance down to 50%. For most applications the impact should be less than 5% - for some it is really 0%. If the app is pu

Re: [Beowulf] How to configure a cluster network

2008-07-24 Thread Mark Hahn
If you need 144 ports a single switch will be more cost effective you'd think so - larger switches let you factor out lots of separate little power supplies, etc. not to mention transforming lots of cables into compact, reliable, cheap backplanes. but I haven't seen chassis switches actual

[Beowulf] Vi _and_ emacs on the Gentoo Clustering LiveCD

2008-07-24 Thread Eric Thibodeau
Yeah...you read that right, I'll put BOTH on the CD...incredible but true! Now, you emacs users out there, take your pick, which of these do you want: kyron ldap-auth # eix emacs -c [U] app-admin/eselect-emacs ([EMAIL PROTECTED]/14/2008 -> 1.5): Manages Emacs versions [N] app-editors/emacs (21.4-

Re: [Beowulf] How to configure a cluster network

2008-07-24 Thread Nifty niftyompi Mitch
On Thu, Jul 24, 2008 at 07:27:56PM +0100, andrew holway wrote: > Sender: [EMAIL PROTECTED] > > > Your most cost effective solution will be a large port count switch. > > Most are not 'ideal' but they are close to ideal and cost effective. > > That is not really the case in practice; > > You can

Re: [Beowulf] How to configure a cluster network

2008-07-24 Thread andrew holway
> Your most cost effective solution will be a large port count switch. > Most are not 'ideal' but they are close to ideal and cost effective. That is not really the case in practice; You can buy a Mellanox 144-Port Modular InfiniBand DDR Switch (60-Ports enabled) for around 22k EUR or so the 24

Re: Re[2]: [Beowulf] How to configure a cluster network

2008-07-24 Thread andrew holway
:) Jan and I work together at ClusterVision. On Thu, Jul 24, 2008 at 7:14 PM, Jan Heichler <[EMAIL PROTECTED]> wrote: > Hello Daniel, > > Thursday, 24 July 2008, you wrote: > > [network configurations] > > I have to say I am not sure that all the configs you sketched really work. I > never s

Re[2]: [Beowulf] How to configure a cluster network

2008-07-24 Thread Jan Heichler
Hello Daniel, Thursday, 24 July 2008, you wrote: [network configurations] I have to say I am not sure that all the configs you sketched really work. I never saw somebody creating loops in an IB fabric. DP> Since I am not a network expert I would be glad if somebody explains DP> why the fir

Re: [Beowulf] How to configure a cluster network

2008-07-24 Thread andrew holway
Well, the top configuration (and the one that I suggested) is the one that we have tested and know works. We have implemented it in hundreds of clusters. It also provides redundancy for the core switches. With any network you need to avoid like the plague any kind of loop, they can cause weird pr

Re: [Beowulf] Drive screw fixed with LocTite

2008-07-24 Thread David Mathog
Mark Kosmowski wrote: > Blue Loctite is removable with just a little more force than needed > with mechanical lock washers. It is critical to get a good, solid fit > with the tool to the bolt / screw though. Were you using the correct > driver or just one that fit "good enough"? It was the righ

Re: [Beowulf] How to configure a cluster network

2008-07-24 Thread Nifty niftyompi Mitch
On Thu, Jul 24, 2008 at 09:42:57AM -0700, Kilian CAVALOTTI wrote: > On Thursday 24 July 2008 05:42:22 am andrew holway wrote: > > To give a half bisectional bandwidth the best approach is to set up > > two as core switches and the other 4 as edge switches. > > > > Each edge switch will have four co

Re: [Beowulf] How to configure a cluster network

2008-07-24 Thread Daniel Pfenniger
Andrew, Attached are some possible topologies I was contemplating, with some remarks about them. Many other topologies are possible. The first one is the one you mention. If 12 nodes linked to one switch communicate with 12 nodes on another switch the bandwidth is reduced to 8/12 = 2/3.
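The 8/12 figure can be reproduced with a few lines of Python. This is only a sketch: it assumes that the 8 is the number of uplinks leaving each edge switch (four to each of two core switches, as in the layout under discussion), that all links run at the same speed, and that traffic spreads evenly over the uplinks. The function name is made up for illustration.

```python
from fractions import Fraction

def cross_switch_fraction(uplinks, talking_nodes):
    """Share of its own link bandwidth each node gets when `talking_nodes`
    on one edge switch all send off-switch over `uplinks` shared links.
    Assumes equal link speeds and evenly balanced traffic."""
    return min(Fraction(1), Fraction(uplinks, talking_nodes))

# 12 nodes on one edge switch talking to 12 nodes on another,
# with 8 uplinks assumed per edge switch.
print(cross_switch_fraction(8, 12))  # 2/3
```

With 8 or fewer nodes talking across switches, the same function returns 1, meaning the uplinks are not the bottleneck under these assumptions.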

Re: [Beowulf] How to configure a cluster network

2008-07-24 Thread Kilian CAVALOTTI
On Thursday 24 July 2008 05:42:22 am andrew holway wrote: > To give a half bisectional bandwidth the best approach is to set up > two as core switches and the other 4 as edge switches. > > Each edge switch will have four connections to each core switch > leaving 16 node connections on each edge swi

Re: [Beowulf] problem with binary file on NFS

2008-07-24 Thread Michael H. Frese
Jorg, It might be that the executable is corrupted by NFS during delivery to that node. Once that happens, the cached copy can stay bad. You can check it by comparing md5sum results on that node and on the node that owns the original. There's a thread back in December of last year titled
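Where md5sum is not handy, the same check can be done with Python's standard hashlib module. This is a minimal sketch; the function name is illustrative and the file path is whatever executable is being checked.

```python
import hashlib

def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of the file at `path`, read in chunks
    so that large executables do not have to fit in memory."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

# Run md5_of_file("<path to the executable>") on the suspect node and on
# the node that owns the original; differing digests mean the two nodes
# are not seeing the same bytes.
```

The digest is computed from the bytes the node actually reads, so a stale or corrupted cached copy on the client shows up as a mismatch.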

Re: [Beowulf] Drive screw fixed with LocTite

2008-07-24 Thread Mark Kosmowski
> "David Mathog" <[EMAIL PROTECTED]> writes: >> A vendor who shall remain nameless graced us with a hot swappable drive >> caddy in which one of the three mounting screws used to fasten the drive >> to the caddy had been treated with blue LocTite. This wasn't obvious >> from external inspection, b

Re: [Beowulf] Drive screw fixed with LocTite

2008-07-24 Thread Peter St. John
Perhaps the hole for that screw was defective (e.g., somebody tried to screw in a screw one size too big) and rather than replace the case, he covered his mistake with the Loctite. I'm still mystified by the hard drive soldered into the case of a brand-name computer, the trend to preventing user m

Re: [Beowulf] problem with binary file on NFS

2008-07-24 Thread Peter St. John
Jorg, I checked the man page for ldd and it says that it may not work if an old compiler was used to produce the executable. I think it's like symbolic debugging, you need to compile with a switch to build the symbol table; the compiler has to know you will want library information later, and buil

Re: [Beowulf] Infrastructure planning for small HPC 40/100 gigabit Ethernet or InfiniBand?

2008-07-24 Thread Kyle Spaans
On Thu, Jul 24, 2008 at 07:33:02AM -0500, Gerry Creager wrote: > My next home will have multiple fiber pairs to high-use rooms, plus > convenience wireless. I don't intend to pull copper through the walls. Sorry, but won't you still have to pull fiber through the walls? Is fiber getting close e

Re: [Beowulf] How to configure a cluster network

2008-07-24 Thread andrew holway
Daniel, to give half bisectional bandwidth, the best approach is to set up two of the switches as core switches and the other four as edge switches. Each edge switch will have four connections to each core switch, leaving 16 node connections on each edge switch. That should provide a 64-port network. Make sense? Ta A
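The port arithmetic behind this layout can be checked with a short Python sketch. The switch counts and links per core come from the message above, and the 24-port switch size comes from the original question in this thread; the function and key names are illustrative only.

```python
def two_tier_budget(ports_per_switch, core_switches, edge_switches, links_per_core):
    """Port budget for a two-tier fabric built from identical switches."""
    uplinks = core_switches * links_per_core          # uplinks on each edge switch
    node_ports = ports_per_switch - uplinks           # ports left for nodes per edge switch
    core_ports_used = edge_switches * links_per_core  # ports consumed on each core switch
    if node_ports < 0 or core_ports_used > ports_per_switch:
        raise ValueError("not enough ports for this layout")
    return {
        "uplinks_per_edge": uplinks,
        "node_ports_per_edge": node_ports,
        "total_node_ports": edge_switches * node_ports,
        "core_ports_used": core_ports_used,
        "uplink_to_node_ratio": uplinks / node_ports,
    }

budget = two_tier_budget(ports_per_switch=24, core_switches=2,
                         edge_switches=4, links_per_core=4)
print(budget)
```

For these inputs each edge switch has 8 uplinks and 16 node ports, giving 64 node ports in total and an uplink-to-node ratio of 0.5, which is the half bisectional bandwidth described. Each core switch uses 16 of its 24 ports.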

Re: [Beowulf] Infrastructure planning for small HPC 40/100 gigabit Ethernet or InfiniBand?

2008-07-24 Thread Gerry Creager
My next home will have multiple fiber pairs to high-use rooms, plus convenience wireless. I don't intend to pull copper through the walls. I plan to put switches in rooms that need multi-drop and have at least one pair of fiber for high-speed access for a server, NAS, or cluster leading back

[Beowulf] How to configure a cluster network

2008-07-24 Thread Daniel Pfenniger
Hi, I have the problem of connecting 50 single-HCA nodes over InfiniBand using six 24-port switches. Several configurations may be imagined, but which one is the best? What is the general method to solve such a problem? Thanks, Dan

Re: [Beowulf] Re: Religious wars

2008-07-24 Thread Bob Drzyzgula
On Wed, Jul 23, 2008 at 07:23:00PM -0400, Robert G. Brown wrote: > > On Wed, 23 Jul 2008, Bob Drzyzgula wrote: > >> vi back then was little more than a shell on ed IIRC >>> >>> It was (for nvi, is) the visual mode of ex, which is/was an extended >>> line editor in the lineage of ed, kind of a

Re: [Beowulf] Re: Religious wars

2008-07-24 Thread John Hearns
On Wed, 2008-07-23 at 22:54 -0400, Bob Drzyzgula wrote: > On Wed, Jul 23, 2008 at 09:06:03PM -0400, Perry E. Metzger wrote: > > > > "Robert G. Brown" <[EMAIL PROTECTED]> writes: > > > Note that Bob and I started out on systems with far less than 100 MB > > > of DISK and perhaps a MB of system memo