RE: [Beowulf] Re: typical latencies for gigabit ethernet

2009-06-29 Thread Tom Elken
> > Ah! So my "real" latencies are 140/2 = 70 microsecs. > > I see ping times of ~70μs between the nVidias I posted data on, and > they > have an MPI latency of ~12μs. If you want a measurement/benchmark for > MPI performance use IMB, like for the results I posted. In addition to Intel MPI Bench
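The halving arithmetic above (140/2 = 70 µs) can be sketched concretely. Below is an illustrative TCP-loopback ping-pong that measures round-trip time and reports one-way latency as RTT/2; it is a stand-in for a real ping, and as Tom notes, actual MPI latency should be measured with IMB. All names here are hypothetical.

```python
# Illustrative RTT measurement over TCP loopback: one-way latency is
# taken as RTT/2, mirroring the 140/2 = 70 us arithmetic above.
# (A sketch only -- for real MPI latency numbers use IMB's PingPong.)
import socket
import threading
import time

def pingpong(iters=1000):
    srv = socket.socket()
    srv.bind(("127.0.0.1", 0))
    srv.listen(1)

    def echo():
        conn, _ = srv.accept()
        with conn:
            while True:
                b = conn.recv(1)
                if not b:
                    break
                conn.sendall(b)  # echo each byte straight back

    threading.Thread(target=echo, daemon=True).start()
    cli = socket.create_connection(srv.getsockname())
    cli.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
    start = time.perf_counter()
    for _ in range(iters):
        cli.sendall(b"x")
        cli.recv(1)
    rtt_us = (time.perf_counter() - start) / iters * 1e6
    cli.close()
    return rtt_us, rtt_us / 2  # "real" one-way latency = RTT / 2

rtt_us, one_way_us = pingpong()
```

Loopback numbers say nothing about the NIC, of course; the point is only the RTT-to-one-way conversion.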

[Beowulf] Re: typical latencies for gigabit ethernet

2009-06-29 Thread Dave Love
Rahul Nabar writes: > I thought so! :) Lazy person's first shot. Now I will try ethtool. It's not relevant with all NICs. Some use driver module parameters. > Any way to verify if I do? Consult the NIC's documentation. > Ah! So my "real" latencies are 140/2 = 70 microsecs. I see ping time

Re: [Beowulf] Re: typical latencies for gigabit ethernet

2009-06-29 Thread Patrick Geoffray
Dave, Scott, Dave Love wrote: Scott Atchley writes: When I test Open-MX, I turn interrupt coalescing off. I run omx_pingpong to determine the lowest latency (LL). If the NIC's driver allows one to specify the interrupt value, I set it to LL-1. Note that it is only meaningful wrt ping-pon

Re: [Beowulf] Re: typical latencies for gigabit ethernet

2009-06-29 Thread Patrick Geoffray
Dave Love wrote: That's something I haven't seen. However, I'm only using rx-frames=1 because simply adjusting rx-usec doesn't behave as expected. Instead of rx-usecs being the time between interrupts, it is sometimes implemented as the delay between the first packet and the following in

Re: [Beowulf] Re: typical latencies for gigabit ethernet

2009-06-29 Thread Dave Love
Scott Atchley writes: > As Patrick kindly pointed out, you are using rx-frames and not rx- > usec. They are not equivalent. That's something I haven't seen. However, I'm only using rx-frames=1 because simply adjusting rx-usec doesn't behave as expected. (It's documented, but perhaps only in t

Re: [Beowulf] Re: typical latencies for gigabit ethernet

2009-06-29 Thread Dave Love
Scott Atchley writes: > That is odd. I have only tested with Intel e1000 and our myri10ge > Ethernet driver. The Intel driver does not let you specify values other > than certain settings (0, 25, etc.). I can't remember if I tried it, but it's documented to be adjustable in the range 100-1000

Re: [Beowulf] typical latencies for gigabit ethernet

2009-06-29 Thread Rahul Nabar
On Mon, Jun 29, 2009 at 3:15 PM, Mark Hahn wrote: Thanks for all the help Mark! > well, ping is not _really_ a benchmark ;) I thought so! :) Lazy person's first shot. Now I will try ethtool. > but it does sound like you have interrupt coalescing enabled. Any way to verify if I do? >  ping i

Re: [Beowulf] typical latencies for gigabit ethernet

2009-06-29 Thread Mark Hahn
Hmm...well I must be doing something terribly wrong then. Our latencies are in the 140 microseconds range (as revealed by ping) well, ping is not _really_ a benchmark ;) but it does sound like you have interrupt coalescing enabled. on our dl145g2 nodes (BCM95721), I can peel ~40 us off ping time
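The ethtool coalescing knobs discussed in this thread can be wrapped as below. This is a hedged sketch: the helper names and interface name are made up, and whether `ethtool -C rx-usecs`/`rx-frames` are honoured at all depends on the NIC driver (some, as noted later in the thread, only expose this via module parameters).

```python
# Sketch: build (and optionally run) the ethtool command that disables
# rx interrupt coalescing. Driver support varies; some NICs take module
# parameters instead of honouring ethtool -C.
import subprocess

def build_coalesce_cmd(iface, rx_usecs=0, rx_frames=1):
    # ethtool -C <iface> rx-usecs <n> rx-frames <m>
    return ["ethtool", "-C", iface,
            "rx-usecs", str(rx_usecs),
            "rx-frames", str(rx_frames)]

def apply_coalesce(iface, **kw):
    # Requires root and a driver that implements the coalescing ioctl.
    return subprocess.run(build_coalesce_cmd(iface, **kw)).returncode
```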

Re: [Beowulf] typical latencies for gigabit ethernet

2009-06-29 Thread Rahul Nabar
On Sat, Jun 27, 2009 at 5:21 PM, Mark Hahn wrote: > seems to be fairly variable.  let's say 50 +- 20 microseconds. > >> Setup is a simple server<-->switch<-->server. > > it may be instructive to try a server-server test case. Hmm...well I must be doing something terribly wrong then. Our latencie

Re: [Beowulf] Re: typical latencies for gigabit ethernet

2009-06-29 Thread Scott Atchley
On Jun 29, 2009, at 1:44 PM, Scott Atchley wrote: Right, and that's what I did before, with sensible results I thought. Repeating it now on Centos 5.2 and OpenSuSE 10.3, it doesn't behave sensibly, and I don't know what's different from the previous SuSE results apart, probably, from the minor k

Re: [Beowulf] Re: typical latencies for gigabit ethernet

2009-06-29 Thread Scott Atchley
On Jun 29, 2009, at 12:10 PM, Dave Love wrote: When I test Open-MX, I turn interrupt coalescing off. I run omx_pingpong to determine the lowest latency (LL). If the NIC's driver allows one to specify the interrupt value, I set it to LL-1. Right, and that's what I did before, with sensible r
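The tuning procedure Scott describes reduces to a one-liner; the function name below is made up for illustration.

```python
def rx_usecs_from_lowest_latency(ll_us):
    """Scott Atchley's heuristic from this thread: with coalescing off,
    measure the lowest ping-pong latency LL with omx_pingpong, then set
    the interrupt delay to LL-1 microseconds so the interrupt fires
    just before each reply arrives."""
    return max(int(ll_us) - 1, 0)
```

As the follow-ups note, this is only meaningful for ping-pong-style traffic patterns.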

[Beowulf] Xeon Nehalem 5500 series (socket 1366) DP motherboard recommendations/experiences ...

2009-06-29 Thread richard.walsh
All, I am putting together a bill of materials for a small cluster based on the Xeon Nehalem 5500 series. What dual-socket motherboards (ATX and ATX-extended) are people happy with? Which ones should I avoid? Thanks much, Richard Walsh Thrashing River Computing

Re: [Beowulf] Re: typical latencies for gigabit ethernet

2009-06-29 Thread Dave Love
Scott Atchley writes: > When I test Open-MX, I turn interrupt coalescing off. I run > omx_pingpong to determine the lowest latency (LL). If the NIC's driver > allows one to specify the interrupt value, I set it to LL-1. Right, and that's what I did before, with sensible results I thought. Re

Re: [Beowulf] Re: typical latencies for gigabit ethernet

2009-06-29 Thread Dave Love
Gerry Creager writes: > I had rather nasty results with tg3 and abandoned it. We're using bnx2 > now. The latest iteration seems (guardedly) better than the last one. I thought that they were for different hardware (NetXtreme I c.f. NetXtreme II, according to broadcom.com). Is that not the c

Re: [Beowulf] Re: dedupe filesystem

2009-06-29 Thread Joe Landman
Gerry Creager wrote: Ob-Beowulf: You can run Venti on GNU/Linux, but I don't know how the current implementation performs. Also, GlusterFS has a `data de-duplication translator' on its roadmap, which I didn't see mentioned. Our initial results with a GlusterFS implementation led us back to

Re: [Beowulf] Re: typical latencies for gigabit ethernet

2009-06-29 Thread Scott Atchley
On Jun 29, 2009, at 5:59 AM, Dave Love wrote: Can you say something about any tuning you did to get decent results? To get the lowest latency, turn off rx interrupt coalescence, either with ethtool or module parameters, depending on the driver. Of course, you may not want to turn it off co

Re: [Beowulf] Re: dedupe filesystem

2009-06-29 Thread Gerry Creager
Dave Love wrote: Ashley Pittman writes: If you relied on the md5 sum alone there would be collisions and those collisions would result in you losing data. The question is whether the probability of collisions is high compared with other causes -- presumably hardware, assuming no-one puts fig

Re: [Beowulf] Re: typical latencies for gigabit ethernet

2009-06-29 Thread Gerry Creager
Dave Love wrote: Gerry Creager writes: Can you say something about any tuning you did to get decent results? To get the lowest latency, turn off rx interrupt coalescence, either with ethtool or module parameters, depending on the driver. Of course, you may not want to turn it off completely

[Beowulf] Re: dedupe filesystem

2009-06-29 Thread Dave Love
Ashley Pittman writes: > If you relied on the md5 sum alone there would be collisions and those > collisions would result in you losing data. The question is whether the probability of collisions is high compared with other causes -- presumably hardware, assuming no-one puts figures on the softw
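The probability being weighed here can be put in numbers with the standard birthday bound. The sketch below assumes n uniformly distributed 128-bit digests, so it models only accidental collisions; MD5 is broken against an adversary who constructs collisions deliberately, which is a separate concern.

```python
from math import expm1

def collision_probability(n, bits=128):
    # Birthday bound: P(collision) ~= 1 - exp(-n(n-1) / 2^(bits+1)).
    # Valid for accidental collisions of uniformly random digests only.
    return -expm1(-n * (n - 1) / 2 ** (bits + 1))

# Even a billion distinct blocks gives a vanishingly small chance
# against a 128-bit digest -- far below plausible hardware error rates.
p = collision_probability(10**9)
```

This is the comparison Dave is making: the accidental-collision probability is dwarfed by hardware failure rates, whatever one thinks of the software.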

[Beowulf] Re: HPC fault tolerance using virtualization)

2009-06-29 Thread Dave Love
Greg Lindahl writes: >> What I typically see from smartd is alerts when one or more sectors have >> already gone bad, although that tends not to be something that will >> clobber the running job. How should it be configured to do better >> (without noise)? > > That isn't noise, that's signal. Of
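One way to get warning before sectors go bad is to have smartd run scheduled self-tests rather than only report attribute changes. Below is a hedged sketch of an /etc/smartd.conf entry (the device name and schedule are illustrative, not necessarily what anyone in this thread runs):

```
# /etc/smartd.conf sketch: monitor all attributes (-a), enable automatic
# offline testing (-o) and attribute autosave (-S), run a short self-test
# nightly at 02:00 and a long one on Saturdays at 03:00, mail root (-m)
# when anything trips.
/dev/sda -a -o on -S on -s (S/../.././02|L/../../6/03) -m root
```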

Re: [Beowulf] Re: typical latencies for gigabit ethernet

2009-06-29 Thread Dave Love
Gerry Creager writes: > Can you say something about any tuning you did to get decent results? To get the lowest latency, turn off rx interrupt coalescence, either with ethtool or module parameters, depending on the driver. Of course, you may not want to turn it off completely, depending on how