Re: [Beowulf] tcp error: Need ideas!

Joe Landman Sat, 24 Jan 2009 12:01:17 -0800

I wonder if the switch could be implicated. We have seen some (cheap) GbE switches not support (in practice) jumbo frames (irrespective of literature).


Nifty Tom Mitchell wrote:

On Sat, Jan 24, 2009 at 09:36:09AM -0600, Gerry Creager wrote:
Couple of follow-up notes.

MTU=4500:  Had one node fall over with the same overflow errors.
MTU=3000: A WRF model is running, but single timesteps are executing 2.5x slower than MTU=1500


Segment offload?  Is TSO on or off?

        ethtool -k eth0

will tell you. You might also have one very reluctant machine, in the sense of being unwilling to switch its MTU. Could you do an


        ifconfig eth0 | grep MTU

on each machine and verify that everyone is using the right MTU?
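
A rough sketch of how that per-node check could be scripted. The function names (extract_mtu, check_mtus) are made up for illustration, and it assumes passwordless ssh to each node and ifconfig living in /sbin; adjust to taste.

```shell
#!/bin/sh
# Print the MTU found in ifconfig-style output read from stdin.
# Handles both "MTU:9000" (older net-tools) and "mtu 9000" (newer net-tools).
extract_mtu() {
    sed -n 's/.*[Mm][Tt][Uu][: ]\([0-9][0-9]*\).*/\1/p' | head -n 1
}

# Compare each host's MTU against the expected value.
# Usage: check_mtus EXPECTED_MTU IFACE HOST...
# Assumption: passwordless ssh works and /sbin/ifconfig exists on every host.
check_mtus() {
    expected=$1; iface=$2; shift 2
    for host in "$@"; do
        mtu=$(ssh -n "$host" /sbin/ifconfig "$iface" 2>/dev/null | extract_mtu)
        if [ "$mtu" = "$expected" ]; then
            echo "$host: ok ($mtu)"
        else
            echo "$host: MISMATCH (got '${mtu:-none}', expected $expected)"
        fi
    done
}
```

Something like "check_mtus 3000 eth0 node01 node02" (hostnames hypothetical) would then list any node that did not pick up the new MTU.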

I'll go snag the new driver and compile it.  After all: What can it hurt!

Thanks, Guy!

Regards, Gerry

Guy Coates wrote:
Hi,

We have also seen problems with the bnx2 drivers.

I got a more recent set of bnx2 drivers from Broadcom:
......

Have the packets been snooped to see if all
is as expected?

If you are seeing a natural MTU running faster than a jumbo MTU
then something is fragmenting or causing fragmentation of the data.
If MTU=4500 causes overflow errors, it might be related to fragmentation.
Both the sender and receiver have to keep all the bits on a reliable transfer until the data has been acknowledged. At one time fragmentation
could only be done once to a minimum MTU in the life of a packet.
In addition to snooping packets, try "tracepath" to and from all the involved boxes to discover what is going on.
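
Alongside tracepath, one way to test whether a given MTU really passes end to end is a ping with the Don't Fragment bit set. The sketch below assumes IPv4 and the Linux iputils ping (where "-M do" forbids fragmentation); the function names are hypothetical. The payload size is the MTU minus 28 bytes (20 for the IPv4 header without options, 8 for the ICMP header).

```shell
#!/bin/sh
# Largest ICMP echo payload that fits in one IPv4 packet of the given MTU:
# MTU minus 20 bytes of IPv4 header (no options) minus 8 bytes of ICMP header.
icmp_payload() {
    echo $(( $1 - 28 ))
}

# Send one unfragmentable ping of exactly MTU size to a host.
# Usage: probe_mtu HOST MTU
# Assumes Linux iputils ping, where "-M do" sets the Don't Fragment bit.
probe_mtu() {
    host=$1; mtu=$2
    if ping -M do -c 1 -W 2 -s "$(icmp_payload "$mtu")" "$host" >/dev/null 2>&1; then
        echo "$host: $mtu-byte packets pass unfragmented"
    else
        echo "$host: $mtu-byte packets do NOT pass (or host unreachable)"
    fi
}
```

Running "probe_mtu node01 4500" from each box to each other box (node01 being a placeholder name) should show quickly whether the switch or one NIC is dropping the larger frames.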



--
Joseph Landman, Ph.D
Founder and CEO
Scalable Informatics LLC,
email: land...@scalableinformatics.com
web  : http://www.scalableinformatics.com
       http://jackrabbit.scalableinformatics.com
phone: +1 734 786 8423 x121
fax  : +1 866 888 3112
cell : +1 734 612 4615
