First, thanks to all who've responded. I've been looking a bit this
morning and am trying to grok the results.
Joe Landman wrote:
Hi Gerry
Gerry Creager wrote:
History/background/description of the cluster
* 126 node Dell 1950 cluster with dual-quad core Xeons
* HP 5412zl switch for gigabit cluster backplane and 10GbE
interconnect to selected services (file server, etc.)
* Gigabit interconnect
* Hand compiled 2.6.26 kernel
* bnx2 module loaded for the Broadcom onboard nics
* Switch, compute nodes, head node set to 9000 byte MTU
We have had *lots* of problems with Broadcom NICs and jumbo frames, from
the 2.6.9 timeframe onwards.
Marvelous. I'd prefer to not have to back-rev if I can avoid it...
We're seeing the following error in WRF compiled with openMPI and the
PGI 7.2 compiler:
mca_btl_tcp_frag_send:writev failed with errno=104
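
For anyone reading along: on Linux, errno 104 is ECONNRESET ("Connection
reset by peer"), i.e. the remote end of the TCP connection went away
under openMPI's writev. A minimal sketch for decoding an errno value on
whatever machine you run it on follows; the helper name describe_errno
is made up for illustration, and the numeric value is platform specific,
so 104 only means ECONNRESET where the platform defines it that way.

```python
import errno
import os


def describe_errno(code):
    """Return (symbolic name, message) for an errno value on this platform."""
    # errno.errorcode maps numeric values to names such as "ECONNRESET".
    name = errno.errorcode.get(code, "UNKNOWN")
    return name, os.strerror(code)


if __name__ == "__main__":
    # 104 is the value reported by mca_btl_tcp_frag_send above.
    name, message = describe_errno(104)
    print(f"errno 104 -> {name}: {message}")
```

This only names the error; it says nothing about why the peer reset the
connection.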
While all nodes were accessible prior to the run and returned
appropriate "stuff" when queried with, e.g., ssh and a command, two
nodes now return something like this:
[ge...@brazos SCOOP12km]$ ssh c0522
Received disconnect from 192.168.200.154: 2: Bad packet length 808464432.
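
A side note on that number: the SSH packet length field is a 32-bit
big-endian integer, and 808464432 is 0x30303030, which is four ASCII "0"
characters. One possible reading (an assumption, not a diagnosis) is
that the ssh client received text or otherwise out-of-sync bytes where
it expected a packet header. A small sketch to check the arithmetic; the
function name is hypothetical:

```python
import struct


def length_field_as_text(value):
    """Render a 32-bit big-endian length field as the ASCII it would spell."""
    raw = struct.pack(">I", value)  # 4 bytes, network byte order
    return raw.decode("ascii", errors="replace")


if __name__ == "__main__":
    bad_length = 808464432  # from the "Bad packet length" message above
    print(hex(bad_length), repr(length_field_as_text(bad_length)))
```

This fits a corrupted or desynchronized byte stream, but it does not by
itself say where the corruption happened.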
Hmmm... sounds like a link tried re-negotiating. Can you get on via
serial/console and
My guess is that the driver wandered across memory boundaries. This
stinks of a buffer problem to me. Typically, after this happens, I
can't log into the node via any interface, nor on console. It requires
an IPMI or physical reboot.
r...@lightning:~# ethtool eth0
-bash-3.2# ethtool eth1
Settings for eth1:
Supported ports: [ TP ]
Supported link modes: 10baseT/Half 10baseT/Full
100baseT/Half 100baseT/Full
1000baseT/Full
Supports auto-negotiation: Yes
Advertised link modes: 10baseT/Half 10baseT/Full
100baseT/Half 100baseT/Full
1000baseT/Full
Advertised auto-negotiation: Yes
Speed: 1000Mb/s
Duplex: Full
Port: Twisted Pair
PHYAD: 1
Transceiver: internal
Auto-negotiation: on
Supports Wake-on: g
Wake-on: d
Link detected: yes
You might want to
ethtool -s eth0 autoneg off
to force it not to renegotiate its speed. Also, look at
-bash-3.2# ethtool -A eth1 autoneg off
autoneg unmodified, ignoring
no pause parameters changed, aborting
r...@lightning:~# ethtool -g eth0
-bash-3.2# ethtool -g eth1
Ring parameters for eth1:
Pre-set maximums:
RX: 1020
RX Mini: 0
RX Jumbo: 4080
TX: 255
Current hardware settings:
RX: 255
RX Mini: 0
RX Jumbo: 765
TX: 255
See if you can do something like
ethtool -G eth0 rx-jumbo 100
if you have zero jumbo ring rx entries.
Doesn't look like this requires much change.
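
To make the comparison in the ring output above less eyeball-dependent,
here is a rough sketch that parses `ethtool -g` text and reports how far
each ring is below its pre-set maximum. The function names are
hypothetical, and the parser assumes the output layout shown in this
message; other ethtool versions may format it differently.

```python
def parse_ring_params(text):
    """Parse `ethtool -g` output into {"max": {...}, "current": {...}}."""
    sections = {"max": {}, "current": {}}
    active = None
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("Pre-set maximums"):
            active = "max"
        elif line.startswith("Current hardware settings"):
            active = "current"
        elif active and ":" in line:
            key, _, value = line.partition(":")
            value = value.strip()
            if value.isdigit():
                sections[active][key.strip()] = int(value)
    return sections


def headroom(params):
    """Entries still available per ring: pre-set maximum minus current."""
    return {
        name: params["max"][name] - current
        for name, current in params["current"].items()
        if name in params["max"]
    }


# Sample copied from the eth1 output in this message.
SAMPLE = """Ring parameters for eth1:
Pre-set maximums:
RX: 1020
RX Mini: 0
RX Jumbo: 4080
TX: 255
Current hardware settings:
RX: 255
RX Mini: 0
RX Jumbo: 765
TX: 255
"""

if __name__ == "__main__":
    for ring, spare in headroom(parse_ring_params(SAMPLE)).items():
        print(f"{ring}: {spare} entries below maximum")
```

On this node the jumbo RX ring is already non-zero (765 of 4080), so the
"zero jumbo ring entries" case does not apply here.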
Also, while I'm in the neighborhood, to respond to Mark's suggestions:
-bash-3.2# ethtool -k eth1
Offload parameters for eth1:
rx-checksumming: on
tx-checksumming: on
scatter-gather: on
tcp segmentation offload: on
udp fragmentation offload: off
generic segmentation offload: off
Hmmm. Might be worth changing TCP segmentation offload here.
-bash-3.2# ethtool -S eth1
NIC statistics:
rx_bytes: 43454
rx_error_bytes: 0
tx_bytes: 51103
tx_error_bytes: 0
rx_ucast_packets: 231
rx_mcast_packets: 0
rx_bcast_packets: 329
tx_ucast_packets: 250
tx_mcast_packets: 0
tx_bcast_packets: 4
tx_mac_errors: 0
tx_carrier_errors: 0
rx_crc_errors: 0
rx_align_errors: 0
tx_single_collisions: 0
tx_multi_collisions: 0
tx_deferred: 0
tx_excess_collisions: 0
tx_late_collisions: 0
tx_total_collisions: 0
rx_fragments: 0
rx_jabbers: 0
rx_undersize_packets: 0
rx_oversize_packets: 0
rx_64_byte_packets: 365
rx_65_to_127_byte_packets: 166
rx_128_to_255_byte_packets: 20
rx_256_to_511_byte_packets: 7
rx_512_to_1023_byte_packets: 1
rx_1024_to_1522_byte_packets: 1
rx_1523_to_9022_byte_packets: 0
tx_64_byte_packets: 42
tx_65_to_127_byte_packets: 84
tx_128_to_255_byte_packets: 31
tx_256_to_511_byte_packets: 97
tx_512_to_1023_byte_packets: 0
tx_1024_to_1522_byte_packets: 0
tx_1523_to_9022_byte_packets: 0
rx_xon_frames: 0
rx_xoff_frames: 0
tx_xon_frames: 0
tx_xoff_frames: 0
rx_mac_ctrl_frames: 0
rx_filtered_packets: 60
rx_discards: 0
rx_fw_discards: 0
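
With a listing that long, a quick filter helps. The sketch below parses
`ethtool -S` text, counts frames in the 1523-9022 byte buckets, and
lists any non-zero error-like counters. The function names and the
keyword list are my own choices for illustration, not anything bnx2 or
ethtool defines, and the sample is only a subset of the counters above.

```python
def parse_nic_stats(text):
    """Parse `ethtool -S` output lines of the form 'name: value' into a dict."""
    stats = {}
    for line in text.splitlines():
        key, sep, value = line.strip().partition(":")
        value = value.strip()
        if sep and value.isdigit():
            stats[key.strip()] = int(value)
    return stats


def jumbo_frames_seen(stats):
    """Total RX + TX frames counted in the 1523-9022 byte buckets."""
    return sum(
        count for name, count in stats.items()
        if name.endswith("_1523_to_9022_byte_packets")
    )


def nonzero_problem_counters(stats):
    """Counters whose names look error-related and whose value is non-zero."""
    keywords = ("error", "discard", "collision", "fragments",
                "jabbers", "undersize", "oversize")
    return {
        name: count for name, count in stats.items()
        if count and any(word in name for word in keywords)
    }


# Subset of the eth1 counters shown in this message.
SAMPLE = """NIC statistics:
rx_crc_errors: 0
rx_align_errors: 0
rx_64_byte_packets: 365
rx_1523_to_9022_byte_packets: 0
tx_1523_to_9022_byte_packets: 0
rx_discards: 0
rx_fw_discards: 0
"""

if __name__ == "__main__":
    stats = parse_nic_stats(SAMPLE)
    print("jumbo-sized frames seen:", jumbo_frames_seen(stats))
    print("non-zero problem counters:", nonzero_problem_counters(stats))
```

In this snapshot no jumbo-sized frames have crossed the interface at
all, so the clean error counters say little either way about how the NIC
behaves under jumbo traffic.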
-bash-3.2# ifconfig eth1
eth1 Link encap:Ethernet HWaddr 00:1E:C9:AC:27:FB
inet addr:192.168.200.154 Bcast:192.168.203.255
Mask:255.255.252.0
UP BROADCAST RUNNING MULTICAST MTU:9000 Metric:1
RX packets:574 errors:0 dropped:0 overruns:0 frame:0
TX packets:265 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:44422 (43.3 KiB) TX bytes:54606 (53.3 KiB)
Interrupt:16 Memory:f4000000-f4012100
I'm stumped and looking for causes and solutions. Yes, WRF as
compiled did run before the change to jumbo frames.
Do I reduce the size of the frames to something smaller, like 8800
bytes? 7500? 1500?
In the past I had heard that jumbo frames may work on Broadcom NICs
around 6000 byte length. We haven't tried this in a while ... YMMV.
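
To put rough numbers on the MTU question, here is a back-of-the-envelope
sketch of TCP payload efficiency on the wire for the sizes mentioned.
The assumptions are mine: IPv4 and TCP headers of 20 bytes each with no
options, and 38 bytes of per-frame Ethernet overhead (14 header, 4 FCS,
8 preamble, 12 inter-frame gap). It ignores CPU and interrupt load,
which is a large part of why people run jumbo frames in the first place.

```python
ETH_OVERHEAD = 14 + 4 + 8 + 12   # header + FCS + preamble/SFD + inter-frame gap
IP_TCP_HEADERS = 20 + 20         # IPv4 + TCP, assuming no options


def tcp_payload_efficiency(mtu):
    """Fraction of wire bytes carrying TCP payload at a given IP MTU."""
    return (mtu - IP_TCP_HEADERS) / (mtu + ETH_OVERHEAD)


if __name__ == "__main__":
    for mtu in (1500, 6000, 7500, 8800, 9000):
        print(f"MTU {mtu:>5}: {tcp_payload_efficiency(mtu):.4f}")
```

Under these assumptions, going from 9000 down to 6000 costs well under
one percentage point of wire efficiency, so a smaller jumbo size looks
like a cheap experiment.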
I'm not completely out of ideas but stumped.
Thanks, gerry
--
Gerry Creager -- gerry.crea...@tamu.edu
Texas Mesonet -- AATLT, Texas A&M University
Cell: 979.229.5301 Office: 979.458.4020 FAX: 979.862.3983
Office: 1700 Research Parkway Ste 160, TAMU, College Station, TX 77843
_______________________________________________
Beowulf mailing list, Beowulf@beowulf.org
To change your subscription (digest mode or unsubscribe) visit
http://www.beowulf.org/mailman/listinfo/beowulf