Hi Jack,
Just a follow-up to the email below. I have now seen what appears to be the same problem on RELENG_8, but on a different NIC and with VLANs, so I am not sure whether this is a general em problem, a problem specific to some em NICs, or a TSO problem in general. The issue seemed to be triggered when I added a new vlan based on

e...@pci0:14:0:0: class=0x020000 card=0x109a15d9 chip=0x109a8086 rev=0x00 hdr=0x00
    vendor     = 'Intel Corporation'
    device     = 'Intel PRO/1000 PL Network Adaptor (82573L)'
    class      = network
    subclass   = ethernet
    cap 01[c8] = powerspec 2  supports D0 D3  current D0
    cap 05[d0] = MSI supports 1 message, 64 bit enabled with 1 message
    cap 10[e0] = PCI-Express 1 endpoint max data 128(256) link x1(x1)

pci14: <ACPI PCI bus> on pcib5
em3: <Intel(R) PRO/1000 Network Connection 7.0.5> port 0x6000-0x601f mem 0xe8300000-0xe831ffff irq 17 at device 0.0 on pci14
em3: Using MSI interrupt
em3: [FILTER]
em3: Ethernet address: 00:30:48:9f:eb:81

em3: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=2098<VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,WOL_MAGIC>
        ether 00:30:48:9f:eb:81
        inet 10.255.255.254 netmask 0xfffffffc broadcast 10.255.255.255
        media: Ethernet autoselect (1000baseT <full-duplex>)
        status: active

I had to disable tso, rxcsum and txcsum in order to see the devices on the other side of the two vlans trunked off em3. Unfortunately, the other sides were switches 100km and 500km away, so I didn't have any tcpdump capabilities to diagnose the issue.

I had already created one vlan off this NIC and all was fine. A few weeks later, I added a new one and I could no longer telnet into the remote switches from the local machine. But I could telnet into the switches from machines other than the problem box. Hence, it would appear to be a general TSO issue, no?

Remembering the issue from my earlier email, I disabled tso on the NIC (I didn't disable net.inet.tcp.tso, as I forgot about that). Still nothing: I could always ping the remote devices, but no tcp services worked. Then I disabled rxcsum and txcsum as well, and I could then telnet into the remote devices.
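For reference, the offloads mentioned above are toggled per interface with ifconfig flags such as -tso, -rxcsum and -txcsum, and the options= line shows which ones remain active. The following is only a minimal sketch, assuming Python is available on the box; the helper names parse_options and enabled_offloads are hypothetical. It reads the capability names out of an options= line like the ones quoted in this thread and reports which of the three offloads are still enabled.

```python
import re

# Offload capability names as they appear in the ifconfig output quoted in this thread.
OFFLOADS = ("RXCSUM", "TXCSUM", "TSO4")


def parse_options(ifconfig_text):
    """Return the set of capability names from the first options=...<...> line."""
    match = re.search(r"options=[0-9a-fA-F]+<([^>]*)>", ifconfig_text)
    if match is None:
        return set()
    return {name for name in match.group(1).split(",") if name}


def enabled_offloads(ifconfig_text):
    """Return the offloads from OFFLOADS that are still enabled, in a fixed order."""
    caps = parse_options(ifconfig_text)
    return [name for name in OFFLOADS if name in caps]


# The two options= lines quoted in this thread (em3 after the workaround, em2 before).
em3 = "options=2098<VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,WOL_MAGIC>"
em2 = "options=219b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,TSO4,WOL_MAGIC>"

print("em3:", enabled_offloads(em3))
print("em2:", enabled_offloads(em2))
```

Run against the em3 output above, it reports no remaining offloads, which matches the state in which telnet started working again.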

This newly observed issue was from a buildworld on Mon Jun 14 11:29:12 EDT 2010.

I will try to recreate the issue locally again to see if I can trigger the problem on demand. Any thoughts on what it might be? Perhaps an issue specific to certain em NICs?

        ---Mike


At 04:31 PM 6/10/2010, Mike Tancsa wrote:
Hi Jack,
        I am seeing some issues on RELENG_7 with a specific em nic

e...@pci0:13:0:0: class=0x020000 card=0x108c15d9 chip=0x108c8086 rev=0x03 hdr=0x00
    vendor     = 'Intel Corporation'
    device     = 'Intel Corporation 82573E Gigabit Ethernet Controller (Copper) (82573E)'
    class      = network
    subclass   = ethernet
    cap 01[c8] = powerspec 2  supports D0 D3  current D0
    cap 05[d0] = MSI supports 1 message, 64 bit enabled with 1 message
    cap 10[e0] = PCI-Express 1 endpoint max data 128(256) link x1(x1)

If I disable tso, I am not able to make a tcp connection into the host

eg
0[psbgate1]# ifconfig em2
em2: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=219b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,TSO4,WOL_MAGIC>
        ether 00:30:48:9f:eb:80
        inet 192.168.128.200 netmask 0xfffffff0 broadcast 192.168.128.207
        media: Ethernet autoselect (100baseTX <full-duplex>)
        status: active
0[psbgate1]# ifconfig em2 -tso
0[psbgate1]#


Looking at the pcap, the checksum is bad on the syn-ack. If I re-enable tso, it seems to be ok

16:18:01.113297 IP (tos 0x10, ttl 64, id 6339, offset 0, flags [DF], proto TCP (6), length 60) 192.168.128.196.54172 > 192.168.128.200.22: S, cksum 0x4e79 (correct), 3313156149:3313156149(0) win 65535 <mss 1460,nop,wscale 3,sackOK,timestamp 3376174416 0>
16:18:01.123676 IP (tos 0x0, ttl 64, id 3311, offset 0, flags [DF], proto TCP (6), length 60) 192.168.128.200.22 > 192.168.128.196.54172: S, cksum 0x81c9 (incorrect (-> 0x51f2), 1373042663:1373042663(0) ack 3313156150 win 65535 <mss 1460,nop,wscale 3,sackOK,timestamp 1251567646 3376174416>
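As background on what tcpdump is checking when it flags the syn-ack as incorrect: the TCP checksum is the 16-bit one's-complement Internet checksum (RFC 1071) taken over an IPv4 pseudo-header plus the TCP segment. The sketch below illustrates that calculation only. The capture above does not include raw packet bytes, so it does not reproduce the 0x51f2 value; the function names and the sample header are hypothetical.

```python
import socket
import struct


def internet_checksum(data: bytes) -> int:
    """One's-complement sum of 16-bit big-endian words, complemented (RFC 1071)."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = 0
    for (word,) in struct.iter_unpack("!H", data):
        total += word
    while total >> 16:  # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def tcp_checksum(src_ip: str, dst_ip: str, segment: bytes) -> int:
    """Checksum over the IPv4 pseudo-header followed by the TCP segment."""
    pseudo = (
        socket.inet_aton(src_ip)
        + socket.inet_aton(dst_ip)
        + struct.pack("!BBH", 0, socket.IPPROTO_TCP, len(segment))
    )
    return internet_checksum(pseudo + segment)


# Hypothetical bare 20-byte SYN-ACK header (no options), checksum field zeroed.
src, dst = "192.168.128.200", "192.168.128.196"
header = struct.pack("!HHIIBBHHH", 22, 54172, 1373042663, 3313156150,
                     5 << 4, 0x12, 65535, 0, 0)
csum = tcp_checksum(src, dst, header)

# Write the checksum into bytes 16-17; re-running the sum over a correctly
# checksummed segment yields 0, which is how a receiver validates it.
filled = header[:16] + struct.pack("!H", csum) + header[18:]
print(hex(csum), tcp_checksum(src, dst, filled))
```

A segment whose stored checksum does not bring the result to 0 is what tcpdump reports as "incorrect", along with the value it would have expected.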


em2: <Intel(R) PRO/1000 Network Connection 7.0.5> port 0x5000-0x501f mem 0xe8200000-0xe821ffff irq 16 at device 0.0 on pci13
em2: Using MSI interrupt
em2: [FILTER]
em2: Ethernet address: 00:30:48:9f:eb:80
pcib5: <ACPI PCI-PCI bridge> irq 16 at device 28.5 on pci0
pci14: <ACPI PCI bus> on pcib5
em3: <Intel(R) PRO/1000 Network Connection 7.0.5> port 0x6000-0x601f mem 0xe8300000-0xe831ffff irq 17 at device 0.0 on pci14
em3: Using MSI interrupt
em3: [FILTER]
em3: Ethernet address: 00:30:48:9f:eb:81


Also there is still the issue with

http://lists.freebsd.org/pipermail/freebsd-stable/2009-November/052842.html

in RELENG_7 ?

        ---Mike


--------------------------------------------------------------------
Mike Tancsa,                                      tel +1 519 651 3400
Sentex Communications,                            [email protected]
Providing Internet since 1994                    www.sentex.net
Cambridge, Ontario Canada                         www.sentex.net/mike

_______________________________________________
[email protected] mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "[email protected]"
