Apparently-bad checksums on outgoing traffic in tcpdump/wireshark
captures are normal when hardware checksum offload is enabled.  The NIC
fills in the correct value only after the data has been copied to it,
which is after the capture has taken its snapshot of the buffer, with
whatever garbage was in the checksum field at that point.  See the last
point in http://docs.gz.ro/node/282.
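
To make concrete what the "correct" value in a capture refers to, here
is a minimal Python sketch of the standard TCP checksum (the RFC 1071
ones'-complement sum over an IPv4 pseudo-header plus the TCP segment).
The function names are mine, purely for illustration; this is not code
from tcpdump, the kernel, or the e1000e driver.

```python
import socket
import struct


def internet_checksum(data: bytes) -> int:
    """Return the RFC 1071 checksum of data (16-bit big-endian words)."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with one zero byte
    total = 0
    for (word,) in struct.iter_unpack("!H", data):
        total += word
    # Fold the carries back into the low 16 bits (end-around carry).
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def tcp_checksum(src_ip: str, dst_ip: str, segment: bytes) -> int:
    """Checksum of a TCP segment (header + payload) sent over IPv4.

    Pass the segment with its checksum field zeroed to get the value
    that belongs in that field.  Pass it with the field filled in and
    the result is 0 exactly when the checksum verifies.
    """
    pseudo_header = (
        socket.inet_aton(src_ip)
        + socket.inet_aton(dst_ip)
        + struct.pack("!BBH", 0, socket.IPPROTO_TCP, len(segment))
    )
    return internet_checksum(pseudo_header + segment)
```

With offload on, the host is not the one that finishes this
computation, so a capture taken on the sending host can show a
mismatch even when the NIC later puts the right value on the wire.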

Since disabling checksum offload actually fixed your problem, maybe your
NIC really was filling in checksums incorrectly (or offload wasn't
really enabled on the NIC, so neither the kernel nor the NIC was
generating correct checksums).

Anyway, I think that to debug this properly you should have captured
the packets on a different computer, so you could be sure you were
seeing what actually went out over the wire with and without HW
checksum offload enabled.
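
If you want to double-check a packet from such a capture by hand, the
following Python sketch verifies the TCP checksum of one raw IPv4
packet.  It assumes you have already extracted the packet bytes
starting at the IP header (no Ethernet header); the helper names are
hypothetical and it handles plain IPv4 + TCP only.

```python
import struct


def ones_complement_sum(data: bytes) -> int:
    """16-bit ones'-complement sum of data, with end-around carry."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with one zero byte
    total = sum(word for (word,) in struct.iter_unpack("!H", data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return total


def tcp_checksum_ok(packet: bytes) -> bool:
    """Return True if the TCP checksum in a raw IPv4 packet verifies."""
    header_len = (packet[0] & 0x0F) * 4          # IHL, in bytes
    total_len = struct.unpack("!H", packet[2:4])[0]
    if packet[9] != 6:                           # IP protocol 6 = TCP
        raise ValueError("not a TCP packet")
    segment = packet[header_len:total_len]
    # Pseudo-header: source and destination address, zero byte,
    # protocol number, TCP segment length.
    pseudo_header = packet[12:20] + struct.pack("!BBH", 0, 6, len(segment))
    # Summing everything, including the transmitted checksum, gives
    # 0xFFFF when the checksum is valid.
    return ones_complement_sum(pseudo_header + segment) == 0xFFFF
```

Run on packets captured at the receiver (or on a mirror port), this
tells you what the NIC really sent, which a capture on the sending
host cannot.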

TL;DR: you probably had a real issue (since disabling offload got your
ssh working), but your debugging / verification technique was flawed
and would have shown a problem even on working hardware.

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux-lts-trusty in Ubuntu.
https://bugs.launchpad.net/bugs/1251464

Title:
  Bad TCP/UDP checksum on e1000e with tx-checksumming on

Status in linux-lts-trusty package in Ubuntu:
  Incomplete

Bug description:
  This machine is configured as a KVM virtual hosting machine running
  Ubuntu 12.04.3 LTS (3.5.0-41-generic) with multiple bridged ethernet
  ports. I found this issue on the eth1/br1 interface while configuring
  nagios3 plugins. I discovered that I could not connect to a mysql
  database server for checking by nagios3 although I could ping it from
  the same box. Further investigation showed I also could not traceroute
  to it or ssh to it from this machine but could from any other physical
  box we have.

  Digging deeper I discovered that outbound TCP/UDP packets from this
  machine had incorrect checksums.

  Example captured with tcpdump of an attempted ssh connection after
  'ethtool -K br1 tx on' (the connection never completes):

  14:47:44.923128 IP (tos 0x0, ttl  62, id 843, offset 0, flags [DF],
  proto: TCP (6), length: 60) 10.32.1.9.37409 > 10.96.0.10.ssh: S, cksum
  0x15c1 (incorrect (-> 0x2252), 1028975122:1028975122(0) win 14600 <mss
  1460,sackOK,timestamp 874244441 0,nop,wscale 7>

  Then, after running 'ethtool -K br1 tx off', I got this capture (and
  the SSH connection completed this time):

  14:50:31.061165 IP (tos 0x0, ttl  62, id 48708, offset 0, flags [DF],
  proto: TCP (6), length: 60) 10.32.1.9.37412 > 10.96.0.10.ssh: S, cksum
  0x0435 (correct), 1835349468:1835349468(0) win 14600 <mss
  1460,sackOK,timestamp 874285976 0,nop,wscale 7>

  So disabling TX checksum offloading clearly worked around the problem.

  Testing also showed that the problem persists into hosted Ubuntu
  virtual machines unless they also have tx-checksumming turned off
  individually. The problem did not appear to affect any of my hosted
  CentOS5 virtual machines (running kernel 2.6.18-348.16.1.el5) - only
  my Ubuntu virtual machines (Ubuntu 9.10 running 2.6.31-23-server and
  Ubuntu 12.04.3 LTS running both 3.8.0-33-generic and
  3.5.0-43-generic).

  lsb_release -rd
  Description:  Ubuntu 12.04.3 LTS
  Release:      12.04

  Expected behavior: Correct TCP/UDP checksums when using TX checksum
  offloading.

  What happened: Incorrect TCP/UDP checksum when using TX checksum
  offloading.

  ProblemType: Bug
  DistroRelease: Ubuntu 12.04
  Package: linux-image-3.5.0-41-generic 3.5.0-41.64~precise1
  ProcVersionSignature: Ubuntu 3.5.0-41.64~precise1-generic 3.5.7.21
  Uname: Linux 3.5.0-41-generic x86_64
  AlsaDevices:
   total 0
   crw-rw---T 1 root audio 116,  1 Oct  5 04:15 seq
   crw-rw---T 1 root audio 116, 33 Oct  5 04:15 timer
  AplayDevices: Error: [Errno 2] No such file or directory
  ApportVersion: 2.0.1-0ubuntu17.5
  Architecture: amd64
  ArecordDevices: Error: [Errno 2] No such file or directory
  AudioDevicesInUse: Error: command ['fuser', '-v', '/dev/snd/seq', '/dev/snd/timer'] failed with exit code 1:
  CRDA: Error: [Errno 2] No such file or directory
  Date: Thu Nov 14 14:23:17 2013
  HibernationDevice: RESUME=UUID=3d0cdb3e-8b12-4a48-88ca-0b446014db10
  InstallationMedia: Ubuntu-Server 12.04.2 LTS "Precise Pangolin" - Release amd64 (20130214)
  Lsusb:
   Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
   Bus 002 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub
   Bus 003 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub
   Bus 004 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub
  MachineType: Supermicro X7DB8
  MarkForUpload: True
  PciMultimedia:
   
  ProcEnviron:
   TERM=xterm
   PATH=(custom, no user)
   LANG=en_US.UTF-8
   SHELL=/bin/bash
  ProcFB: 0 radeondrmfb
  ProcKernelCmdLine: BOOT_IMAGE=/vmlinuz-3.5.0-41-generic root=/dev/mapper/pbox9d0-pbox9root ro
  RelatedPackageVersions:
   linux-restricted-modules-3.5.0-41-generic N/A
   linux-backports-modules-3.5.0-41-generic  N/A
   linux-firmware                            1.79.6
  RfKill: Error: [Errno 2] No such file or directory
  SourcePackage: linux-lts-quantal
  UpgradeStatus: No upgrade log present (probably fresh install)
  dmi.bios.date: 08/13/2007
  dmi.bios.vendor: Phoenix Technologies LTD
  dmi.bios.version: 6.00
  dmi.board.name: X7DB8
  dmi.board.vendor: Supermicro
  dmi.board.version: PCB Version
  dmi.chassis.type: 1
  dmi.chassis.vendor: Supermicro
  dmi.chassis.version: 0123456789
  dmi.modalias: dmi:bvnPhoenixTechnologiesLTD:bvr6.00:bd08/13/2007:svnSupermicro:pnX7DB8:pvr0123456789:rvnSupermicro:rnX7DB8:rvrPCBVersion:cvnSupermicro:ct1:cvr0123456789:
  dmi.product.name: X7DB8
  dmi.product.version: 0123456789
  dmi.sys.vendor: Supermicro

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux-lts-trusty/+bug/1251464/+subscriptions

-- 
Mailing list: https://launchpad.net/~kernel-packages
Post to     : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp
