Public bug reported:

SRU Justification:

Impact:

Configuring a 4.8 kernel with iptables MASQUERADE over virtio_net
causes packets to be dropped by the hypervisor (host) because improper
flags are set based on the IP checksum state of the packet.  The
machine performing MASQUERADE is the one affected by the bug.

Issue was introduced by

commit fd2a0437dc33b6425cabf74cc7fc7fdba6d5903b
Author: Mike Rapoport <r...@linux.vnet.ibm.com>
Date: Wed Jun 8 16:09:18 2016 +0300

    virtio_net: introduce virtio_net_hdr_{from,to}_skb

which first appears in v4.8-rc1

Fix:

Fixed upstream by

3e9e40e74753 virtio_net: Simplify call sites for virtio_net_hdr_{from, to}_skb().
501db511397f virtio: don't set VIRTIO_NET_HDR_F_DATA_VALID on xmit
6391a4481ba0 virtio-net: restore VIRTIO_HDR_F_DATA_VALID on receiving

3e9e40e74753 first appears in v4.9-rc5 (and is a prerequisite only), the
others in v4.10-rc4.
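
As a rough first check of whether a machine is worth testing, the
upstream version range above can be turned into a small shell helper.
This is only a sketch: the function name in_affected_range is made up
for illustration, and it looks solely at the major.minor number of the
kernel release string.  Stable and distro kernels may already carry the
backported fixes, so a match means "possibly affected", not "confirmed".

```shell
#!/bin/sh
# Hypothetical helper: succeed if the given kernel release string
# (default: the running kernel) is a 4.8 or 4.9 kernel, i.e. at or
# after v4.8-rc1 and before the v4.10 series that contains the fixes.
# Assumption: only major.minor is inspected; backports are not detected.
in_affected_range() {
    ver=${1:-$(uname -r)}
    major=${ver%%.*}            # text before the first dot
    rest=${ver#*.}              # text after the first dot
    minor=${rest%%[!0-9]*}      # leading digits of the remainder
    case "$major:$minor" in
        4:8|4:9) return 0 ;;
        *)       return 1 ;;
    esac
}
```

For example, "in_affected_range && echo 'possibly affected'" checks the
running kernel, and "in_affected_range 4.10.0-19-generic" returns
non-zero.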

Testcase:

Reproduction to date has been on GCE, although in principle it should
manifest on any suitable topology using virtio_net.  There is a
dependency on the forwarded packets having skb->ip_summed ==
CHECKSUM_UNNECESSARY; not all incoming devices will have this property.

On GCE, the following steps will induce the issue on an affected kernel:

Set up a network:

% gcloud compute networks create nat-network --mode legacy \
    --range 10.240.0.0/16
% gcloud compute firewall-rules create nat-network-allow-ssh \
    --allow tcp:22 --network nat-network
% gcloud compute firewall-rules create nat-network-allow-internal \
    --allow tcp:1-65535,udp:1-65535,icmp --source-ranges 10.240.0.0/16 \
    --network nat-network

Set up an Ubuntu 16.04 NAT VM:

% gcloud compute instances create nat-gateway-16 --zone us-central1-a \
    --network nat-network --can-ip-forward \
    --image-family ubuntu-1604-lts --image-project ubuntu-os-cloud \
    --tags nat \
    --metadata startup-script='sysctl -w net.ipv4.ip_forward=1 ; iptables -t nat -A POSTROUTING -o ens4 -j MASQUERADE'

Set up a route to use the 16.04 NAT:

% gcloud compute routes create no-ip-internet-route \
    --network nat-network --destination-range 0.0.0.0/0 \
    --next-hop-instance nat-gateway-16 \
    --next-hop-instance-zone us-central1-a --tags no-ip --priority 800

Set up a simple test VM without an external IP address:

% gcloud compute instances create nat-client --zone us-central1-a \
    --network nat-network --no-address \
    --image-family ubuntu-1604-lts --image-project ubuntu-os-cloud \
    --tags no-ip \
    --metadata startup-script='wget --timeout=5 https://github.com/GoogleCloudPlatform/compute-image-packages/archive/20170327.tar.gz'

Wait for it to boot (about 30 seconds).

Look for serial port output:

% gcloud compute instances get-serial-port-output nat-client \
    --zone us-central1-a | grep startup-script

You will see that the connection to github.com never succeeds: wget
gets stuck after "Resolving github.com (github.com)... 192.30.253.112,
192.30.253.113" and eventually times out.  (Ignore the previous attempt
in the log from the successful 14.04-based NAT.)
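
To avoid judging the serial log by eye, the check can be scripted.  The
following is a minimal sketch, not part of the original reproduction:
classify_startup_log is a hypothetical name, and the patterns it looks
for ("saved", "failed", "timed out") are assumptions about the messages
wget typically prints.  It judges only the most recent matching
startup-script line, so an earlier successful attempt in the same log
does not mask a later failure.

```shell
#!/bin/sh
# Hypothetical helper: read serial console output on stdin and print
# "pass", "fail" or "unknown" for the most recent download attempt.
# Assumption: wget reports success with a line containing "saved" and
# failure with a line containing "failed" or "timed out".
classify_startup_log() {
    # Keep startup-script lines, then take the last one that mentions
    # either outcome.  "|| true" keeps this safe under "set -e".
    last=$(grep 'startup-script' \
        | grep -E 'saved|failed|timed out' \
        | tail -n 1) || true
    case "$last" in
        '')      echo unknown ;;
        *saved*) echo pass ;;
        *)       echo fail ;;
    esac
}
```

It is meant to be fed from the command shown above, for example
"gcloud compute instances get-serial-port-output nat-client --zone
us-central1-a | classify_startup_log".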

Repeat the test by resetting the test client instance, then watch the
serial output:

% gcloud compute instances reset nat-client --zone us-central1-a

Wait a minute or so for the new boot, then check the serial port output
as above.

** Affects: linux (Ubuntu)
     Importance: Undecided
     Assignee: Jay Vosburgh (jvosburgh)
         Status: New

** Changed in: linux (Ubuntu)
     Assignee: (unassigned) => Jay Vosburgh (jvosburgh)

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1683947

Title:
  ubuntu 4.8 kernel, virtio_net error causes NAT packets to be lost
