Hi

just a few updates on our problem. I tried a different setup this time: no iperf on the OpenBSD router, only IP forwarding.

I have one Linux box (kernel 2.6) as a load generator on each of the two VLANs routed by the OpenBSD machine, with no process running on them except top to monitor interrupts and load.

I modified iperf to generate very small packets: this time I want to pin down the bottleneck on OpenBSD.

The most I can push between the two Linux boxes before the router hits 100% interrupt load is about 140 Kpps (only ~5 MB/s):
[  4]  1.0- 2.0 sec  5.45 MBytes  45.7 Mbits/sec  0.009 ms    0/142862 (0%)
[  4]  2.0- 3.0 sec  5.45 MBytes  45.7 Mbits/sec  0.010 ms    0/142860 (0%)
[  4]  3.0- 4.0 sec  5.45 MBytes  45.7 Mbits/sec  0.009 ms    0/142841 (0%)
=> more packets/s from the sender would mean dropped packets on the router and loss at the receiver.
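As a sanity check on those numbers (the iperf command below is only an illustrative sketch; the -l value is an assumption, not the exact payload size I used):

```shell
# Hypothetical small-packet run (iperf 1.7's -l option sets the UDP
# payload length; 40 bytes is an assumed value for illustration):
#   ./iperf -c <receiver> -u -b 100M -l 40

# Back-of-the-envelope check: 5.45 MBytes/s (~5714739 bytes/s) at
# ~142860 pps gives the payload size per datagram:
echo $((5714739 / 142860))
```

That works out to about 40 bytes per datagram, consistent with 45.7 Mbit/s at 140 Kpps.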

And to demonstrate the maximum capability of the load generators, here is the same single-threaded iperf test between the two Linux boxes on the same VLAN (no router in between):
[  5]  1.0- 2.0 sec  12.6 MBytes    105 Mbits/sec  0.003 ms    0/329213 (0%)
[  5]  2.0- 3.0 sec  12.6 MBytes    106 Mbits/sec  0.003 ms    0/329980 (0%)
[  5]  3.0- 4.0 sec  12.6 MBytes    106 Mbits/sec  0.003 ms    0/330488 (0%)
330 Kpps without loss and one busy CPU on the Linux side. That is still more than twice the packet rate the router can sustain.

(And just for fun: despite the announced capability of FreeBSD to route 1 Mpps, our 5.3 on a dual Opteron is only able to route ~140 Kpps as well.)


So the conclusion may be that these BSD boxes are limited by the ability of the OS to handle the interrupt load...
What do you think about this?
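For what it's worth, here is how one can watch the interrupt load on the router itself with stock OpenBSD tools (just a diagnostic sketch):

```shell
# Cumulative per-device interrupt counts since boot:
vmstat -i

# Live CPU breakdown (including interrupt percentage) while a test runs:
systat vmstat
```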

Frederic

Frederic BRET wrote:

Hi all,

This is my first post to this list. I'm trying to understand why our OpenBSD PF router isn't able to cope with the gigabit speeds we need...

I have two Dell 1750 single-Xeon 2.8 GHz machines. The first is our production router, running OpenBSD 3.4-beta with PF for two years now; the second is a fresh OpenBSD 3.7 with the stock GENERIC kernel. The ultimate goal is to build a CARP dual-router setup with the two machines.

The problem is that neither machine is able to route at speeds higher than ~350 Mbit/s, even without PF (which could slow things down, though I doubt it).

In order to validate the capacity of the server to cope with simultaneous up/down gigabit streams, I've done several tests.

- First, validate the external test machine and the network.
Here is a simultaneous (-d) iperf TCP test between two Sun V40Zs (SLES9 with Broadcom 5703). Between them sits an HP ProCurve 2824 gigabit switch with full duplex enabled and properly negotiated on all ports:
ROOT:Linux:/opt/iperf2/bin > ./iperf -i 1 -c <Linux iperf server address> -d -w 256k
../..
[  4]  0.0-10.0 sec  1.01 GBytes    864 Mbits/sec
[  5]  0.0-10.0 sec  1.01 GBytes    865 Mbits/sec
=> The network AND the V40Zs are capable of near full-duplex gigabit in both directions. OK.

- This being said, let's do the same thing between a V40Z and a Dell 1750 (OpenBSD 3.7 with Broadcom 5704).
First, a non-simultaneous (-r) TCP test between the V40Z and a 1750:
ROOT:Linux:/opt/iperf2/bin > ./iperf -i 1 -c <OpenBSD iperf server address> -r -w 256k
../..
[  4]  0.0-10.0 sec  1.09 GBytes    935 Mbits/sec
[  4]  0.0-10.0 sec  1.09 GBytes    938 Mbits/sec
=> More than 1 GB is transferred in 10 s, first in one direction and then in the other. The unidirectional bandwidth comes close to the 1 Gbit/s line rate, no problem.

- Now let's run both directions simultaneously (-d) between the V40Z and the Dell 1750, like the first iperf test between the two Linux boxes:
ROOT:Linux:/opt/iperf2/bin > ./iperf -i 1 -c <OpenBSD iperf server address> -d -w 256k
../..
[  4]  0.0-10.0 sec    403 MBytes    338 Mbits/sec
[  5]  0.0-10.0 sec  1.02 GBytes    876 Mbits/sec
=> The OpenBSD box is never able to receive more than ~330 Mbit/s while it is transmitting on the wire at the same time. The behaviour is consistent across every run.

- Seeing this behaviour, let's try UDP transfers to determine the speed at which the problem begins. I set the Linux client to send and receive at ~46 Mbit/s:
ROOT:Linux: > ./iperf -i 1 -w 256k -c <OpenBSD iperf server address> -b 46M -d -u

On the OpenBSD box we can see this:
ROOT:OpenBSD: > ./iperf -i 1 -s -u -w 256k
../..
[ ID] Interval       Transfer     Bandwidth       Jitter    Lost/Total Datagrams
[  5]  0.0-10.0 sec  54.5 MBytes  45.7 Mbits/sec  0.264 ms    372/39217 (0.95%)
[  7]  0.0-10.0 sec  55.0 MBytes  46.1 Mbits/sec  0.002 ms    0/39217 (0%)
=> We begin to lose inbound packets on the OpenBSD box at as little as 46 Mbit/s, while outbound packets still go out without problems.
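The loss column iperf prints is simply lost/total; checking the 0.95% above:

```shell
# 372 lost out of 39217 datagrams, in thousandths of a percent
# (948 -> 0.948%, which iperf rounds to 0.95%):
echo $((372 * 100000 / 39217))
```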

Of course that's only the beginning, because here is what we get with a stream of 800 Mbit/s:
ROOT:ob35bckp:/root/compile/iperf-1.7.0 > ./iperf -i 1 -s -u -w 256k
../..
[ ID] Interval       Transfer     Bandwidth       Jitter    Lost/Total Datagrams
[  7]  0.0-10.0 sec   976 MBytes   819 Mbits/sec  0.013 ms    0/696200 (0%)
[  5]  0.0-10.3 sec  79.4 MBytes  65.0 Mbits/sec  14.982 ms  657633/714260 (92%)
Now it's dramatic: the packet loss is 92%!

I suspect the problem is that no buffers are available for the network card: it receives packets but has nowhere to put them...

Here are a few elements to help the analysis.

During the iperf test, here's what netstat is saying :
ROOT:OpenBSD: > netstat -m
1263 mbufs in use:
      1142 mbufs allocated to data
      3 mbufs allocated to packet headers
      118 mbufs allocated to socket names and addresses
627/670/6144 mbuf clusters in use (current/peak/max)
1676 Kbytes allocated to network (93% in use)

And now, here is what it's saying while idle :
ROOT:OpenBSD: >netstat -m
1033 mbufs in use:
      1027 mbufs allocated to data
      3 mbufs allocated to packet headers
      3 mbufs allocated to socket names and addresses
512/724/6144 mbuf clusters in use (current/peak/max)
1784 Kbytes allocated to network (71% in use)

The OpenBSD box has had plenty of memory to handle thousands of PF states for many years now, as you can see in the dmesg of the 3.7 machine:
OpenBSD 3.7 (GENERIC) #50: Sun Mar 20 00:01:57 MST 2005
  [EMAIL PROTECTED]:/usr/src/sys/arch/i386/compile/GENERIC
cpu0: Intel(R) Xeon(TM) CPU 2.80GHz ("GenuineIntel" 686-class) 2.79 GHz
cpu0: FPU,V86,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,SBF,CNXT-ID
real mem  = 1073160192 (1048008K)
avail mem = 972750848 (949952K)
using 4278 buffers containing 53760000 bytes (52500K) of memory

The TCP and UDP buffers are no longer at their stock values:
ROOT:OpenBSD: > sysctl -a | grep space
net.inet.tcp.recvspace=131072
net.inet.tcp.sendspace=131072
net.inet.udp.recvspace=131072
net.inet.udp.sendspace=131072

NMBCLUSTERS has disappeared, but what I guess to be its successor (kern.maxclusters?) doesn't improve anything when increased.
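For completeness, this is the kind of runtime tuning I mean (example values only; kern.maxclusters being the NMBCLUSTERS successor is my guess, and none of this helped in my case):

```shell
# Enlarge the socket buffers and the mbuf cluster ceiling
# (illustrative values, to be run as root on the OpenBSD box):
sysctl net.inet.udp.recvspace=262144
sysctl net.inet.udp.sendspace=262144
sysctl kern.maxclusters=8192
```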

Following Henning Brauer's advice, I tried changing IFQ_MAXLEN in sys/net/if.c. Unfortunately it didn't fix the problem. With both the stable and current kernel sources I tried many values for IFQ_MAXLEN, up to 2000, without success. I still see drops at as little as 46 Mbit/s with UDP, and accordingly the drop counter from netstat -s increases:
      810148 datagrams received
      710297 dropped due to full socket buffers
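Those two counters say nearly everything arriving is being thrown away:

```shell
# 710297 of 810148 received datagrams dropped due to full socket
# buffers, i.e. about 87%:
echo $((710297 * 100 / 810148))
```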

One more test, to rule out different bge behaviour with and without jumbo frames: I have two interfaces on the Dell: bge0 (no IP, but MTU 8000 to allow full 1500-byte frames on the vlan interface), which carries one vlan interface, and bge1 configured natively (untagged) on another LAN. Whichever interface I use, the drops are the same...

bge0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> mtu 8000
       address: 00:0d:56:fd:58:cd
       media: Ethernet 1000baseT full-duplex
       status: active
       inet6 fe80::20d:56ff:fefd:58cd%bge0 prefixlen 64 scopeid 0x1
vlan10: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> mtu 1500
       address: 00:0d:56:fd:58:cd
       vlan: 10 parent interface: bge0
       inet6 fe80::20d:56ff:fefd:58cd%vlan10 prefixlen 64 scopeid 0x7
       inet 10.10.0.254 netmask 0xffff0000 broadcast 10.10.255.255

bge1: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> mtu 1500
       address: 00:0d:56:fd:58:ce
       media: Ethernet autoselect (1000baseT full-duplex)
       status: active
       inet 10.1.0.254 netmask 0xffff0000 broadcast 10.1.255.255
       inet6 fe80::20d:56ff:fefd:58ce%bge1 prefixlen 64 scopeid 0x2

The fact is, the card has no problem receiving a gigabit stream as long as it isn't transmitting at the same time...


That said, I really don't know what to try any more...

Has any of you encountered and solved this problem before? I'm pretty sure I'm not the only one to have it...

Thanks in advance!

Frederic



--
__________________________________________________
Frederic BRET - Universite de La Rochelle Centre de Ressources Informatiques
Technoforum - Avenue Einstein     Tel : 0546458214
17042 La Rochelle Cedex - France  Fax : 0546458245
__________________________________________________
