I have two OpenBSD 4.6-stable boxes that show network performance
problems whenever ALTQ is in use. The tests are run while there is
virtually no other load on the boxes. I'm trying to find out whether
this is a limitation of ALTQ throughput or something in my
configuration. I have tried changing qlimit and tbrsize, and switching
from cbq to hfsc, with no success.
The boxes use quad-port em(4) network cards on a PCIe x8 bus and have a
relatively basic PF ruleset applied to them. When I add a basic ALTQ
rule, such as:
altq on trunk1 cbq bandwidth 2Gb queue { INTERNAL }
queue INTERNAL bandwidth 2Gb cbq(default)
iperf throughput drops by nearly 400Mbps. I have also tried this on
em9, which sits on a /30 shared between the two boxes via a crossover
cable, with the same results.
If it matters, trunk1 contains em2 and em3 with trunkproto lacp, and
uplinks to a Cisco ME3400 configured with 'channel-group 2 mode active !
channel-protocol lacp'.
I have made the following sysctl adjustments:
# Despite this, net.inet.ip.ifq.len always seems to be "0".
net.inet.ip.ifq.maxlen=1536
# Tested many sizes and found this is the minimum needed to achieve
# gigabit speeds. Was able to push and pull 939Mbps at this size using
# iperf(1) across a Cisco ME3400.
net.inet.tcp.recvspace=98304
net.inet.tcp.sendspace=98304
net.inet.udp.recvspace=98304
net.inet.udp.sendspace=98304
# Set based on the output of `netstat -m` under usual load.
kern.maxclusters=32768
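As a rough sanity check on the 98304-byte socket buffers, the usual
"one window per round trip" bound can be worked out. This is only a
sketch: the 0.5 ms round-trip time is an assumed figure, not something
measured on these boxes, and the helper names are made up for
illustration.

```python
def max_tcp_throughput_mbps(window_bytes: int, rtt_seconds: float) -> float:
    """Upper bound on single-stream TCP throughput: one window per round trip."""
    return window_bytes * 8 / rtt_seconds / 1e6


def rtt_limit_ms(window_bytes: int, target_mbps: float) -> float:
    """Largest round-trip time (ms) at which the window can still sustain target_mbps."""
    return window_bytes * 8 / (target_mbps * 1e6) * 1e3


WINDOW = 98304          # net.inet.tcp.sendspace / recvspace from the sysctls above
ASSUMED_RTT = 0.0005    # ASSUMPTION: 0.5 ms LAN round trip, not measured

print(f"bound at assumed RTT: {max_tcp_throughput_mbps(WINDOW, ASSUMED_RTT):.0f} Mbit/s")
print(f"RTT budget for 1 Gbit/s: {rtt_limit_ms(WINDOW, 1000):.3f} ms")
```

If the real round-trip time under load is above the second figure, the
buffer size alone would cap a single stream below gigabit speed.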
iperf testing w/ ALTQ disabled, across trunk1: (IP addresses masked)
[ 4] local X.X.X.X port 5001 connected with Y.Y.Y.Y port 7638
[ ID] Interval Transfer Bandwidth
[ 4] 0.0-10.0 sec 1.09 GBytes 937 Mbits/sec
iperf testing w/ ALTQ enabled, across the same trunk:
[ 4] local X.X.X.X port 5001 connected with Y.Y.Y.Y port 47456
[ ID] Interval Transfer Bandwidth
[ 4] 0.0-10.0 sec 659 MBytes 553 Mbits/sec
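The two results can be cross-checked with a little arithmetic. The
sketch below assumes iperf counts a MByte as 2^20 bytes and a Mbit as
10^6 bits, which is consistent with the figures above; the function
name is hypothetical.

```python
MIB = 1024 ** 2
GIB = 1024 ** 3


def mbits_per_sec(transfer_bytes: float, seconds: float) -> float:
    """Convert a byte count over an interval into Mbit/s (10^6 bits)."""
    return transfer_bytes * 8 / seconds / 1e6


# Transfer sizes and the 10-second interval are taken from the iperf output above.
without_altq = mbits_per_sec(1.09 * GIB, 10.0)
with_altq = mbits_per_sec(659 * MIB, 10.0)

# Drop computed from the bandwidth figures iperf itself reported.
drop_mbps = 937 - 553
drop_pct = 100 * drop_mbps / 937

print(f"without ALTQ: {without_altq:.1f} Mbit/s")
print(f"with ALTQ:    {with_altq:.1f} Mbit/s")
print(f"drop:         {drop_mbps} Mbit/s ({drop_pct:.1f}%)")
```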
This is the output of `netstat -m` while ALTQ is enabled and a test is
being performed. Both machines look very similar, so I will only post one:
171 mbufs in use:
122 mbufs allocated to data
29 mbufs allocated to packet headers
20 mbufs allocated to socket names and addresses
82/1442/32768 mbuf 2048 byte clusters in use (current/peak/max)
0/8/32768 mbuf 4096 byte clusters in use (current/peak/max)
0/8/32768 mbuf 8192 byte clusters in use (current/peak/max)
0/8/32768 mbuf 9216 byte clusters in use (current/peak/max)
0/8/32768 mbuf 12288 byte clusters in use (current/peak/max)
0/8/32768 mbuf 16384 byte clusters in use (current/peak/max)
0/8/32768 mbuf 65536 byte clusters in use (current/peak/max)
3672 Kbytes allocated to network (5% in use)
0 requests for memory denied
0 requests for memory delayed
0 calls to protocol drain routines
This is the output of `pfctl -vs queue` just after creating the queue
and performing a test:
Server:
queue root_trunk1 on trunk1 bandwidth 2Gb priority 0 cbq( wrr root ) {INTERNAL}
  [ pkts:     309242  bytes:   24187312  dropped pkts:      0 bytes:      0 ]
  [ qlength:   0/ 50  borrows:      0  suspends:      0 ]
queue INTERNAL on trunk1 bandwidth 2Gb cbq( default )
  [ pkts:     309242  bytes:   24187312  dropped pkts:      0 bytes:      0 ]
  [ qlength:   0/ 50  borrows:      0  suspends:      0 ]
Client:
queue root_trunk1 on trunk1 bandwidth 2Gb priority 0 cbq( wrr root ) {INTERNAL}
  [ pkts:     474136  bytes:  716710984  dropped pkts:      0 bytes:      0 ]
  [ qlength:   0/ 50  borrows:      0  suspends:      0 ]
queue INTERNAL on trunk1 bandwidth 2Gb cbq( default )
  [ pkts:     474136  bytes:  716710984  dropped pkts:      2 bytes:   3036 ]
  [ qlength:   0/ 50  borrows:      0  suspends:      0 ]
Note that the qlength does not change at all during the actual testing.
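The average packet size per queue can be derived from the pkts/bytes
counters above. A minimal sketch follows; `avg_packet_size` is a
made-up helper, and the regular expression only assumes the counter
layout shown in this output.

```python
import re

# Matches the first "pkts: N bytes: M" pair on a pfctl counter line.
COUNTERS = re.compile(r"pkts:\s*(\d+)\s+bytes:\s*(\d+)")


def avg_packet_size(stats_line: str) -> float:
    """Return mean bytes per packet from a `pfctl -vs queue` counter line."""
    match = COUNTERS.search(stats_line)
    if match is None:
        raise ValueError("no pkts/bytes counters found")
    pkts, nbytes = int(match.group(1)), int(match.group(2))
    return nbytes / pkts if pkts else 0.0


# Counter lines copied from the INTERNAL queue on each box.
server = "[ pkts: 309242 bytes: 24187312 dropped pkts: 0 bytes: 0 ]"
client = "[ pkts: 474136 bytes: 716710984 dropped pkts: 2 bytes: 3036 ]"

print(f"server: {avg_packet_size(server):.1f} bytes/pkt")
print(f"client: {avg_packet_size(client):.1f} bytes/pkt")
```

The server queue averages under 100 bytes per packet and the client
queue roughly 1500. That would be consistent with the server queue
carrying mostly ACKs and the client queue carrying full-size data
frames, though this is an interpretation rather than something
measured here.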
This is the output of `vmstat 1 15` while running the test:
Server:
 procs           memory                page   disks             traps       cpu
 r b w     avm      fre  flt re pi po fr sr wd0 wd1   int   sys    cs us sy  id
 1 1 0  198472  2156464   65  0  0  0  0  0   0   0  1848   769   198  0  2  98
 0 1 0  198472  2156456   25  0  0  0  0  0   0   0  1336   185   113  0  1  99
 0 1 0  198472  2156384    8  0  0  0  0  0   0   0  1620   116    26  0  0 100
 0 1 0  198484  2156296   14  0  0  0  0  0   0   0  8175 12073 12593  0 18  81
 0 1 0  198484  2156288    7  0  0  0  0  0   0   0  9962 15079 15726  0 24  76
 0 1 0  198484  2156216    7  0  0  0  0  0   0   0  9943 15334 15677  0 28  72
 0 1 0  198484  2156208    7  0  0  0  0  0   0   0  9955 15074 15968  0 26  74
 0 1 0  198484  2156136    7  0  0  0  0  0   0   0  9882 15130 15796  0 26  74
 1 1 0  198484  2156064   11  0  0  0  0  0   0   0 10130 14935 15820  0 26  74
 0 1 0  198488  2156052    7  0  0  0  0  0   0   0  9854 15130 15917  0 26  74
 0 1 0  198488  2155984    7  0  0  0  0  0   0   0  9744 15089 15806  0 24  76
 0 1 0  198488  2155980    7  0  0  0  0  0   0   0 10038 15207 15856  0 24  76
 1 1 0  198488  2155908   11  0  0  0  0  0   0   0 10234 15190 15698  0 25  75
 0 1 0  198488  2155900    7  0  0  0  0  0   0   0  2471  1655  1643  0  3  97
 0 1 0  198488  2155828    7  0  0  0  0  0   0   0  1426   157    30  0  1  99
Client:
 procs           memory                page   disks             traps       cpu
 r b w     avm      fre  flt re pi po fr sr wd0 wd1   int   sys    cs us sy  id
 1 1 0  145220  2610972   24  0  0  0  0  0   0   0  1072   633   103  0  1  99
 0 1 0  145220  2610968   25  0  0  0  0  0   0   0   363   174    21  0  0 100
 0 1 0  145220  2610964   12  0  0  0  0  0   0   0   623   133    32  0  0 100
 0 1 0  145220  2610896    7  0  0  0  0  0   0   0   665   136    28  0  0 100
 1 1 0  146052  2610464  360  0  0  0  0  0   0   0  2925 14287   145  0 19  81
 1 1 0  146052  2610460    7  0  0  0  0  0   0   0  3802 18249    22  0 25  75
 1 1 0  146052  2610392    7  0  0  0  0  0   0   0  3704 18400    32  0 25  75
 1 1 0  146052  2610388   11  0  0  0  0  0   0   0  3898 18282    42  0 25  75
 1 1 0  146052  2610388    7  0  0  0  0  0   0   0  3935 18251    26  0 25  75
 1 1 0  146056  2610376    7  0  0  0  0  0   0   0  4256 17964   114  0 25  75
 1 1 0  146056  2610308    7  0  0  0  0  0   0   0  3869 18278    24  0 25  75
 1 1 0  146056  2610300  145  0  0  0  0  0   0   0  3899 19931    63  0 28  72
 1 1 0  146056  2610300   11  0  0  0  0  0   0   0  3742 18313    26  0 25  75
 1 1 0  146056  2610296    7  0  0  0  0  0   0   0  3793 18248    22  0 25  75
 0 1 0  145224  2610656   51  0  0  0  0  0   0   0  1187  3264   122  0  5  95
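To summarize the vmstat samples, the interrupt rate can be averaged
over the seconds in which the test was clearly running. The sketch
below uses the (int, id) columns copied from the output above. The
80% idle cutoff for "under load" is an arbitrary assumption, and
`mean_interrupts_under_load` is a hypothetical helper.

```python
from statistics import mean

# (interrupts/sec, idle %) per one-second sample, copied from the vmstat output.
SERVER = [(1848, 98), (1336, 99), (1620, 100), (8175, 81), (9962, 76),
          (9943, 72), (9955, 74), (9882, 74), (10130, 74), (9854, 74),
          (9744, 76), (10038, 76), (10234, 75), (2471, 97), (1426, 99)]
CLIENT = [(1072, 99), (363, 100), (623, 100), (665, 100), (2925, 81),
          (3802, 75), (3704, 75), (3898, 75), (3935, 75), (4256, 75),
          (3869, 75), (3899, 72), (3742, 75), (3793, 75), (1187, 95)]


def mean_interrupts_under_load(samples, max_idle=80):
    """Average interrupts/sec over samples whose idle % is at or below max_idle."""
    loaded = [ints for ints, idle in samples if idle <= max_idle]
    if not loaded:
        raise ValueError("no samples under load")
    return mean(loaded)


print(f"server: {mean_interrupts_under_load(SERVER):.0f} int/s")
print(f"client: {mean_interrupts_under_load(CLIENT):.0f} int/s")
```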
Lastly, I am including a dmesg below:
OpenBSD 4.6 (GENERIC.RAID) #0: Mon Mar 8 06:44:15 CST 2010
r...@core00:/usr/src/sys/arch/amd64/compile/GENERIC.RAID
real mem = 3755868160 (3581MB)
avail mem = 3632414720 (3464MB)
mainbus0 at root
bios0 at mainbus0: SMBIOS rev. 2.51 @ 0xdfeeb000 (30 entries)
bios0: vendor Phoenix Technologies LTD version "6.00" date 03/26/2008
bios0: Supermicro PDSMU
acpi0 at bios0: rev 0
acpi0: tables DSDT FACP MCFG APIC BOOT SSDT
acpi0: wakeup devices DEV1(S5) DEV3(S5) EXP1(S5) EXP5(S5) EXP6(S5)
PCIB(S5) KBC0(S1) MSE0(S1) COM1(S5) COM2(S5) USB1(S4) USB2(S4) USB3(S4)
USB4(S4) EUSB(S4)
acpitimer0 at acpi0: 3579545 Hz, 24 bits
acpimadt0 at acpi0 addr 0xfee00000: PC-AT compat
cpu0 at mainbus0: apid 0 (boot processor)
cpu0: Intel(R) Xeon(R) CPU X3220 @ 2.40GHz, 2394.28 MHz
cpu0:
FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,SBF,SSE3,MWAIT,DS-CPL,VMX,EST,TM2,CX16,xTPR,NXE,LONG
cpu0: 4MB 64b/line 16-way L2 cache
cpu0: apic clock running at 265MHz
cpu1 at mainbus0: apid 1 (application processor)
cpu1: Intel(R) Xeon(R) CPU X3220 @ 2.40GHz, 2394.00 MHz
cpu1:
FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,SBF,SSE3,MWAIT,DS-CPL,VMX,EST,TM2,CX16,xTPR,NXE,LONG
cpu1: 4MB 64b/line 16-way L2 cache
cpu2 at mainbus0: apid 2 (application processor)
cpu2: Intel(R) Xeon(R) CPU X3220 @ 2.40GHz, 2394.00 MHz
cpu2:
FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,SBF,SSE3,MWAIT,DS-CPL,VMX,EST,TM2,CX16,xTPR,NXE,LONG
cpu2: 4MB 64b/line 16-way L2 cache
cpu3 at mainbus0: apid 3 (application processor)
cpu3: Intel(R) Xeon(R) CPU X3220 @ 2.40GHz, 2394.00 MHz
cpu3:
FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,SBF,SSE3,MWAIT,DS-CPL,VMX,EST,TM2,CX16,xTPR,NXE,LONG
cpu3: 4MB 64b/line 16-way L2 cache
ioapic0 at mainbus0 apid 4 pa 0xfec00000, version 20, 24 pins
acpiprt0 at acpi0: bus 0 (PCI0)
acpiprt1 at acpi0: bus 1 (DEV1)
acpiprt2 at acpi0: bus 6 (DEV3)
acpiprt3 at acpi0: bus 10 (EXP1)
acpiprt4 at acpi0: bus 13 (EXP5)
acpiprt5 at acpi0: bus 14 (EXP6)
acpiprt6 at acpi0: bus 15 (PCIB)
acpicpu0 at acpi0: PSS
acpicpu1 at acpi0: PSS
acpicpu2 at acpi0: PSS
acpicpu3 at acpi0: PSS
acpibtn0 at acpi0: PWRB
ipmi at mainbus0 not configured
cpu0: Enhanced SpeedStep 2394 MHz: speeds: 2400, 2133, 1867, 1600 MHz
pci0 at mainbus0 bus 0
pchb0 at pci0 dev 0 function 0 "Intel E7230 Host" rev 0xc0
ppb0 at pci0 dev 1 function 0 "Intel E7230 PCIE" rev 0xc0: apic 4 int 16
(irq 7)
pci1 at ppb0 bus 1
ppb1 at pci1 dev 0 function 0 "PLX PEX 8518" rev 0xac
pci2 at ppb1 bus 2
ppb2 at pci2 dev 1 function 0 "PLX PEX 8518" rev 0xac: apic 4 int 17
(irq 11)
pci3 at ppb2 bus 3
em0 at pci3 dev 0 function 0 "Intel PRO/1000 PT (82571EB)" rev 0x06:
apic 4 int 17 (irq 11), address 00:25:90:00:1b:7c
em1 at pci3 dev 0 function 1 "Intel PRO/1000 PT (82571EB)" rev 0x06:
apic 4 int 18 (irq 5), address 00:25:90:00:1b:7d
ppb3 at pci2 dev 2 function 0 "PLX PEX 8518" rev 0xac: apic 4 int 18 (irq 5)
pci4 at ppb3 bus 4
em2 at pci4 dev 0 function 0 "Intel PRO/1000 PT (82571EB)" rev 0x06:
apic 4 int 18 (irq 5), address 00:25:90:00:1b:7e
em3 at pci4 dev 0 function 1 "Intel PRO/1000 PT (82571EB)" rev 0x06:
apic 4 int 19 (irq 11), address 00:25:90:00:1b:7f
ppb4 at pci0 dev 3 function 0 "Intel 82975X PCIE" rev 0xc0: apic 4 int
16 (irq 7)
pci5 at ppb4 bus 6
ppb5 at pci5 dev 0 function 0 "PLX PEX 8518" rev 0xac
pci6 at ppb5 bus 7
ppb6 at pci6 dev 1 function 0 "PLX PEX 8518" rev 0xac: apic 4 int 17
(irq 11)
pci7 at ppb6 bus 8
em4 at pci7 dev 0 function 0 "Intel PRO/1000 PT (82571EB)" rev 0x06:
apic 4 int 17 (irq 11), address 00:25:90:00:1e:bc
em5 at pci7 dev 0 function 1 "Intel PRO/1000 PT (82571EB)" rev 0x06:
apic 4 int 18 (irq 5), address 00:25:90:00:1e:bd
ppb7 at pci6 dev 2 function 0 "PLX PEX 8518" rev 0xac: apic 4 int 18 (irq 5)
pci8 at ppb7 bus 9
em6 at pci8 dev 0 function 0 "Intel PRO/1000 PT (82571EB)" rev 0x06:
apic 4 int 18 (irq 5), address 00:25:90:00:1e:be
em7 at pci8 dev 0 function 1 "Intel PRO/1000 PT (82571EB)" rev 0x06:
apic 4 int 19 (irq 11), address 00:25:90:00:1e:bf
ppb8 at pci0 dev 28 function 0 "Intel 82801GB PCIE" rev 0x01: apic 4 int
17 (irq 11)
pci9 at ppb8 bus 10
ppb9 at pci0 dev 28 function 4 "Intel 82801G PCIE" rev 0x01: apic 4 int
17 (irq 11)
pci10 at ppb9 bus 13
em8 at pci10 dev 0 function 0 "Intel PRO/1000MT (82573E)" rev 0x03: apic
4 int 16 (irq 7), address 00:25:90:01:76:2a
ppb10 at pci0 dev 28 function 5 "Intel 82801G PCIE" rev 0x01: apic 4 int
16 (irq 7)
pci11 at ppb10 bus 14
em9 at pci11 dev 0 function 0 "Intel PRO/1000MT (82573L)" rev 0x00: apic
4 int 17 (irq 11), address 00:25:90:01:76:2b
uhci0 at pci0 dev 29 function 0 "Intel 82801GB USB" rev 0x01: apic 4 int
23 (irq 10)
uhci1 at pci0 dev 29 function 1 "Intel 82801GB USB" rev 0x01: apic 4 int
19 (irq 11)
uhci2 at pci0 dev 29 function 2 "Intel 82801GB USB" rev 0x01: apic 4 int
18 (irq 5)
uhci3 at pci0 dev 29 function 3 "Intel 82801GB USB" rev 0x01: apic 4 int
16 (irq 7)
ehci0 at pci0 dev 29 function 7 "Intel 82801GB USB" rev 0x01: apic 4 int
23 (irq 10)
usb0 at ehci0: USB revision 2.0
uhub0 at usb0 "Intel EHCI root hub" rev 2.00/1.00 addr 1
ppb11 at pci0 dev 30 function 0 "Intel 82801BA Hub-to-PCI" rev 0xe1
pci12 at ppb11 bus 15
vga1 at pci12 dev 0 function 0 "ATI ES1000" rev 0x02
wsdisplay0 at vga1 mux 1: console (80x25, vt100 emulation)
wsdisplay0: screen 1-5 added (80x25, vt100 emulation)
radeondrm0 at vga1: apic 4 int 16 (irq 7)
drm0 at radeondrm0
pcib0 at pci0 dev 31 function 0 "Intel 82801GB LPC" rev 0x01
pciide0 at pci0 dev 31 function 1 "Intel 82801GB IDE" rev 0x01: DMA,
channel 0 configured to compatibility, channel 1 configured to compatibility
pciide0: channel 0 disabled (no drives)
pciide0: channel 1 disabled (no drives)
pciide1 at pci0 dev 31 function 2 "Intel 82801GB SATA" rev 0x01: DMA,
channel 0 configured to native-PCI, channel 1 configured to native-PCI
pciide1: using apic 4 int 19 (irq 11) for native-PCI interrupt
wd0 at pciide1 channel 0 drive 0: <ST9160511NS>
wd0: 16-sector PIO, LBA48, 152627MB, 312581808 sectors
wd0(pciide1:0:0): using PIO mode 4, Ultra-DMA mode 5
wd1 at pciide1 channel 1 drive 0: <ST9160511NS>
wd1: 16-sector PIO, LBA48, 152627MB, 312581808 sectors
wd1(pciide1:1:0): using PIO mode 4, Ultra-DMA mode 5
ichiic0 at pci0 dev 31 function 3 "Intel 82801GB SMBus" rev 0x01: apic 4
int 19 (irq 11)
iic0 at ichiic0
lm1 at iic0 addr 0x2d: W83627HF
wbng0 at iic0 addr 0x2f: w83793g
spdmem0 at iic0 addr 0x50: 2GB DDR2 SDRAM non-parity PC2-6400CL5
spdmem1 at iic0 addr 0x52: 2GB DDR2 SDRAM non-parity PC2-6400CL5
usb1 at uhci0: USB revision 1.0
uhub1 at usb1 "Intel UHCI root hub" rev 1.00/1.00 addr 1
usb2 at uhci1: USB revision 1.0
uhub2 at usb2 "Intel UHCI root hub" rev 1.00/1.00 addr 1
usb3 at uhci2: USB revision 1.0
uhub3 at usb3 "Intel UHCI root hub" rev 1.00/1.00 addr 1
usb4 at uhci3: USB revision 1.0
uhub4 at usb4 "Intel UHCI root hub" rev 1.00/1.00 addr 1
isa0 at pcib0
isadma0 at isa0
com0 at isa0 port 0x3f8/8 irq 4: ns16550a, 16 byte fifo
com0: console
pckbc0 at isa0 port 0x60/5
pckbd0 at pckbc0 (kbd slot)
pckbc0: using irq 1 for kbd slot
wskbd0 at pckbd0: console keyboard, using wsdisplay0
pcppi0 at isa0 port 0x61
midi0 at pcppi0: <PC speaker>
spkr0 at pcppi0
wbsio0 at isa0 port 0x2e/2: W83627HF rev 0x41
lm2 at wbsio0 port 0x290/8: W83627HF
lm1 detached
fdc0 at isa0 port 0x3f0/6 irq 6 drq 2
mtrr: Pentium Pro MTRR support
Kernelized RAIDframe activated
raid0 at root: (RAID Level 1) total number of sectors is 312046464
(152366 MB) as root
softraid0 at root
root on raid0a
swapmount: no device
Thank you,
James Shupe