>Synopsis:      kernel deadlock
>Environment:
        System      : OpenBSD 6.1
        Details     : OpenBSD 6.1-current (GENERIC) #14: Mon Apr 24 09:32:13 
MDT 2017
                         
[email protected]:/usr/src/sys/arch/amd64/compile/GENERIC

        Architecture: OpenBSD.amd64
        Machine     : amd64
>Description:

Setup:
fast tcp sender(s)>relayd>slow backend

If senders send more than backend can handle it will slowly accumulate in the 
openbsd
box running relayd. Eventually the box runs out of memory and swap and 
deadlocks.
Unfortunately its still able to send carp hellos and remains carp master. I'm 
able to break
into ddb. One of the relayd processes sits either in flt_noram1 or flt_noram5. 
It will never
resolve by itself even if I stop the senders. In the production setup I have 1G 
of memory.
It happens with and without socket splicing.

ddb> tr
Debugger() at Debugger+0x9
internal_command() at internal_command+0x155
wskbd_translate() at wskbd_translate+0x90
wskbd_input() at wskbd_input+0x59
pckbd_input() at pckbd_input+0xb4
pckbcintr_internal() at pckbcintr_internal+0x8e
intr_handler() at intr_handler+0x28
Xintr_ioapic_edge1() at Xintr_ioapic_edge1+0xc9
--- interrupt ---
pmap_test_attrs() at pmap_test_attrs+0xd
uvmpd_scan_inactive() at uvmpd_scan_inactive+0xfd
uvmpd_scan() at uvmpd_scan+0x228
uvm_pageout() at uvm_pageout+0x6b
end trace frame: 0x0, count: -12
ddb> ps
   PID     TID   PPID    UID  S       FLAGS  WAIT          COMMAND
 18719    1258      0      0  3     0x14200  bored         sosplice
  7448  115595      1      0  2    0x100083                getty
 92498  379625      1      0  3    0x100083  ttyin         getty
  4520  143232      1      0  3    0x100083  ttyin         getty
 39305  310613      1      0  3    0x100083  ttyin         getty
 21297  227528      1      0  3    0x100083  ttyin         getty
 37915  337975      1      0  3    0x100083  ttyin         getty
 65468  508956      1      0  3    0x100098  poll          cron
 47792  194899  52742      0  3    0x100080  netio         tcpdump
 52742  468181   3282     76  2    0x100012                tcpdump
  3282  255291      1      0  3        0x80  piperd        perl
 50646  227272      1    110  3    0x100090  poll          sndiod
 44710  113903      1     99  3    0x100090  poll          sndiod
 86206  425755      1      0  3    0x100080  kqread        httpd
 59150  213389      1     67  3    0x100092  kqread        httpd
 91591  456586      1     67  3    0x100092  kqread        httpd
 73088   84908      1     67  3    0x100092  kqread        httpd
 40756  141226      1     67  3    0x100092  kqread        httpd
  2509  326497  28065     95  3    0x100092  kqread        smtpd
 94398  422696  28065    103  3    0x100092  kqread        smtpd
 72362  413118  28065     95  3    0x100092  kqread        smtpd
 21002  233627  28065     95  3    0x100092  kqread        smtpd
 96593  426294  28065     95  3    0x100092  kqread        smtpd
 17571  429923  28065     95  3    0x100092  kqread        smtpd
 28065  265739      1      0  3    0x100080  kqread        smtpd
 54616  144829      1      0  3        0x80  kqread        relayd
  3030  324195      1     89  3    0x100092  kqread        relayd
 71847  347768      1     89  3    0x100092  kqread        relayd
 79236  455595      1     89  3    0x100092  kqread        relayd
 26126  339624      1     89  3    0x100092  kqread        relayd
 79991  357923      1     89  3    0x100092  kqread        relayd
 26908   50243      1     89  3    0x100092  kqread        relayd
 39372  476224      1     89  3    0x100092  kqread        relayd
  6798  425753      1     89  3    0x100092  kqread        relayd
 24266    6018      1     89  3    0x100092  kqread        relayd
 90612   10267      1     89  3    0x100092  kqread        relayd
 74770   24617      1     89  3    0x100092  kqread        relayd
 16191    7266      1     89  3    0x100092  kqread        relayd
 48641   91345      1     89  3    0x100092  kqread        relayd
 11724  492172      1     89  3    0x100092  kqread        relayd
 76212  506506      1     89  3    0x100092  kqread        relayd
 97858  156934      1     89  3    0x100092  kqread        relayd
 30127  166905      1     89  3    0x100092  kqread        relayd
 42359  374245      1     89  3    0x100092  kqread        relayd
 46692  453103      1     89  3    0x100092  kqread        relayd
 85725   63743      1     89  3    0x100092  kqread        relayd
 26947  119020      1     89  3    0x100012  flt_noram5    relayd
 31527  446625      1     89  2    0x100012                relayd
  9481  307364      1     89  2    0x100012                relayd
 28627  457378      1     89  2    0x100012                relayd
 20393  160917      1     89  2    0x100012                relayd
 37217  203020      1     89  2    0x100012                relayd
 81784  344990      1     89  2    0x100012                relayd
 55474  165735      1     89  2    0x100012                relayd
 75308  298087      1     89  2    0x100012                relayd
 63207  267659      1     89  2    0x100012                relayd
 62744  101175      1     89  2    0x100012                relayd
 44185   31553      1     89  2    0x100012                relayd
 25683  193151      1     89  2    0x100012                relayd
 68522  355887      1     89  2    0x100012                relayd
 65481   59946      1     89  2    0x100012                relayd
 58086   15299      1     89  2    0x100012                relayd
 21528  229769      1     89  2    0x100012                relayd
 46044  485924      1     89  2    0x100012                relayd
 16225  486879      1     89  2    0x100012                relayd
 46087   30050      1     89  2    0x100012                relayd
 92929  218662      1     89  3    0x100092  kqread        relayd
 63821  320603      1     89  3    0x100092  kqread        relayd
 51579  318273  84588     75  2    0x100492                bgpd
 90700  335378  84588     75  3    0x100092  poll          bgpd
 84588   28750      1      0  2        0x80                bgpd
 87112  506584      1      0  3        0x80  kqread        snmpd
 91230   93464      1     91  3        0x92  kqread        snmpd
 89920  159381      1     91  2        0x92                snmpd
  7220   98830      1      0  3        0x80  select        sshd
 27241  420362      1      0  3    0x100080  poll          ntpd
 44141  514243   8284     83  3    0x100092  poll          ntpd
  8284  145894      1     83  2    0x100492                ntpd
 97598  219876  67025     74  2    0x100490                pflogd
 67025  518783      1      0  3        0x80  netio         pflogd
 62641  259012  51899     73  3    0x100090  kqread        syslogd
 51899  308670      1      0  3    0x100082  netio         syslogd
 74861  361044      0      0  2     0x14200                zerothread
 40078    4840      0      0  3     0x14200  aiodoned      aiodoned
 11396  394316      0      0  2     0x14200                update
 95110  498998      0      0  3     0x14200  cleaner       cleaner
 78284  374918      0      0  3     0x14200  reaper        reaper
*92804  338862      0      0  7     0x14200                pagedaemon
  9723  206392      0      0  3     0x14200  bored         crynlk
 19328  289759      0      0  3     0x14200  bored         crypto
 10531  115280      0      0  2     0x14600                pfpurge
 49817  508545      0      0  3     0x14200  bored         viomb
 81860  106767      0      0  3     0x14200  usbtsk        usbtask
 24646  211498      0      0  3     0x14200  usbatsk       usbatsk
 93940   99018      0      0  3  0x40014200  acpi0         acpi0
 67453   88354      0      0  2     0x14200                softnet
 52383  373249      0      0  2     0x14200                systqmp
 51985  389922      0      0  2     0x14200                systq
 29482  267706      0      0  3  0x40014200  bored         softclock
 41470   14506      0      0  3  0x40014200                idle0
     1   86322      0      0  3        0x82  wait          init
     0       0     -1      0  3     0x10200  scheduler     swapper
ddb> tr /p 0t119020
sleep_finish() at sleep_finish+0x86
msleep() at msleep+0x10f
uvm_fault() at uvm_fault+0xd9e
trap() at trap+0x283
--- trap (number 6) ---
end of kernel
end trace frame: 0xf6fd83a71f0, count: -4
0xf6fd866b5e4:

I have also seen:
ddb> tr
Debugger() at Debugger+0x9
internal_command() at internal_command+0x155
wskbd_translate() at wskbd_translate+0x90
wskbd_input() at wskbd_input+0x59
pckbd_input() at pckbd_input+0xb4
pckbcintr_internal() at pckbcintr_internal+0x8e
intr_handler() at intr_handler+0x28
Xintr_ioapic_edge1() at Xintr_ioapic_edge1+0xc9
--- interrupt ---
Xspllower() at Xspllower+0xc
__mtx_leave() at __mtx_leave+0x34
extent_do_alloc() at extent_do_alloc+0x60e
extent_alloc_subregion() at extent_alloc_subregion+0xa5
uvm_swap_alloc() at uvm_swap_alloc+0xb3
uvmpd_scan_inactive() at uvmpd_scan_inactive+0x54b
uvmpd_scan() at uvmpd_scan+0x29e
uvm_pageout() at uvm_pageout+0x6b
end trace frame: 0x0, count: -16
ddb> ps
   PID     TID   PPID    UID  S       FLAGS  WAIT          COMMAND
 98510  454082      0      0  3     0x14200  bored         sosplice
 61020  167295   6510      0  3    0x100080  netio         tcpdump
 23052  453771      1      0  2    0x100083                getty
 99310  304541      1      0  3    0x100083  ttyin         getty
 59992   46375      1      0  3    0x100083  ttyin         getty
 94180   48977      1      0  3    0x100083  ttyin         getty
 31867  368727      1      0  3    0x100083  ttyin         getty
 65827  183581      1      0  3    0x100083  ttyin         getty
 28126  292282      1      0  2    0x100010                cron
  6510   87605  59875     76  2    0x100012                tcpdump
 59875   37080      1      0  3        0x80  piperd        perl
 48125  216072      1    110  3    0x100090  poll          sndiod
 50613  296553      1     99  3    0x100090  poll          sndiod
 18652  448828      1      0  3    0x100080  kqread        httpd
 18945  502467      1     67  3    0x100092  kqread        httpd
  4132    6807      1     67  3    0x100092  kqread        httpd
 98138  255433      1     67  3    0x100092  kqread        httpd
 36567  323506      1     67  3    0x100092  kqread        httpd
 42835  234930  37277     95  3    0x100092  kqread        smtpd
  7470  157304  37277    103  3    0x100092  kqread        smtpd
 93610  470331  37277     95  3    0x100092  kqread        smtpd
 52521  493948  37277     95  3    0x100092  kqread        smtpd
 92601  131987  37277     95  3    0x100092  kqread        smtpd
 32371  229192  37277     95  3    0x100092  kqread        smtpd
 37277  266758      1      0  3    0x100080  kqread        smtpd
 85154  144921      1      0  3        0x80  kqread        relayd
 73872  403967      1     89  3    0x100092  kqread        relayd
 83871  323853      1     89  3    0x100092  kqread        relayd
 72192  107185      1     89  3    0x100092  kqread        relayd
 63728  434713      1     89  3    0x100092  kqread        relayd
  7463  237568      1     89  3    0x100092  kqread        relayd
  8573   10618      1     89  3    0x100092  kqread        relayd
 95560  222242      1     89  3    0x100092  kqread        relayd
 93731  388665      1     89  3    0x100092  kqread        relayd
 92056  453261      1     89  3    0x100092  kqread        relayd
 59991  293357      1     89  3    0x100092  kqread        relayd
 86473   71592      1     89  3    0x100092  kqread        relayd
 32379   28918      1     89  3    0x100092  kqread        relayd
 79429  244555      1     89  3    0x100092  kqread        relayd
 63847   15559      1     89  3    0x100092  kqread        relayd
 50075  349175      1     89  3    0x100092  kqread        relayd
  6058  159484      1     89  3    0x100092  kqread        relayd
 16880  367020      1     89  3    0x100092  kqread        relayd
   136  140946      1     89  3    0x100092  kqread        relayd
 13867  140645      1     89  3    0x100092  kqread        relayd
 70441   51568      1     89  3    0x100092  kqread        relayd
 55683    2953      1     89  2    0x100012                relayd
 94827  327279      1     89  2    0x100012                relayd
 83117  363844      1     89  2    0x100012                relayd
 49903   68611      1     89  2    0x100012                relayd
 46080  166396      1     89  2    0x100012                relayd
 29597  175697      1     89  2    0x100012                relayd
 39496  362805      1     89  2    0x100012                relayd
 19019  294153      1     89  2    0x100012                relayd
 45967   90936      1     89  2    0x100012                relayd
 26992  149427      1     89  2    0x100012                relayd
 60364  206396      1     89  2    0x100012                relayd
 49833  356433      1     89  2    0x100012                relayd
 79112  503654      1     89  2    0x100012                relayd
 81078  319693      1     89  2    0x100012                relayd
 78058  140027      1     89  2    0x100012                relayd
 91182  184681      1     89  2    0x100012                relayd
 37959  222788      1     89  2    0x100012                relayd
  7050  276440      1     89  2    0x100012                relayd
 54548   85994      1     89  2    0x100012                relayd
 34773  213423      1     89  2    0x100012                relayd
  3552  125399      1     89  3    0x100092  kqread        relayd
 89897  452096      1     89  3    0x100092  kqread        relayd
 35273  402937  90827     75  2    0x100012                bgpd
 24240  293596  90827     75  3    0x100092  poll          bgpd
 90827    4256      1      0  2        0x80                bgpd
 72809  171423      1      0  3        0x80  kqread        snmpd
 92764  212816      1     91  3        0x92  kqread        snmpd
 30459  414910      1     91  3        0x12  flt_noram1    snmpd
 98342   18762      1      0  3        0x80  select        sshd
 91361    6408      1      0  3    0x100080  poll          ntpd
 92053  373229  95928     83  3    0x100092  poll          ntpd
 95928  207794      1     83  2    0x100492                ntpd
 27466  156798  83367     74  2    0x100490                pflogd
 83367  198766      1      0  3        0x80  netio         pflogd
  7678  511218  31604     73  3    0x100090  kqread        syslogd
 31604  252824      1      0  3    0x100082  netio         syslogd
 23988  322338      0      0  3     0x14200  pgzero        zerothread
 24605  125985      0      0  3     0x14200  aiodoned      aiodoned
 73480  399390      0      0  2     0x14200                update
 41800  367817      0      0  3     0x14200  cleaner       cleaner
 81146   69342      0      0  3     0x14200  reaper        reaper
*25140   16722      0      0  7     0x14200                pagedaemon
 38872  493720      0      0  3     0x14200  bored         crynlk
 83120  104525      0      0  3     0x14200  bored         crypto
 44202   60584      0      0  2     0x14600                pfpurge
 42782  359699      0      0  3     0x14200  bored         viomb
 79739  304953      0      0  3     0x14200  usbtsk        usbtask
 89868  403768      0      0  3     0x14200  usbatsk       usbatsk
 35580  285075      0      0  3  0x40014200  acpi0         acpi0
 78992   92301      0      0  2     0x14200                softnet
 14178  471898      0      0  2     0x14200                systqmp
 86701   48006      0      0  2     0x14200                systq
 75335   15065      0      0  3  0x40014200  bored         softclock
 16772  484410      0      0  3  0x40014200                idle0
     1  192841      0      0  3        0x82  wait          init
     0       0     -1      0  3     0x10200  scheduler     swapper
ddb> tr /p 0t414910
sleep_finish() at sleep_finish+0x86
msleep() at msleep+0x10f
uvmfault_anonget() at uvmfault_anonget+0x1d3
uvm_fault() at uvm_fault+0x468
trap() at trap+0x283
--- trap (number 6) ---
end of kernel
end trace frame: 0x5e3d2983000, count: -5
0x5e3750d73b6:

>How-To-Repeat:
I have created VM with smaller ammount of memory to reproduce it faster.

/etc/relayd.conf:
prefork 20
log updates
interval 3
timeout 2000

table <test2> { 10.33.192.66 }

relay test2 {
  listen on 10.33.200.17 port 6000
  forward to <test2> port 6000
}

ruby test client:
require 'socket'
threads = []
str = "plah" * 10000
200.times do |t|
  threads << Thread.new(t) do |th|
    loop do
      begin
      socket = TCPSocket.new "10.33.200.17 ", 6000
      socket.write str
      socket.close
      rescue
      end
    end
  end
end

threads.each { |th| th.join }

ruby test backend server:
require 'socket'
begin
  acceptor = TCPServer.open("0.0.0.0", 6000)
  fds = [acceptor]
  while true
    if ios = select(fds, [], [], 10)
      reads = ios.first
      reads.each do |client|
        if client == acceptor
          client, sockaddr = acceptor.accept
          fds << client
        elsif client.eof?
          fds.delete(client)
          client.close
        else
          sleep 1
          client.gets
        end
      end
    end
  end
ensure
  fds.each {|c| c.close}
end

>Fix:
Not known.

dmesg:
OpenBSD 6.1-current (GENERIC) #14: Mon Apr 24 09:32:13 MDT 2017
    [email protected]:/usr/src/sys/arch/amd64/compile/GENERIC
real mem = 137203712 (130MB)
avail mem = 128581632 (122MB)
mpath0 at root
scsibus0 at mpath0: 256 targets
mainbus0 at root
bios0 at mainbus0: SMBIOS rev. 2.8 @ 0xf69d0 (9 entries)
bios0: vendor SeaBIOS version "1.9.1-1.fc24" date 04/01/2014
bios0: QEMU Standard PC (i440FX + PIIX, 1996)
acpi0 at bios0: rev 0
acpi0: sleep states S5
acpi0: tables DSDT FACP APIC
acpi0: wakeup devices
acpitimer0 at acpi0: 3579545 Hz, 24 bits
acpimadt0 at acpi0 addr 0xfee00000: PC-AT compat
cpu0 at mainbus0: apid 0 (boot processor)
cpu0: Intel Core Processor (Haswell, no TSX), 2594.03 MHz
cpu0: 
FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,MMX,FXSR,SSE,SSE2,SS,SSE3,PCLMUL,SSSE3,FMA3,CX16,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,HV,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,FSGSBASE,BMI1,AVX2,SMEP,BMI2,ERMS,INVPCID,ARAT
cpu0: 64KB 64b/line 2-way I-cache, 64KB 64b/line 2-way D-cache, 512KB 64b/line 
16-way L2 cache
cpu0: ITLB 255 4KB entries direct-mapped, 255 4MB entries direct-mapped
cpu0: DTLB 255 4KB entries direct-mapped, 255 4MB entries direct-mapped
cpu0: smt 0, core 0, package 0
mtrr: Pentium Pro MTRR support, 8 var ranges, 88 fixed ranges
cpu0: apic clock running at 1000MHz
ioapic0 at mainbus0: apid 0 pa 0xfec00000, version 11, 24 pins
acpiprt0 at acpi0: bus 0 (PCI0)
acpicpu0 at acpi0: C1(@1 halt!)
"ACPI0006" at acpi0 not configured
"PNP0303" at acpi0 not configured
"PNP0F13" at acpi0 not configured
"PNP0700" at acpi0 not configured
"PNP0501" at acpi0 not configured
"PNP0A06" at acpi0 not configured
"PNP0A06" at acpi0 not configured
"QEMU0002" at acpi0 not configured
"PNP0A06" at acpi0 not configured
pvbus0 at mainbus0: KVM
pci0 at mainbus0 bus 0
pchb0 at pci0 dev 0 function 0 "Intel 82441FX" rev 0x02
pcib0 at pci0 dev 1 function 0 "Intel 82371SB ISA" rev 0x00
pciide0 at pci0 dev 1 function 1 "Intel 82371SB IDE" rev 0x00: DMA, channel 0 
wired to compatibility, channel 1 wired to compatibility
atapiscsi0 at pciide0 channel 0 drive 0
scsibus1 at atapiscsi0: 2 targets
cd0 at scsibus1 targ 0 lun 0: <QEMU, QEMU DVD-ROM, 2.5+> ATAPI 5/cdrom removable
cd0(pciide0:0:0): using PIO mode 4, DMA mode 2
pciide0: channel 1 disabled (no drives)
piixpm0 at pci0 dev 1 function 3 "Intel 82371AB Power" rev 0x03: apic 0 int 9
iic0 at piixpm0
vga1 at pci0 dev 2 function 0 "Red Hat QXL Video" rev 0x04
wsdisplay0 at vga1 mux 1: console (80x25, vt100 emulation)
wsdisplay0: screen 1-5 added (80x25, vt100 emulation)
virtio0 at pci0 dev 3 function 0 "Qumranet Virtio Network" rev 0x00
vio0 at virtio0: address 52:54:00:df:65:cc
virtio0: msix shared
virtio1 at pci0 dev 4 function 0 "Qumranet Virtio Network" rev 0x00
vio1 at virtio1: address 52:54:00:6c:c5:79
virtio1: msix shared
azalia0 at pci0 dev 5 function 0 "Intel 82801FB HD Audio" rev 0x01: apic 0 int 
10
azalia0: No codecs found
virtio2 at pci0 dev 6 function 0 "Qumranet Virtio Console" rev 0x00
virtio2: no matching child driver; not configured
uhci0 at pci0 dev 7 function 0 "Intel 82801I USB" rev 0x03: apic 0 int 11
uhci1 at pci0 dev 7 function 1 "Intel 82801I USB" rev 0x03: apic 0 int 11
uhci2 at pci0 dev 7 function 2 "Intel 82801I USB" rev 0x03: apic 0 int 10
ehci0 at pci0 dev 7 function 7 "Intel 82801I USB" rev 0x03: apic 0 int 10
usb0 at ehci0: USB revision 2.0
uhub0 at usb0 configuration 1 interface 0 "Intel EHCI root hub" rev 2.00/1.00 
addr 1
virtio3 at pci0 dev 8 function 0 "Qumranet Virtio Storage" rev 0x00
vioblk0 at virtio3
scsibus2 at vioblk0: 2 targets
sd0 at scsibus2 targ 0 lun 0: <VirtIO, Block Device, > SCSI3 0/direct fixed
sd0: 4096MB, 512 bytes/sector, 8388608 sectors
virtio3: msix shared
virtio4 at pci0 dev 9 function 0 "Qumranet Virtio Memory" rev 0x00
viomb0 at virtio4
virtio4: apic 0 int 10
isa0 at pcib0
isadma0 at isa0
fdc0 at isa0 port 0x3f0/6 irq 6 drq 2
fd0 at fdc0 drive 1: density unknown
com0 at isa0 port 0x3f8/8 irq 4: ns16550a, 16 byte fifo
com0: console
pckbc0 at isa0 port 0x60/5 irq 1 irq 12
pckbd0 at pckbc0 (kbd slot)
wskbd0 at pckbd0: console keyboard, using wsdisplay0
pms0 at pckbc0 (aux slot)
wsmouse0 at pms0 mux 0
pcppi0 at isa0 port 0x61
spkr0 at pcppi0
usb1 at uhci0: USB revision 1.0
uhub1 at usb1 configuration 1 interface 0 "Intel UHCI root hub" rev 1.00/1.00 
addr 1
usb2 at uhci1: USB revision 1.0
uhub2 at usb2 configuration 1 interface 0 "Intel UHCI root hub" rev 1.00/1.00 
addr 1
usb3 at uhci2: USB revision 1.0
uhub3 at usb3 configuration 1 interface 0 "Intel UHCI root hub" rev 1.00/1.00 
addr 1
vscsi0 at root
scsibus3 at vscsi0: 256 targets
softraid0 at root
scsibus4 at softraid0: 256 targets
root on sd0a (b3164b5c560a5158.a) swap on sd0b dump on sd0b


Rivo

Reply via email to