On Thu, Jul 22, 2021 at 12:21:47PM +0200, Hrvoje Popovski wrote:
> On 22.7.2021. 0:39, Alexander Bluhm wrote:
> > On Thu, Jul 22, 2021 at 12:06:09AM +0200, Hrvoje Popovski wrote:
> >> I'm combining this and last parallel diff and i can't see any drops in
> >> traffic. Even sending at high rate, traffic through iked or isakmpd is
> >> stable at 150Kpps, which is good ..
> > 
> > Thanks, good news.
> > 
> >> One funny thing is that with top -SHs1 crypto CPU usage is always at 0.00%
> >> Could it be because of aes-ni?
> > 
> > I don't use the crypto thread anymore.  It is idle.  All crypto is
> > done in softnet between handling the network part of ipsec.
> > 
> > Here is the flame graph.  Crypto is running on top of network.
> > http://bluhm.genua.de/perform/results/current/patch-sys-ipsec-noqueue.1/btrace/ssh_perform%40lt13_iperf3_-c10.4.56.36_-P10_-t10-btrace-kstack.0.svg
> > 
> > bluhm
> > 
> 
> Thank you for explanation..
> 
> after hitting box all night, box panic and i was able to reproduce it
> this morning ... I'm not sure but box panic after hour or more of
> sending traffic through iked tunnel ..
> I will try to reproduce it through ipsec tunnel ..

Unfortunately this unlocked `ipsec_in_use' check can't work as expected,
because concurrent pfkey thread could modify it between the check (1)
and netlock (2). You a lucky to hit this :)

 +   /*
 +    * XXXSMP IPsec data structures are not ready to be
 +    * accessed by multiple Network threads in parallel.
 +    */
 +   if (ipsec_in_use) /* (1) */
 +           exclusive_lock = 1;
 +   if (exclusive_lock)
 +           NET_LOCK(); /* (2) */
 +   else
 +           NET_RLOCK_IN_SOFTNET(); /* (2) */
 +
 +   while ((m = ml_dequeue(ml)) != NULL)
 +         (*ifp->if_input)(ifp, m);

This could be fixed by introducing the another rwlock(9) which should
serialize `ipsec_in_use' modifications and check. It's obvious this lock
should be grabbed before netlock and we could keep this order, but I
strictly don't like this hack.

pfkeyv2_output(struct mbuf *mbuf, struct socket *so,
    struct sockaddr *dstaddr, struct mbuf *control)
{
        /* ... */

        sounlock(so, SL_LOCKED);
        rw_enter_write(&ipsec_lock);
        error = pfkeyv2_send(so, message, mbuf->m_pkthdr.len);
        rw_exit_write(&ipsec_lock);
        solock(so);

        /* ... */
}

if_input_process(struct ifnet *ifp, struct mbuf_list *ml)
{
        /* ... */

        /*
         * XXXSMP IPsec data structures are not ready to be
         * accessed by multiple Network threads in parallel.
         */
        rw_enter_read(&ipsec_lock);

        if (ipsec_in_use) /* (1) */
                exclusive_lock = 1;
        if (exclusive_lock)
                NET_LOCK(); /* (2) */
        else
                NET_RLOCK_IN_SOFTNET(); /* (2) */

        rw_exit_read(&ipsec_lock);

        while ((m = ml_dequeue(ml)) != NULL)
                (*ifp->if_input)(ifp, m);

        /* ... */
}

> 
> 
> 
> 
> r620-1# uvm_fault(0xffffffff821a82d8, 0x137, 0, 2) -> e
> kernel: page fault trap, code=0
> Stopped at      tdb_free+0x9c:  movq    %rsi,0(%rdi)
>     TID    PID    UID     PRFLAGS     PFLAGS  CPU  COMMAND
> *310284  77290      0     0x14000      0x200    1K softnet
> tdb_free(ffff8000011832a8) at tdb_free+0x9c
> esp_output(fffffd80a5c2f300,ffff8000011832a8,0,14,9) at esp_output+0x44f
> ipsp_process_packet(fffffd80a5c2f300,ffff8000011832a8,2,0) at
> ipsp_process_packet+0x466
> ip_output_ipsec_send(ffff8000011832a8,fffffd80a5c2f300,ffff800023871cf8,1)
> at ip_output_ipsec_send+0x161
> ip_output(fffffd80a5c2f300,0,ffff800023871cf8,1,0,0) at ip_output+0x8bd
> ip_forward(fffffd80a5c2f300,ffff800000087048,fffffd83b3454bd0,0) at
> ip_forward+0x26a
> ip_input_if(ffff800023871e38,ffff800023871e44,4,0,ffff800000087048) at
> ip_input_if+0x353
> ipv4_input(ffff800000087048,fffffd80a5c2f300) at ipv4_input+0x39
> if_input_process(ffff800000087048,ffff800023871eb8) at if_input_process+0x92
> ifiq_process(ffff800000086c00) at ifiq_process+0x69
> taskq_thread(ffff80000002f200) at taskq_thread+0x81
> end trace frame: 0x0, count: 4
> https://www.openbsd.org/ddb.html describes the minimum info required in
> bug reports.  Insufficient info makes it difficult to find and fix bugs.
> 
> 
> ddb{1}> ps
>    PID     TID   PPID    UID  S       FLAGS  WAIT          COMMAND
>  94310  423919      1      0  3    0x100083  ttyin         ksh
>  77465  202466      1      0  3    0x100098  poll          cron
>  65594  481620  35139     95  3    0x100092  kqread        smtpd
>   2326   44364  35139    103  3    0x100092  kqread        smtpd
>   1225  215234  35139     95  3    0x100092  kqread        smtpd
>  69590  400378  35139     95  3    0x100092  kqread        smtpd
>   7267   26043  35139     95  3    0x100092  kqread        smtpd
>  38120  331881  35139     95  3    0x100092  kqread        smtpd
>  35139   66391      1      0  3    0x100080  kqread        smtpd
>  79402   86611      1      0  3        0x88  select        sshd
>  43297  495028  53834    101  3    0x100010  netlock       iked
>  73854  239338  53834    101  3    0x100090  kqread        iked
>  79322  243271  53834    101  3    0x100090  kqread        iked
>  53834  169269      1      0  3    0x100080  kqread        iked
>  98497  240479      1      0  3    0x100080  poll          ntpd
>  93883  433035  46155     83  3    0x100092  poll          ntpd
>  46155  241594      1     83  3    0x100092  poll          ntpd
>  60135   75740  14081     73  3    0x100090  kqread        syslogd
>  14081  390346      1      0  3    0x100082  netio         syslogd
>  95646  201039      0      0  3     0x14200  bored         smr
>  54746  191914      0      0  3     0x14200  pgzero        zerothread
>  42039  120601      0      0  3     0x14200  aiodoned      aiodoned
>  23202  404058      0      0  3     0x14200  syncer        update
>  32492  255945      0      0  3     0x14200  cleaner       cleaner
>  63792  183728      0      0  3     0x14200  reaper        reaper
>  85251  251516      0      0  3     0x14200  pgdaemon      pagedaemon
>  70650  399091      0      0  3     0x14200  bored         crynlk
>  93527  443884      0      0  3     0x14200  bored         crypto
>  98145   39889      0      0  3     0x14200  usbtsk        usbtask
>  67334  434779      0      0  3     0x14200  usbatsk       usbatsk
>  86238  520690      0      0  3  0x40014200  acpi0         acpi0
>  52180  314140      0      0  7  0x40014200                idle5
>  23611  433449      0      0  7  0x40014200                idle4
>  40949  492122      0      0  7  0x40014200                idle3
>   1758  355637      0      0  7  0x40014200                idle2
>  24269   99037      0      0  3  0x40014200                idle1
>  26630  422993      0      0  3     0x14200  bored         sensors
> *77290  310284      0      0  7     0x14200                softnet
>  68084  469615      0      0  3     0x14200  bored         softnet
>  68120  339291      0      0  3     0x14200  bored         softnet
>  90122    1101      0      0  3     0x14200  netlock       softnet
>   4721  339211      0      0  3     0x14200  bored         systqmp
>  19922   57880      0      0  3     0x14200  bored         systq
>  67069  141786      0      0  3  0x40014200  netlock       softclock
>  34327   87131      0      0  7  0x40014200                idle0
>      1   97004      0      0  3        0x82  wait          init
>      0       0     -1      0  3     0x10200  scheduler     swapper
> 
> 
> ddb{1}> trace /t 0t310284
> ip_input_if(ffff800023871e38,ffff800023871e44,4,0,ffff800000087048) at
> ip_input_if+0x353
> ipv4_input(ffff800000087048,fffffd80a5c2f300) at ipv4_input+0x39
> if_input_process(ffff800000087048,ffff800023871eb8) at if_input_process+0x92
> ifiq_process(ffff800000086c00) at ifiq_process+0x69
> taskq_thread(ffff80000002f200) at taskq_thread+0x81
> end trace frame: 0x0, count: -5
> ddb{1}>
> 

Reply via email to