On Thu, Jul 08, 2021 at 08:08:23AM +0200, Hrvoje Popovski wrote:
> On 8.7.2021. 0:10, Vitaliy Makkoveev wrote:
> > On Wed, Jul 07, 2021 at 11:07:08PM +0200, Hrvoje Popovski wrote:
> >> On 7.7.2021. 22:36, Vitaliy Makkoveev wrote:
> >>> Thanks. ipsp_spd_lookup() stopped panic in pool_get(9).
> >>>
> >>> I guess the panics continue because simultaneous modifications of
> >>> 'tdbp->tdb_policy_head' break it. Could you try the diff below? It
> >>> introduces `tdb_polhd_mtx' mutex(9) and uses it to protect
> >>> 'tdbp->tdb_policy_head' modifications. I don't propose this diff for
> >>> commit but to check my suggestion.
> >>
> >>
> >> Hi,
> >>
> >> with this diff i'm getting this panic
> >>
> >> r620-1# panic: acquiring blockable sleep lock with spinlock or critical
> >> section held (kernel_lock) &kernel_lock
> >> Stopped at      db_enter+0x10:  popq    %rbp
> >>     TID    PID    UID     PRFLAGS     PFLAGS  CPU  COMMAND
> >>  375321  87823      0     0x14000      0x200    5  crynlk
> >>  455594  99250      0     0x14000      0x200    0  crypto
> >>  124997  16472      0     0x14000      0x200    1  softnet
> >>  409214  30226      0     0x14000      0x200    3  softnet
> >>  347403  66039      0     0x14000      0x200    4  softnet
> >> *345146  25512      0     0x14000      0x200    2  softnet
> >> db_enter() at db_enter+0x10
> >> panic(ffffffff81e7ce76) at panic+0xbf
> >> witness_checkorder(ffffffff82348dc0,9,0) at witness_checkorder+0xbce
> >> __mp_lock(ffffffff82348bb8) at __mp_lock+0x5f
> >> kpageflttrap(ffff800023864a30,147) at kpageflttrap+0x178
> >> kerntrap(ffff800023864a30) at kerntrap+0x91
> >> alltraps_kern_meltdown() at alltraps_kern_meltdown+0x7b
> >> ipsp_spd_lookup(fffffd80a05e9200,2,14,ffff800023864d0c,2,0) at
> >> ipsp_spd_lookup+0x9fd
> >> ip_output_ipsec_lookup(fffffd80a05e9200,14,ffff800023864d0c,0,0) at
> >> ip_output_ipsec_lookup+0x4d
> >> ip_output(fffffd80a05e9200,0,ffff800023864e98,1,0,0) at ip_output+0x42a
> >> ip_forward(fffffd80a05e9200,ffff800000087048,fffffd83b39799a8,0) at
> >> ip_forward+0x26a
> >> ip_input_if(ffff800023864fd8,ffff800023864fe4,4,0,ffff800000087048) at
> >> ip_input_if+0x365
> >> ipv4_input(ffff800000087048,fffffd80a05e9200) at ipv4_input+0x39
> >> if_input_process(ffff800000087048,ffff800023865058) at
> >> if_input_process+0x6f
> >> end trace frame: 0xffff8000238650a0, count: 0
> >> https://www.openbsd.org/ddb.html describes the minimum info required in
> >> bug reports. Insufficient info makes it difficult to find and fix bugs.
> >> ddb{2}>
> >>
> >> ddb{2}> show locks
> >> shared rwlock netlock r = 0 (0xffffffff8219ce60)
> >> #0 witness_lock+0x339
> >> #1 if_input_process+0x43
> >> #2 ifiq_process+0x69
> >> #3 taskq_thread+0x9f
> >> #4 proc_trampoline+0x1c
> >> shared rwlock softnet r = 0 (0xffff800000030070)
> >> #0 witness_lock+0x339
> >> #1 taskq_thread+0x92
> >> #2 proc_trampoline+0x1c
> >> exclusive mutex /sys/netinet/ip_ipsp.c:95 r = 0 (0xffffffff82192398)
> >> #0 witness_lock+0x339
> >> #1 mtx_enter_try+0x95
> >> #2 mtx_enter+0x48
> >> #3 ipsp_spd_lookup+0x961
> >> #4 ip_output_ipsec_lookup+0x4d
> >> #5 ip_output+0x42a
> >> #6 ip_forward+0x26a
> >> #7 ip_input_if+0x365
> >> #8 ipv4_input+0x39
> >> #9 if_input_process+0x6f
> >> #10 ifiq_process+0x69
> >> #11 taskq_thread+0x9f
> >> #12 proc_trampoline+0x1c
> >>
> >
> > Thanks.
> >
> > Now panics only in ipsp_spd_lookup() and never in pfkeyv2_send() or in
> > tdb_free() called from pfkeyv2_send(), right?
> >
>
> Yes,
>
> i can only trigger this panic
>

Thanks. That means simultaneous ipsp_spd_lookup() execution breaks not
only `tdb_policy_head' but the 'ipo->ipo_tdb' pointer too.

Also, I'd like to remind you of the logic we have in sys/net/pfkeyv2.c:

	/*
	 * XXXSMP IPsec data structures are not ready to be
	 * accessed by multiple Network threads in parallel,
	 * so force all packets to be processed by the first
	 * one.
	 */
	extern int nettaskqs;
	nettaskqs = 1;

It does not seem to work with the parallel forwarding diff.
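
For readers who want to see the kind of race discussed in this thread outside the kernel, here is a minimal userland sketch. It is only an analogy and not the diff under test: `struct fake_tdb`, `struct policy`, `worker()` and `run_demo()` are hypothetical names, and a pthread mutex stands in for the kernel mutex(9). Several threads insert into one shared TAILQ, and every list modification is done with the mutex held.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>
#include <sys/queue.h>

/* Hypothetical stand-in for a policy linked to a TDB. */
struct policy {
	TAILQ_ENTRY(policy) entry;
};

/* Hypothetical stand-in for a TDB: a policy list plus the lock guarding it. */
struct fake_tdb {
	TAILQ_HEAD(, policy) policy_head;
	pthread_mutex_t polhd_mtx;
};

struct worker_arg {
	struct fake_tdb *tdb;
	int ninserts;
};

/* Each worker links `ninserts' new entries into the shared list. */
static void *
worker(void *v)
{
	struct worker_arg *arg = v;
	int i;

	for (i = 0; i < arg->ninserts; i++) {
		struct policy *p = malloc(sizeof(*p));

		if (p == NULL)
			abort();
		/* The list is only ever touched with the mutex held. */
		pthread_mutex_lock(&arg->tdb->polhd_mtx);
		TAILQ_INSERT_HEAD(&arg->tdb->policy_head, p, entry);
		pthread_mutex_unlock(&arg->tdb->polhd_mtx);
	}
	return NULL;
}

/* Run `nthreads' workers, then drain the list and return its length. */
int
run_demo(int nthreads, int ninserts)
{
	struct fake_tdb tdb;
	struct worker_arg arg = { &tdb, ninserts };
	pthread_t *threads;
	struct policy *p;
	int i, count = 0;

	TAILQ_INIT(&tdb.policy_head);
	pthread_mutex_init(&tdb.polhd_mtx, NULL);

	threads = calloc(nthreads, sizeof(*threads));
	if (threads == NULL)
		abort();

	for (i = 0; i < nthreads; i++)
		if (pthread_create(&threads[i], NULL, worker, &arg) != 0)
			abort();
	for (i = 0; i < nthreads; i++)
		pthread_join(threads[i], NULL);

	/* All workers have exited, so no locking is needed to drain. */
	while ((p = TAILQ_FIRST(&tdb.policy_head)) != NULL) {
		TAILQ_REMOVE(&tdb.policy_head, p, entry);
		free(p);
		count++;
	}
	assert(TAILQ_EMPTY(&tdb.policy_head));

	free(threads);
	pthread_mutex_destroy(&tdb.polhd_mtx);
	return count;
}
```

Built with -pthread, run_demo(4, 1000) should find all 4000 entries on the list. If the lock/unlock pair in worker() is removed, the insertions race and the list can end up corrupted. Note that the sketch only covers the list itself; a pointer stored elsewhere, such as the 'ipo->ipo_tdb' case mentioned in this thread, would need its own protection.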