On Thu, Jul 08, 2021 at 08:08:23AM +0200, Hrvoje Popovski wrote:
> On 8.7.2021. 0:10, Vitaliy Makkoveev wrote:
> > On Wed, Jul 07, 2021 at 11:07:08PM +0200, Hrvoje Popovski wrote:
> >> On 7.7.2021. 22:36, Vitaliy Makkoveev wrote:
> >>> Thanks. ipsp_spd_lookup() stopped panic in pool_get(9).
> >>>
> >>> I guess the panics continue because simultaneous modifications of
> >>> 'tdbp->tdb_policy_head' break it. Could you try the diff below? It
> >>> introduces `tdb_polhd_mtx' mutex(9) and uses it to protect
> >>> 'tdbp->tdb_policy_head' modifications. I don't propose this diff for
> >>> commit but to check my suggestion.
> >>
> >>
> >> Hi,
> >>
> >> with this diff i'm getting this panic
> >>
> >> r620-1# panic: acquiring blockable sleep lock with spinlock or critical
> >> section held (kernel_lock) &kernel_lock
> >> Stopped at      db_enter+0x10:  popq    %rbp
> >>     TID    PID    UID     PRFLAGS     PFLAGS  CPU  COMMAND
> >>  375321  87823      0     0x14000      0x200    5  crynlk
> >>  455594  99250      0     0x14000      0x200    0  crypto
> >>  124997  16472      0     0x14000      0x200    1  softnet
> >>  409214  30226      0     0x14000      0x200    3  softnet
> >>  347403  66039      0     0x14000      0x200    4  softnet
> >> *345146  25512      0     0x14000      0x200    2  softnet
> >> db_enter() at db_enter+0x10
> >> panic(ffffffff81e7ce76) at panic+0xbf
> >> witness_checkorder(ffffffff82348dc0,9,0) at witness_checkorder+0xbce
> >> __mp_lock(ffffffff82348bb8) at __mp_lock+0x5f
> >> kpageflttrap(ffff800023864a30,147) at kpageflttrap+0x178
> >> kerntrap(ffff800023864a30) at kerntrap+0x91
> >> alltraps_kern_meltdown() at alltraps_kern_meltdown+0x7b
> >> ipsp_spd_lookup(fffffd80a05e9200,2,14,ffff800023864d0c,2,0) at
> >> ipsp_spd_lookup+0x9fd
> >> ip_output_ipsec_lookup(fffffd80a05e9200,14,ffff800023864d0c,0,0) at
> >> ip_output_ipsec_lookup+0x4d
> >> ip_output(fffffd80a05e9200,0,ffff800023864e98,1,0,0) at ip_output+0x42a
> >> ip_forward(fffffd80a05e9200,ffff800000087048,fffffd83b39799a8,0) at
> >> ip_forward+0x26a
> >> ip_input_if(ffff800023864fd8,ffff800023864fe4,4,0,ffff800000087048) at
> >> ip_input_if+0x365
> >> ipv4_input(ffff800000087048,fffffd80a05e9200) at ipv4_input+0x39
> >> if_input_process(ffff800000087048,ffff800023865058) at 
> >> if_input_process+0x6f
> >> end trace frame: 0xffff8000238650a0, count: 0
> >> https://www.openbsd.org/ddb.html describes the minimum info required in
> >> bug reports.  Insufficient info makes it difficult to find and fix bugs.
> >> ddb{2}>
> >>
> >> ddb{2}> show locks
> >> shared rwlock netlock r = 0 (0xffffffff8219ce60)
> >> #0  witness_lock+0x339
> >> #1  if_input_process+0x43
> >> #2  ifiq_process+0x69
> >> #3  taskq_thread+0x9f
> >> #4  proc_trampoline+0x1c
> >> shared rwlock softnet r = 0 (0xffff800000030070)
> >> #0  witness_lock+0x339
> >> #1  taskq_thread+0x92
> >> #2  proc_trampoline+0x1c
> >> exclusive mutex /sys/netinet/ip_ipsp.c:95 r = 0 (0xffffffff82192398)
> >> #0  witness_lock+0x339
> >> #1  mtx_enter_try+0x95
> >> #2  mtx_enter+0x48
> >> #3  ipsp_spd_lookup+0x961
> >> #4  ip_output_ipsec_lookup+0x4d
> >> #5  ip_output+0x42a
> >> #6  ip_forward+0x26a
> >> #7  ip_input_if+0x365
> >> #8  ipv4_input+0x39
> >> #9  if_input_process+0x6f
> >> #10 ifiq_process+0x69
> >> #11 taskq_thread+0x9f
> >> #12 proc_trampoline+0x1c
> >>
> > 
> > Thanks.
> > 
> > Now panics only in ipsp_spd_lookup() and never in pfkeyv2_send() or in
> > tdb_free() called from pfkeyv2_send(), right?
> > 
> 
> Yes,
> 
> i can only trigger this panic
> 

Thanks.

That means simultaneous ipsp_spd_lookup() execution breaks not only
`tdb_policy_head' but also the `ipo->ipo_tdb' pointer.
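
To illustrate what I mean, here is a minimal userland sketch, not kernel
code and not a proposed diff. It assumes simplified stand-ins for the
real structures: `struct policy', `struct tdb', `polhd_mtx',
policy_attach(), policy_detach() and demo() are all hypothetical names,
and a pthread mutex stands in for mutex(9). The point is only that the
list linkage and the back pointer change together under one lock, so no
thread can observe one updated without the other.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/queue.h>

/* Hypothetical, simplified stand-ins for the kernel structures. */
struct tdb;
struct policy {
	TAILQ_ENTRY(policy)	 ipo_tdb_next;	/* linkage on the tdb's list */
	struct tdb		*ipo_tdb;	/* back pointer, NULL if unlinked */
};
struct tdb {
	TAILQ_HEAD(policy_head, policy) tdb_policy_head;
};

/* Assumption: one lock covers both the list and the back pointer. */
static pthread_mutex_t polhd_mtx = PTHREAD_MUTEX_INITIALIZER;

static void
policy_attach(struct policy *ipo, struct tdb *tdb)
{
	pthread_mutex_lock(&polhd_mtx);
	if (ipo->ipo_tdb != tdb) {
		/* Unlink from the old tdb before linking to the new one. */
		if (ipo->ipo_tdb != NULL)
			TAILQ_REMOVE(&ipo->ipo_tdb->tdb_policy_head, ipo,
			    ipo_tdb_next);
		ipo->ipo_tdb = tdb;
		TAILQ_INSERT_HEAD(&tdb->tdb_policy_head, ipo, ipo_tdb_next);
	}
	pthread_mutex_unlock(&polhd_mtx);
}

static void
policy_detach(struct policy *ipo)
{
	pthread_mutex_lock(&polhd_mtx);
	if (ipo->ipo_tdb != NULL) {
		TAILQ_REMOVE(&ipo->ipo_tdb->tdb_policy_head, ipo,
		    ipo_tdb_next);
		ipo->ipo_tdb = NULL;
	}
	pthread_mutex_unlock(&polhd_mtx);
}

#define NPOL	4
#define NTHR	4
#define NITER	100000

static struct tdb tdbs[2];
static struct policy pols[NPOL];

/* Each worker attaches and detaches the shared policies in a fixed pattern. */
static void *
worker(void *arg)
{
	int id = (int)(intptr_t)arg;

	for (int i = 0; i < NITER; i++) {
		struct policy *ipo = &pols[(i + id) % NPOL];

		if (i % 3 == 0)
			policy_detach(ipo);
		else
			policy_attach(ipo, &tdbs[(i + id) & 1]);
	}
	return NULL;
}

/* Returns 0 if the lists and the back pointers agree after the run. */
int
demo(void)
{
	pthread_t thr[NTHR];
	struct policy *ipo;
	int linked = 0, pointed = 0;

	for (int t = 0; t < 2; t++)
		TAILQ_INIT(&tdbs[t].tdb_policy_head);
	for (int p = 0; p < NPOL; p++)
		pols[p].ipo_tdb = NULL;

	for (int t = 0; t < NTHR; t++)
		if (pthread_create(&thr[t], NULL, worker,
		    (void *)(intptr_t)t) != 0)
			return -1;
	for (int t = 0; t < NTHR; t++)
		pthread_join(thr[t], NULL);

	for (int t = 0; t < 2; t++)
		TAILQ_FOREACH(ipo, &tdbs[t].tdb_policy_head, ipo_tdb_next) {
			if (ipo->ipo_tdb != &tdbs[t])
				return -1;
			linked++;
		}
	for (int p = 0; p < NPOL; p++)
		if (pols[p].ipo_tdb != NULL)
			pointed++;
	return linked == pointed ? 0 : -1;
}
```

Calling demo() from a small main() and checking for a zero return is
enough to exercise it. If the lock guarded only the TAILQ operations and
not the `ipo_tdb' assignment, two threads could each unlink the same
policy from a different list, which is the kind of breakage I suspect
here.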

Also, I'd like to remind you of the logic we have in sys/net/pfkeyv2.c:

        /*
         * XXXSMP IPsec data structures are not ready to be
         * accessed by multiple Network threads in parallel,
         * so force all packets to be processed by the first
         * one.
         */
        extern int nettaskqs;
        nettaskqs = 1;

It seems this does not work with the parallel forwarding diff.
