On Mon, Mar 21, 2016 at 01:25:08PM -0400, Bob Copeland wrote:
> mesh_path_tbl expired is called from mesh housekeeping. I'm not seeing
> immediately where there might be a bad pointer unless there is a race with
> initialization and sdata->u.mesh.mesh_paths is still null.
>
> But this does look wrong: sdata lock should be held here but we do a
> GFP_KERNEL allocation.

Please disregard that last point: we are of course holding a sleeping lock,
which is why I used GFP_KERNEL in the first place.

I gave it some testing between an ath9k and ath9k_htc node this morning
and did not hit a crash, but I did find an issue reported by lockdep:

[ 636.904173]  Possible interrupt unsafe locking scenario:
[ 636.904175]        CPU0                    CPU1
[ 636.904176]        ----                    ----
[ 636.904178]   lock(&(&ht->lock)->rlock);
[ 636.904180]                                local_irq_disable();
[ 636.904181]                                lock(&(&sta->mesh->plink_lock)->rlock);
[ 636.904184]                                lock(&(&ht->lock)->rlock);
[ 636.904186]   <Interrupt>
[ 636.904187]     lock(&(&sta->mesh->plink_lock)->rlock);
[ 636.904189]
 *** DEADLOCK ***

__mesh_plink_deactivate -> rhashtable_walk in process context does

    spin_lock_bh(plink_lock)
      -> spin_lock(rhash_lock)

whereas we could be in rhashtable_walk through some different path, and do

    spin_lock(rhash_lock)
    [mesh_plink_timer fires]
      spin_lock(plink_lock)

So I guess all rhashtable walks need local_bh_disable() too. But that
should look like a deadlock, not a crash.
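
To make the ordering problem concrete, here is a rough userspace sketch in C
that records which lock was held while another was taken and flags the
inverse order. It is only a toy model of the AB-BA reasoning above, not
kernel code and not how lockdep is implemented; the names order_graph,
graph_init, record_order, LOCK_PLINK and LOCK_RHASH are made up for
illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical identifiers standing in for plink_lock and the rhashtable lock. */
enum { LOCK_PLINK, LOCK_RHASH, NUM_LOCKS };

/* before[a][b] is true once lock a has been seen held while lock b was taken. */
struct order_graph {
	bool before[NUM_LOCKS][NUM_LOCKS];
};

static void graph_init(struct order_graph *g)
{
	memset(g, 0, sizeof(*g));
}

/*
 * Record that `inner` was acquired while `outer` was held.
 * Returns true if the opposite order was recorded earlier,
 * i.e. the two orders together form an AB-BA inversion.
 */
static bool record_order(struct order_graph *g, int outer, int inner)
{
	assert(outer >= 0 && outer < NUM_LOCKS);
	assert(inner >= 0 && inner < NUM_LOCKS);

	g->before[outer][inner] = true;
	return g->before[inner][outer];
}
```

Feeding it the process-context path first (LOCK_PLINK held, then LOCK_RHASH)
records an order without complaint; feeding it the timer case afterwards
(LOCK_RHASH held, then LOCK_PLINK) reports the inversion. In the real code
the second order only arises because the timer can fire while the hash lock
is held with bottom halves enabled, which is why disabling BHs around the
walk would close the window.
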
Although it doesn't look like your crash, there is a fix for one
shutdown-related race here that you can try:
https://patchwork.kernel.org/patch/8624601/
--
Bob Copeland %% http://bobcopeland.com/