On Mon, Mar 21, 2016 at 01:25:08PM -0400, Bob Copeland wrote:
> mesh_path_tbl expired is called from mesh housekeeping.  I'm not seeing
> immediately where there might be a bad pointer unless there is a race with
> initialization and sdata->u.mesh.mesh_paths is still null.
> 
> But this does look wrong: sdata lock should be held here but we do a
> GFP_KERNEL allocation.

Please disregard that: of course we are holding a sleeping lock there,
which is why I used GFP_KERNEL in the first place.

I gave it some testing between an ath9k and ath9k_htc node this morning
and did not hit a crash, but I did find an issue reported by lockdep:

[  636.904173]  Possible interrupt unsafe locking scenario:

[  636.904175]        CPU0                    CPU1
[  636.904176]        ----                    ----
[  636.904178]   lock(&(&ht->lock)->rlock);
[  636.904180]                                local_irq_disable();
[  636.904181]                                lock(&(&sta->mesh->plink_lock)->rlock);
[  636.904184]                                lock(&(&ht->lock)->rlock);
[  636.904186]   <Interrupt>
[  636.904187]     lock(&(&sta->mesh->plink_lock)->rlock);
[  636.904189]  *** DEADLOCK ***

__mesh_plink_deactivate -> rhashtable_walk in process context does

  spin_lock_bh(plink_lock)
    -> spin_lock(rhash_lock)

whereas we could be in rhashtable_walk through some different path, and do

  spin_lock(rhash_lock)
  [mesh_plink_timer fires]
    spin_lock(plink_lock)

So I guess all rhashtable walks need local_bh_disable() too.  But that
should look like a deadlock, not a crash.
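
To make the lockdep complaint concrete, here is a toy userspace model of the
rule it is applying. This is only a sketch, not kernel code and not how
lockdep is implemented: the names (lock_model, model_acquire, run_scenario,
and so on) are made up for illustration, and it only checks direct
"A held while taking B" dependencies. The rule modeled is: a lock that is
ever taken from softirq context must not be held while acquiring a lock that
is ever taken with BHs enabled.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-ins for plink_lock and the rhashtable ht->lock. */
enum { PLINK_LOCK, RHASH_LOCK, NUM_LOCKS };

struct lock_model {
	bool used_in_softirq[NUM_LOCKS];        /* ever taken from softirq */
	bool taken_bh_enabled[NUM_LOCKS];       /* ever taken with BHs enabled */
	bool held_before[NUM_LOCKS][NUM_LOCKS]; /* [a][b]: b taken while a held */
};

struct ctx {
	bool in_softirq;
	bool bh_disabled;
	bool held[NUM_LOCKS];
};

static void model_acquire(struct lock_model *m, struct ctx *c, int lock)
{
	int i;

	if (c->in_softirq)
		m->used_in_softirq[lock] = true;
	else if (!c->bh_disabled)
		m->taken_bh_enabled[lock] = true;

	/* Record a dependency from every lock already held to this one. */
	for (i = 0; i < NUM_LOCKS; i++)
		if (c->held[i])
			m->held_before[i][lock] = true;

	c->held[lock] = true;
}

static void model_release(struct ctx *c, int lock)
{
	c->held[lock] = false;
}

/* True if a softirq-used lock is held while taking a BH-enabled lock. */
static bool model_has_softirq_inversion(const struct lock_model *m)
{
	int a, b;

	for (a = 0; a < NUM_LOCKS; a++)
		for (b = 0; b < NUM_LOCKS; b++)
			if (m->held_before[a][b] &&
			    m->used_in_softirq[a] &&
			    m->taken_bh_enabled[b])
				return true;
	return false;
}

/* Replays the three paths described above; returns true if unsafe. */
bool run_scenario(bool walker_disables_bh)
{
	struct lock_model m;
	struct ctx deact = { .bh_disabled = true };   /* spin_lock_bh path */
	struct ctx walker = { .bh_disabled = walker_disables_bh };
	struct ctx timer = { .in_softirq = true };    /* plink timer */

	memset(&m, 0, sizeof(m));

	/* spin_lock_bh(plink_lock) -> spin_lock(rhash_lock) */
	model_acquire(&m, &deact, PLINK_LOCK);
	model_acquire(&m, &deact, RHASH_LOCK);
	model_release(&deact, RHASH_LOCK);
	model_release(&deact, PLINK_LOCK);

	/* Some other walk takes rhash_lock on its own. */
	model_acquire(&m, &walker, RHASH_LOCK);
	model_release(&walker, RHASH_LOCK);

	/* The timer takes plink_lock from softirq context. */
	model_acquire(&m, &timer, PLINK_LOCK);
	model_release(&timer, PLINK_LOCK);

	return model_has_softirq_inversion(&m);
}
```

In this model run_scenario(false) flags the inversion and
run_scenario(true) does not, which matches the idea that disabling BHs
around every walk would remove the unsafe state for the hash lock.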

Although it doesn't look like your crash, there is a fix for one
shutdown-related race here that you can try:

    https://patchwork.kernel.org/patch/8624601/

-- 
Bob Copeland %% http://bobcopeland.com/
_______________________________________________
Devel mailing list
[email protected]
http://lists.open80211s.org/mailman/listinfo/devel
