On Thu, 25 Feb 2021 17:24:48 +0200 Tariq Toukan <ttoukan.li...@gmail.com> wrote:
> > Hi,
> >
> > Issue still reproduces. Even in GA kernel.
> > It is always preceded by some other lockdep warning.
> >
> > So to get the reproduction:
> > - First, have any lockdep issue.
> > - Then, open bond interface.
> >
> > Any idea what could it be?
> >
> > We'll share any new info as soon as we have it.

Looks like you are triggering:

int bond_update_slave_arr(struct bonding *bond, struct slave *skipslave)
{
	struct bond_up_slave *usable_slaves = NULL, *all_slaves = NULL;
	struct slave *slave;
	struct list_head *iter;
	int agg_id = 0;
	int ret = 0;

#ifdef CONFIG_LOCKDEP
	WARN_ON(lockdep_is_held(&bond->mode_lock));
#endif

And the below commit made lockdep_is_held() always return true if lockdep
has been previously triggered. That is, if you had a lockdep splat earlier,
then lockdep_is_held() will always return true, and this WARN_ON() will
always trigger.

Peter,

Perhaps we should not have this part of your patch:

@@ -5056,13 +5081,13 @@ noinstr int lock_is_held_type(const struct lockdep_map *lock, int read)
 	unsigned long flags;
 	int ret = 0;
 
-	if (unlikely(current->lockdep_recursion))
+	if (unlikely(!lockdep_enabled()))
 		return 1; /* avoid false negative lockdep_assert_held() */
 
 	raw_local_irq_save(flags);
 	check_flags(flags);

Because that changes how lock_is_held_type() behaves: it will return true
if there has been an earlier lockdep splat, and any code that has something
like the above is going to fail.

Although, checking if a lock is not held seems rather strange. If anything,
the above should be changed to WARN_ON_ONCE() so that it doesn't constantly
trigger once a lockdep splat has happened.

-- Steve

> >
> > Regards,
> > Tariq
> >
> Bisect shows this is the offending commit:
>
> commit 4d004099a668c41522242aa146a38cc4eb59cb1e
> Author: Peter Zijlstra <pet...@infradead.org>
> Date:   Fri Oct 2 11:04:21 2020 +0200
>
>     lockdep: Fix lockdep recursion
>
>     Steve reported that lockdep_assert*irq*(), when nested inside lockdep
>     itself, will trigger a false-positive.
>
>     One example is the stack-trace code, as called from inside lockdep,
>     triggering tracing, which in turn calls RCU, which then uses
>     lockdep_assert_irqs_disabled().
>
>     Fixes: a21ee6055c30 ("lockdep: Change hardirq{s_enabled,_context}
> to per-cpu variables")
>     Reported-by: Steven Rostedt <rost...@goodmis.org>
>     Signed-off-by: Peter Zijlstra (Intel) <pet...@infradead.org>
>     Signed-off-by: Ingo Molnar <mi...@kernel.org>
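
To make the failure mode concrete, here is a minimal user-space sketch of the behaviour described in the reply. It is not kernel code: every model_* name is a hypothetical stand-in, and it assumes lockdep counts as disabled once an earlier splat has cleared its global enable flag. It shows why a "warn if held" check fires on every call after a splat, and how a warn-once variant would limit that to a single warning.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for illustration; none of these are kernel symbols. */
static bool model_debug_locks = true;	/* assumed to be cleared by the first splat */
static int model_recursion;		/* nonzero while "inside lockdep" */

struct model_lock {
	bool held;
};

static bool model_lockdep_enabled(void)
{
	return model_debug_locks && model_recursion == 0;
}

/* Models the patched early return: report "held" whenever lockdep is off. */
static int model_lock_is_held(const struct model_lock *lock)
{
	if (!model_lockdep_enabled())
		return 1;	/* avoids false negatives in "assert held" checks */
	return lock->held ? 1 : 0;
}

/*
 * Count the warnings a "warn if held" check would emit over a number of
 * calls. With once set, only the first hit is counted, which models the
 * WARN_ON_ONCE() suggestion.
 */
static int model_count_warnings(const struct model_lock *lock, int calls, bool once)
{
	int warnings = 0;

	assert(calls >= 0);
	for (int i = 0; i < calls; i++) {
		if (!model_lock_is_held(lock))
			continue;
		if (once && warnings > 0)
			continue;
		warnings++;
	}
	return warnings;
}
```

In this model an unheld lock produces no warnings while model_debug_locks is set. After it is cleared, every call warns unless the warn-once variant is used. Note that warn-once only reduces the noise; the "not held" check still gives a wrong answer after a splat.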