IIRC, it was easy to reproduce by cranking the rebalance frequency
up (1s or even faster) and also introducing a delay of a few
milliseconds in the bond_alb.c:tlb_clear_slave() routine, between
where we drop the lock and where we call tlb_init_slave().
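
For anyone who wants to see the window without patching a kernel, here is
a rough single-threaded user-space model of it. This is only a sketch and
not the driver code: every model_* name, the tiny table size, and the
integer "locked" flag standing in for the hash-table lock are made up for
illustration. The simulated sender plays the part of tlb_choose_channel()
arriving during the delay described above.

```c
#include <assert.h>
#include <stddef.h>

#define NULL_INDEX 0xffffffffu  /* plays the role of TLB_NULL_INDEX */
#define TABLE_SIZE 8u           /* tiny table, for illustration only */

struct model_slave { unsigned head; };

struct model_entry {
    struct model_slave *tx_slave;
    unsigned next;
};

struct model_bond {
    int locked;                 /* 1 while the table "lock" is held */
    struct model_entry tbl[TABLE_SIZE];
};

static void model_init(struct model_bond *b, struct model_slave *s)
{
    b->locked = 0;
    for (unsigned i = 0; i < TABLE_SIZE; i++) {
        b->tbl[i].tx_slave = NULL;
        b->tbl[i].next = NULL_INDEX;
    }
    s->head = NULL_INDEX;
}

/* Stand-in for the transmit path: bind entry idx to slave s.
 * Returns 0 if the lock is held (a real sender would wait). */
static int model_choose(struct model_bond *b, struct model_slave *s,
                        unsigned idx)
{
    assert(idx < TABLE_SIZE);
    if (b->locked)
        return 0;
    b->tbl[idx].tx_slave = s;
    b->tbl[idx].next = s->head;
    s->head = idx;
    return 1;
}

/* Clear s from the table.  A simulated sender tries to claim racer_idx
 * just before the slave's head is reset.  init_under_lock selects the
 * patched ordering (reset head, then unlock). */
static void model_clear(struct model_bond *b, struct model_slave *s,
                        int init_under_lock, unsigned racer_idx)
{
    unsigned idx;
    int raced;

    b->locked = 1;
    idx = s->head;
    while (idx != NULL_INDEX) {
        unsigned next = b->tbl[idx].next;
        b->tbl[idx].tx_slave = NULL;
        b->tbl[idx].next = NULL_INDEX;
        idx = next;
    }
    if (!init_under_lock)
        b->locked = 0;                      /* old order: unlock first */

    raced = model_choose(b, s, racer_idx);  /* the delay window */

    s->head = NULL_INDEX;                   /* head reset, as in tlb_init_slave() */
    b->locked = 0;

    if (!raced)
        model_choose(b, s, racer_idx);      /* sender runs once unlocked */
}

/* Count entries that point at s but are not reachable from s->head. */
static unsigned model_orphans(const struct model_bond *b,
                              const struct model_slave *s)
{
    unsigned on_list = 0, owned = 0;

    for (unsigned i = s->head; i != NULL_INDEX; i = b->tbl[i].next)
        on_list++;
    for (unsigned i = 0; i < TABLE_SIZE; i++)
        if (b->tbl[i].tx_slave == s)
            owned++;
    return owned - on_list;
}
```

With init_under_lock set to 0 the sender gets in before the head is
reset, so model_orphans() reports an entry that still names the slave
but that a later clear would never visit. With it set to 1 the sender
has to wait and the table stays consistent.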

  --Michael O'Donnell  --  Stratus Technologies, Maynard, MA USA

> -----Original Message-----
> From: Jay Vosburgh [mailto:[EMAIL PROTECTED] 
> Sent: Monday, January 09, 2006 3:14 PM
> To: [EMAIL PROTECTED]; [email protected]
> Cc: ODonnell, Michael
> Subject: [PATCH netdev-2.6] bonding: UPDATED hash-table corruption in bond_alb.c
> 
> 
>       I believe I see the race Michael refers to (tlb_choose_channel
> may set head, which tlb_init_slave clears), although I was not able to
> reproduce it.  I have updated his patch for the current netdev-2.6.git
> tree and added a version update.  His original comment follows:
> 
> Our systems have been crashing during testing of PCI HotPlug
> support in the various networking components.  We've faulted in
> the bonding driver due to a bug in bond_alb.c:tlb_clear_slave().
> 
> In that routine, the last modification to the TLB hash table is
> made without protection of the lock, allowing a race that can lead
> tlb_choose_channel() to select an invalid table element.
> 
>       -J
> 
> ---
>       -Jay Vosburgh, IBM Linux Technology Center, [EMAIL PROTECTED]
> 
> 
> Signed-off-by: Michael O'Donnell <Michael.ODonnell at stratus dot com>
> Signed-off-by: Jay Vosburgh <[EMAIL PROTECTED]>
> 
> --- netdev-2.6.git-upstream/drivers/net/bonding/bond_alb.c    2006/01/07 00:26:11   1.1
> +++ netdev-2.6.git-upstream/drivers/net/bonding/bond_alb.c    2006/01/09 19:55:12
> @@ -169,9 +169,9 @@
>               index = next_index;
>       }
>  
> -     _unlock_tx_hashtbl(bond);
> -
>       tlb_init_slave(slave);
> +
> +     _unlock_tx_hashtbl(bond);
>  }
>  
>  /* Must be called before starting the monitor timer */
> --- netdev-2.6.git-upstream/drivers/net/bonding/bonding.h     2006/01/07 00:26:11   1.1
> +++ netdev-2.6.git-upstream/drivers/net/bonding/bonding.h     2006/01/09 19:55:42
> @@ -22,8 +22,8 @@
>  #include "bond_3ad.h"
>  #include "bond_alb.h"
>  
> -#define DRV_VERSION  "3.0.0"
> -#define DRV_RELDATE  "November 8, 2005"
> +#define DRV_VERSION  "3.0.1"
> +#define DRV_RELDATE  "January 9, 2006"
>  #define DRV_NAME     "bonding"
>  #define DRV_DESCRIPTION      "Ethernet Channel Bonding Driver"
>  
> 