Ping Dave,

Since there doesn't seem to be any new ideas forthcoming, can we
please decide on either one of my two sumbitted patches?

Thanks,
Miklos

> [CC'd Al Viro and Alan Cox, restored patch]
> 
> > > There are races involving the garbage collector, that can throw away
> > > perfectly good packets with AF_UNIX sockets in them.
> > > 
> > > The problems arise when a socket goes from installed to in-flight or
> > > vice versa during garbage collection.  Since gc is done with a
> > > spinlock held, this only shows up on SMP.
> > > 
> > > Signed-off-by: Miklos Szeredi <[EMAIL PROTECTED]>
> > 
> > I'm going to hold off on this one for now.
> > 
> > Holding all of the read locks kind of defeats the purpose of using
> > the per-socket lock.
> > 
> > Can't you just lock purely around the receive queue operation?
> 
> That's already protected by the receive queue spinlock.  The race
> however happens _between_ pushing the root set and marking of the
> in-flight but reachable sockets.
> 
> If in that space any of the AF_UNIX sockets goes from in-flight to
> installed into a file descriptor, the garbage collector can miss it.
> If we want to protect against this using unix_sk(s)->readlock, then we
> have to hold all of them for the duration of the marking.
> 
> Al, Alan, you have more experience with this piece of code.  Do you
> have better ideas about how to fix this?
> 
> Thanks,
> Miklos
> 
> > Index: linux-2.6.22-rc2/net/unix/garbage.c
> > ===================================================================
> > --- linux-2.6.22-rc2.orig/net/unix/garbage.c        2007-06-03 
> > 23:58:11.000000000 +0200
> > +++ linux-2.6.22-rc2/net/unix/garbage.c     2007-06-06 09:48:36.000000000 
> > +0200
> > @@ -186,7 +186,21 @@ void unix_gc(void)
> >  
> >     forall_unix_sockets(i, s)
> >     {
> > -           unix_sk(s)->gc_tree = GC_ORPHAN;
> > +           struct unix_sock *u = unix_sk(s);
> > +
> > +           u->gc_tree = GC_ORPHAN;
> > +
> > +           /*
> > +            * Hold ->readlock to protect against sockets going from
> > +            * in-flight to installed
> > +            *
> > +            * Can't sleep on this, because
> > +            *   a) we are under spinlock
> > +            *   b) skb_recv_datagram() could be waiting for a packet that
> > +            *      is to be sent by this thread
> > +            */
> > +           if (!mutex_trylock(&u->readlock))
> > +                   goto lock_failed;
> >     }
> >     /*
> >      *      Everything is now marked
> > @@ -207,8 +221,6 @@ void unix_gc(void)
> >  
> >     forall_unix_sockets(i, s)
> >     {
> > -           int open_count = 0;
> > -
> >             /*
> >              *      If all instances of the descriptor are not
> >              *      in flight we are in use.
> > @@ -218,10 +230,20 @@ void unix_gc(void)
> >              *      In this case (see unix_create1()) we set artificial
> >              *      negative inflight counter to close race window.
> >              *      It is trick of course and dirty one.
> > +            *
> > +            *      Get the inflight counter first, then the open
> > +            *      counter.  This avoids problems if racing with
> > +            *      sendmsg
> > +            *
> > +            *      If just created socket is not yet attached to
> > +            *      a file descriptor, assume open_count of 1
> >              */
> > +           int inflight_count = atomic_read(&unix_sk(s)->inflight);
> > +           int open_count = 1;
> > +
> >             if (s->sk_socket && s->sk_socket->file)
> >                     open_count = file_count(s->sk_socket->file);
> > -           if (open_count > atomic_read(&unix_sk(s)->inflight))
> > +           if (open_count > inflight_count)
> >                     maybe_unmark_and_push(s);
> >     }
> >  
> > @@ -300,6 +322,7 @@ void unix_gc(void)
> >                     spin_unlock(&s->sk_receive_queue.lock);
> >             }
> >             u->gc_tree = GC_ORPHAN;
> > +           mutex_unlock(&u->readlock);
> >     }
> >     spin_unlock(&unix_table_lock);
> >  
> > @@ -309,4 +332,19 @@ void unix_gc(void)
> >  
> >     __skb_queue_purge(&hitlist);
> >     mutex_unlock(&unix_gc_sem);
> > +   return;
> > +
> > + lock_failed:
> > +   {
> > +           struct sock *s1;
> > +           forall_unix_sockets(i, s1) {
> > +                   if (s1 == s) {
> > +                           spin_unlock(&unix_table_lock);
> > +                           mutex_unlock(&unix_gc_sem);
> > +                           return;
> > +                   }
> > +                   mutex_unlock(&unix_sk(s1)->readlock);
> > +           }
> > +           BUG();
> > +   }
> >  }
> > 
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Reply via email to