Alexander, I spent quite some time working on this problem and I found some interesting information, see below.
On 31/10/13(Thu) 17:20, Alexander Bluhm wrote: > On Thu, Oct 31, 2013 at 09:56:11AM +0100, Martin Pieuchot wrote: > > On 30/10/13(Wed) 16:48, Alexander Bluhm wrote: > > [...] > > > > I'm not sure to get the whole picture about this ioctl called in an > > interrupt context so feel free to correct me, but the only offending > > function is nd6_timer(), is that right? > > No, this code is also called from softnet interrupt context. An > autoconfigured IPv6 address can deleted by a timeout or by a router > advertisement. > > nd6_ra_input() -> prelist_update() -> nd6_prelist_add() -> > purge_detached() -> in6_purgeaddr() -> in6_leavegroup() By looking at the history of nd6_rtr.c, rev 1.54 tell us that this problem of adding/deleting a prefix or address in interrupt context is not new. Unfortunately the hack introduced at that time only works for a subset of prelist_update(): creating an addresses. But the story of prelist_update() does not end here. Since rev 1.55 was introduced to work around another problem caused by the fact that prelist_update() can run multiple times before the task scheduled to create an address has run. So I think that we should modify how/when prelist_update() is executed, > In theory we can ignore the incomming packet, we may always loose > packets and hope for a retransmit. But that would mean to avoid > the sleep and do a rollback. That would be a horrible diff. I'm not sure to understand the whole picture once again, but can't we defer the (whole) processing of a router advertisement message in a task? > > Finally since we are moving away from workq(9) to task(9) I'd suggest > > you to use the latter. > > That is easy, but it does not solve the problems you describe. I > have to call task_set() in in6_leavegroup() to set the if_index. > May be calling task_set() can be moved to somewhere else but that > does not look easy. > > I am still not happy with this diff although I think it does not > cause memory corruption. Look at this call path: > > if_detach() -> in6_ifdetach() -> in6_purgeaddr() -> in6_leavegroup() > > As I delay the SIOCDELMULTI after the interface is destroyed, > the ioctl is never called. That doesn't matter ;) If the interface is gone there's no hardware filter to update. > Perhaps we look at the problem from the wrong direction. Why do > some network drivers sleep when changing multicast groups? Did > they always do that or was something changed? Some driver sleep because that's how their author/porter wrote them since they are allowed to sleep in ioctls. Most (all) of the USB drivers are written this way, since synchronous transfer submission makes the caller sleep until the transaction is processed. But our problem here is that the nd6 code *abuse* the ioctl interface in interrupt context. So we could take another direction and design a new interface that does not allow to sleep when updating the multicast filters of an interface. But I believe it would requires much more changes and since this whole prelist_update() looks like a whole hack to me, I would prefer if someone could clean it. Martin