On Mon, Sep 08, 2025 at 01:47:24PM -0700, Calvin Owens wrote:
> On Friday 09/05 at 10:25 -0700, Breno Leitao wrote:
> > commit efa95b01da18 ("netpoll: fix use after free") incorrectly
> > ignored the refcount and prematurely set dev->npinfo to NULL during
> > netpoll cleanup, leading to improper behavior and memory leaks.
> > 
> > Scenario causing lack of proper cleanup:
> > 
> > 1) A netpoll is associated with a NIC (e.g., eth0) and netdev->npinfo is
> >    allocated, and refcnt = 1
> >    - Keep in mind that npinfo is shared among all netpoll instances. In
> >      this case, there is just one.
> > 
> > 2) Another netpoll is also associated with the same NIC and
> >    npinfo->refcnt += 1.
> >    - Now dev->npinfo->refcnt = 2;
> >    - There is just one npinfo associated to the netdev.
> > 
> > 3) When the first netpoll goes to clean up:
> >    - The first cleanup succeeds and clears np->dev->npinfo, ignoring
> >      refcnt.
> >      - It basically calls `RCU_INIT_POINTER(np->dev->npinfo, NULL);`
> >    - Set dev->npinfo = NULL, without proper cleanup
> >    - ->ndo_netpoll_cleanup() is not called either
> > 
> > 4) Now the second target tries to clean up
> >    - The second cleanup fails because np->dev->npinfo is already NULL.
> >      * In this case, ops->ndo_netpoll_cleanup() was never called, and
> >        the skb pool is not cleaned as well (for the second netpoll
> >        instance)
> >    - This leaks npinfo and skbpool skbs, which is clearly reported by
> >      kmemleak.
> > 
> > Revert commit efa95b01da18 ("netpoll: fix use after free") and add
> > clarifying comments emphasizing that npinfo cleanup should only happen
> > once the refcount reaches zero, ensuring stable and correct netpoll
> > behavior.
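
To make the refcount rule above concrete, here is a minimal userspace
sketch of the intended behaviour. It is only a model: struct model_dev,
struct model_npinfo, model_setup() and model_cleanup() are hypothetical
stand-ins, not the kernel's types or functions, and locking and RCU are
deliberately left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for the npinfo shared by all netpolls on a device. */
struct model_npinfo {
	int refcnt;			/* number of netpoll instances attached */
};

/* Hypothetical stand-in for the net device. */
struct model_dev {
	struct model_npinfo *npinfo;	/* one npinfo per device, shared */
	int cleanup_calls;		/* simulated ->ndo_netpoll_cleanup() calls */
};

/* Attach one netpoll: allocate npinfo on first use, otherwise share it. */
static int model_setup(struct model_dev *dev)
{
	if (!dev->npinfo) {
		dev->npinfo = calloc(1, sizeof(*dev->npinfo));
		if (!dev->npinfo)
			return -1;
	}
	dev->npinfo->refcnt++;
	return 0;
}

/* Detach one netpoll: tear down only when the last reference is dropped. */
static void model_cleanup(struct model_dev *dev)
{
	struct model_npinfo *npinfo = dev->npinfo;

	if (!npinfo)
		return;			/* nothing attached */

	assert(npinfo->refcnt > 0);
	if (--npinfo->refcnt == 0) {
		dev->cleanup_calls++;	/* models ops->ndo_netpoll_cleanup() */
		dev->npinfo = NULL;	/* models RCU_INIT_POINTER(..., NULL) */
		free(npinfo);		/* real code must wait for RCU readers */
	}
}
```

With two model_setup() calls on the same device, the first
model_cleanup() only drops refcnt to 1 and leaves dev->npinfo in place;
the second one runs the cleanup hook once and clears the pointer.
Clearing the pointer on the first call instead, as in steps 3 and 4 of
the scenario, is what leaves the second instance with nothing to clean.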
> 
> This makes sense to me.
> 
> Just curious, did you try the original OOPS reproducer?
> https://lore.kernel.org/lkml/96b940137a50e5c387687bb4f57de8b0435a653f.1404857349.git.de...@googlers.com/

Yes, but I have not been able to reproduce the problem at all.
I have tested it using netdevsim, and here is a quick log of what I
ran:

        + modprobe netconsole
        + modprobe bonding mode=4
        [   86.540950] Warning: miimon must be specified, otherwise bonding will not detect link failure, speed and duplex which are essential for 802.3ad operation
        [   86.541617] Forcing miimon to 100msec
        [   86.541893] MII link monitoring set to 100 ms
        + echo +bond0
        [   86.547802] bonding: bond0 is being created...
        + ifconfig bond0 192.168.56.3 up
        + mkdir /sys/kernel/config/netconsole/blah
        + echo 0
        [   86.614772] netconsole: network logging has already stopped
        ./run.sh: line 19: echo: write error: Invalid argument
        + echo bond0
        + echo 192.168.56.42
        + echo 1
        [   86.622318] netconsole: netconsole: local port 6665
        [   86.622550] netconsole: netconsole: local IPv4 address 0.0.0.0
        [   86.622819] netconsole: netconsole: interface name 'bond0'
        [   86.623038] netconsole: netconsole: local ethernet address '00:00:00:00:00:00'
        [   86.623466] netconsole: netconsole: remote port 6666
        [   86.623675] netconsole: netconsole: remote IPv4 address 192.168.56.42
        [   86.623924] netconsole: netconsole: remote ethernet address ff:ff:ff:ff:ff:ff
        [   86.624264] netpoll: netconsole: local IP 192.168.56.3
        [   86.643174] netconsole: network logging started
        + ifenslave bond0 eth1
        [   86.659899] bond0: (slave eth1): Enslaving as a backup interface with a down link
        + ifenslave bond0 eth2
        [   86.687630] bond0: (slave eth2): Enslaving as a backup interface with a down link
        + sleep 3
        + ifenslave -d bond0 eth1
        [   89.735701] bond0: (slave eth1): Releasing backup interface
        [   89.737239] bond0: (slave eth1): the permanent HWaddr of slave - 06:44:84:94:87:c7 - is still in use by bond - set the HWaddr of slave to a different address to avoid conflicts
        + sleep 1
        + echo -bond0
        [   90.798676] bonding: bond0 is being deleted...
        [   90.815595] netconsole: network logging stopped on interface bond0 as it unregistered
        [   90.816416] bond0 (unregistering): (slave eth2): Releasing backup interface
        [   90.863054] bond0 (unregistering): Released all slaves
        + ls -lR /
        + tail -30
        <snip>

        + echo +bond0
        ./run.sh: line 39: /sys/class/net/bonding_masters: Permission denied
        + ifconfig bond0 192.168.56.3 up
        SIOCSIFADDR: No such device
        bond0: ERROR while getting interface flags: No such device
        bond0: ERROR while
