----- Original Message -----
On 2019/9/3 14:06, Jason Wang wrote:
On 2019/9/3 下午1:42, Yang Yingliang wrote:
On 2019/9/3 11:03, Jason Wang wrote:
On 2019/9/3 上午9:45, Yang Yingliang wrote:
On 2019/9/2 13:32, Jason Wang wrote:
On 2019/8/23 下午5:36, Yang Yingliang wrote:
On 2019/8/23 11:05, Jason Wang wrote:
----- Original Message -----
On 2019/8/22 14:07, Yang Yingliang wrote:
On 2019/8/22 10:13, Jason Wang wrote:
On 2019/8/20 上午10:28, Jason Wang wrote:
On 2019/8/20 上午9:25, David Miller wrote:
From: Yang Yingliang <yangyingli...@huawei.com>
Date: Mon, 19 Aug 2019 21:31:19 +0800
Call tun_attach() after register_netdevice() to make sure
tfile->tun
is not published until the netdevice is registered. So the
read/write
thread can not use the tun pointer that may freed by
free_netdev().
(The tun and dev pointer are allocated by
alloc_netdev_mqs(), they
can
be freed by netdev_freemem().)
register_netdevice() must always be the last operation in
the order of
network device setup.
At the point register_netdevice() is called, the device is
visible
globally
and therefore all of it's software state must be fully
initialized and
ready for us.
You're going to have to find another solution to these
problems.
The device is loosely coupled with sockets/queues. Each
side is
allowed to be go away without caring the other side. So
in this
case, there's a small window that network stack think the
device has
one queue but actually not, the code can then safely drop
them.
Maybe it's ok here with some comments?
Or if not, we can try to hold the device before tun_attach
and drop
it after register_netdevice().
Hi Yang:
I think maybe we can try to hold refcnt instead of playing
real num
queues here. Do you want to post a V4?
I think the refcnt can prevent freeing the memory in this
case.
When register_netdevice() failed, free_netdev() will be called
directly,
dev->pcpu_refcnt and dev are freed without checking refcnt of
dev.
How about using patch-v1 that using a flag to check whether the
device
registered successfully.
As I said, it lacks sufficient locks or barriers. To be clear, I
meant
something like (compile-test only):
diff --git a/drivers/net/tun.c b/drivers/net/tun.c
index db16d7a13e00..e52678f9f049 100644
--- a/drivers/net/tun.c
+++ b/drivers/net/tun.c
@@ -2828,6 +2828,7 @@ static int tun_set_iff(struct net *net,
struct file *file, struct ifreq *ifr)
(ifr->ifr_flags & TUN_FEATURES);
INIT_LIST_HEAD(&tun->disabled);
+ dev_hold(dev);
err = tun_attach(tun, file, false,
ifr->ifr_flags & IFF_NAPI,
ifr->ifr_flags &
IFF_NAPI_FRAGS);
if (err < 0)
@@ -2836,6 +2837,7 @@ static int tun_set_iff(struct net *net,
struct file *file, struct ifreq *ifr)
err = register_netdevice(tun->dev);
if (err < 0)
goto err_detach;
+ dev_put(dev);
}
netif_carrier_on(tun->dev);
@@ -2852,11 +2854,13 @@ static int tun_set_iff(struct net *net,
struct file *file, struct ifreq *ifr)
return 0;
err_detach:
+ dev_put(dev);
tun_detach_all(dev);
/* register_netdevice() already called
tun_free_netdev() */
goto err_free_dev;
err_free_flow:
+ dev_put(dev);
tun_flow_uninit(tun);
security_tun_dev_free_security(tun->security);
err_free_stat:
What's your thought?
The dev pointer are freed without checking the refcount in
free_netdev() called by err_free_dev
path, so I don't understand how the refcount protects this
pointer.
The refcount are guaranteed to be zero there, isn't it?
No, it's not.
err_free_dev:
free_netdev(dev);
void free_netdev(struct net_device *dev)
{
...
/* pcpu_refcnt can be freed without checking refcount */
free_percpu(dev->pcpu_refcnt);
dev->pcpu_refcnt = NULL;
/* Compatibility with error handling in drivers */
if (dev->reg_state == NETREG_UNINITIALIZED) {
/* dev can be freed without checking refcount */
netdev_freemem(dev);
return;
}
...
}
Right, but what I meant is in my patch, when code reaches
free_netdev() the refcnt is zero. What did I miss?
Yes, but it can't fix the UAF problem.
Well, it looks to me that the dev_put() in tun_put() won't release the
device in this case.
The device is not released in tun_put().
This is how the UAF occurs:
CPUA CPUB
tun_set_iff()
alloc_netdev_mqs()
tun_attach()
tun_chr_read_iter()
tun_get()
tun_do_read()
tun_ring_recv()
register_netdevice() <-- inject error
goto err_detach
tun_detach_all() <-- set RCV_SHUTDOWN
free_netdev() <-- called from
err_free_dev path
netdev_freemem() <-- free the memory
without check refcount
(In this path, the refcount cannot prevent
freeing the memory of dev, and the memory
will be used by dev_put() called by
tun_chr_read_iter() on CPUB.)
(Break from
tun_ring_recv(),
because
RCV_SHUTDOWN
is set)
tun_put()
dev_put() <--
use the
memory freed by
netdev_freemem()
My bad, thanks for the patience. Since all evil come from the
tfile->tun, how about delay the publishing of tfile->tun until the
success of registration to make sure dev_put() and dev_hold() work.
(Compile test only)
diff --git a/drivers/net/tun.c b/drivers/net/tun.c
index db16d7a13e00..aab0be40d443 100644
--- a/drivers/net/tun.c
+++ b/drivers/net/tun.c
@@ -787,7 +787,8 @@ static void tun_detach_all(struct net_device *dev)
}
static int tun_attach(struct tun_struct *tun, struct file *file,
- bool skip_filter, bool napi, bool napi_frags)
+ bool skip_filter, bool napi, bool napi_frags,
+ bool publish_tun)
{
struct tun_file *tfile = file->private_data;
struct net_device *dev = tun->dev;
@@ -870,7 +871,8 @@ static int tun_attach(struct tun_struct *tun,
struct file *file,
* initialized tfile; otherwise we risk using half-initialized
* object.
*/
- rcu_assign_pointer(tfile->tun, tun);
+ if (publish_tun)
+ rcu_assign_pointer(tfile->tun, tun);
rcu_assign_pointer(tun->tfiles[tun->numqueues], tfile);
tun->numqueues++;
tun_set_real_num_queues(tun);
@@ -2730,7 +2732,7 @@ static int tun_set_iff(struct net *net, struct
file *file, struct ifreq *ifr)
err = tun_attach(tun, file, ifr->ifr_flags & IFF_NOFILTER,
ifr->ifr_flags & IFF_NAPI,
- ifr->ifr_flags & IFF_NAPI_FRAGS);
+ ifr->ifr_flags & IFF_NAPI_FRAGS, true);
if (err < 0)
return err;
@@ -2829,13 +2831,17 @@ static int tun_set_iff(struct net *net,
struct file *file, struct ifreq *ifr)
INIT_LIST_HEAD(&tun->disabled);
err = tun_attach(tun, file, false, ifr->ifr_flags & IFF_NAPI,
- ifr->ifr_flags & IFF_NAPI_FRAGS);
+ ifr->ifr_flags & IFF_NAPI_FRAGS, false);
if (err < 0)
goto err_free_flow;
err = register_netdevice(tun->dev);
if (err < 0)
goto err_detach;
+ /* free_netdev() won't check refcnt, to aovid race
+ * with dev_put() we need publish tun after registration.
+ */
+ rcu_assign_pointer(tfile->tun, tun);
}
netif_carrier_on(tun->dev);
@@ -2978,7 +2984,7 @@ static int tun_set_queue(struct file *file,
struct ifreq *ifr)
if (ret < 0)
goto unlock;
ret = tun_attach(tun, file, false, tun->flags & IFF_NAPI,
- tun->flags & IFF_NAPI_FRAGS);
+ tun->flags & IFF_NAPI_FRAGS, true);
} else if (ifr->ifr_flags & IFF_DETACH_QUEUE) {
tun = rtnl_dereference(tfile->tun);
if (!tun || !(tun->flags & IFF_MULTI_QUEUE) ||
tfile->detached)