On 12/24/2017 07:49 PM, Wei Yongjun wrote:
> When dev_requeue_skb() is called with a bulked skb list, only the
> first skb of the list is requeued to the qdisc layer; the others
> are leaked without being freed.
> 
> TCP is broken by this skb leak: since the leaked skbs are never
> freed, they are considered still in the host queue and are never
> retransmitted. This happens when dev_requeue_skb() is called from
> qdisc_restart().
>   qdisc_restart
>   |-- dequeue_skb
>   |-- sch_direct_xmit()
>       |-- dev_requeue_skb() <-- skb may be bulked
> 
> Fix dev_requeue_skb() to requeue the full bulked list.
> 
> Fixes: a53851e2c321 ("net: sched: explicit locking in gso_cpu fallback")
> Signed-off-by: Wei Yongjun <weiyongj...@huawei.com>
> ---
> v1 -> v2: add net-next prefix
> ---

First, thanks for tracking this down.

> diff --git a/net/sched/sch_generic.c b/net/sched/sch_generic.c
> index 981c08f..0df2dbf 100644
> --- a/net/sched/sch_generic.c
> +++ b/net/sched/sch_generic.c
> @@ -111,10 +111,16 @@ static inline void qdisc_enqueue_skb_bad_txq(struct Qdisc *q,
>  
>  static inline int __dev_requeue_skb(struct sk_buff *skb, struct Qdisc *q)
>  {
> -     __skb_queue_head(&q->gso_skb, skb);
> -     q->qstats.requeues++;
> -     qdisc_qstats_backlog_inc(q, skb);
> -     q->q.qlen++;    /* it's still part of the queue */
> +     while (skb) {
> +             struct sk_buff *next = skb->next;
> +
> +             __skb_queue_tail(&q->gso_skb, skb);

Was the change from __skb_queue_head to __skb_queue_tail here
intentional? We should re-queue packets to the head of the list.

> +             q->qstats.requeues++;
> +             qdisc_qstats_backlog_inc(q, skb);
> +             q->q.qlen++;    /* it's still part of the queue */
> +
> +             skb = next;
> +     }
>       __netif_schedule(q);
>  
>       return 0;
> @@ -124,13 +130,20 @@ static inline int dev_requeue_skb_locked(struct sk_buff *skb, struct Qdisc *q)
>  {
>       spinlock_t *lock = qdisc_lock(q);
>  
> -     spin_lock(lock);
> -     __skb_queue_tail(&q->gso_skb, skb);
> -     spin_unlock(lock);
> +     while (skb) {
> +             struct sk_buff *next = skb->next;
> +
> +             spin_lock(lock);

In this case I suspect it's better to take the lock once around the
whole while loop rather than grabbing and dropping it for each skb.
I don't have any data at this point though, so OK either way,
assuming the other head/tail comment is addressed.

> +             __skb_queue_tail(&q->gso_skb, skb);

Same here: should *_tail be *_head?

> +             spin_unlock(lock);
> +
> +             qdisc_qstats_cpu_requeues_inc(q);
> +             qdisc_qstats_cpu_backlog_inc(q, skb);
> +             qdisc_qstats_cpu_qlen_inc(q);
> +
> +             skb = next;
> +     }
>  
> -     qdisc_qstats_cpu_requeues_inc(q);
> -     qdisc_qstats_cpu_backlog_inc(q, skb);
> -     qdisc_qstats_cpu_qlen_inc(q);
>       __netif_schedule(q);
>  
>       return 0;
> 
