On 7/7/20 1:06 PM, Cong Wang wrote:
> On Mon, Jul 6, 2020 at 1:34 PM YU, Xiangning
> <xiangning...@alibaba-inc.com> wrote:
>>
>>
>>
>> On 7/6/20 1:10 PM, Cong Wang wrote:
>>> On Mon, Jul 6, 2020 at 11:11 AM YU, Xiangning
>>> <xiangning...@alibaba-inc.com> wrote:
>>>> +static int ltb_enqueue(struct sk_buff *skb, struct Qdisc *sch, spinlock_t *root_lock,
>>>> +                      struct sk_buff **to_free)
>>>> +{
>>>> +       struct ltb_sched *ltb = qdisc_priv(sch);
>>>> +       struct ltb_pcpu_sched *pcpu_q;
>>>> +       struct ltb_class *cl;
>>>> +       struct ltb_pcpu_data *pcpu = this_cpu_ptr(ltb->pcpu_data);
>>>> +       int cpu;
>>>> +
>>>> +       cpu = smp_processor_id();
>>>> +       pcpu_q = qdisc_priv(pcpu->qdisc);
>>>> +       ltb_skb_cb(skb)->cpu = cpu;
>>>> +
>>>> +       cl = ltb_classify(sch, ltb, skb);
>>>> +       if (unlikely(!cl)) {
>>>> +               kfree_skb(skb);
>>>> +               return NET_XMIT_DROP;
>>>> +       }
>>>> +
>>>> +       pcpu->active = true;
>>>> +       if (unlikely(kfifo_put(&cl->aggr_queues[cpu], skb) == 0)) {
>>>> +               kfree_skb(skb);
>>>> +               atomic64_inc(&cl->stat_drops);
>>>> +               return NET_XMIT_DROP;
>>>> +       }
>>>
>>>
>>> How do you prevent out-of-order packets?
>>>
>>
>> Hi Cong,
>>
>> That's a good question. In theory there can be out-of-order packets during
>> aggregation. But keep in mind that this is per-class aggregation, and it runs
>> at a high frequency, so the chance of having any leftover skbs from the same
>> TCP flow on many CPUs is low.
>>
>> Also, based on real deployment experience, we haven't observed an increase
>> in out-of-order events, even under heavy workload.
>>
> 
> Yeah, but unless you always classify packets into proper flows, there
> is always a chance to generate out-of-order packets here, which
> means the default configuration is flawed.
>

The key is to avoid classifying packets from the same flow into different 
classes, so we use the socket priority to classify packets. A flow is always 
going to be classified correctly.

I'm not sure what you mean by the default configuration, but we create a shadow 
class when the qdisc is created. Before any other classes are created, all 
packets from any flow are classified to this same shadow class, so there won't 
be any incorrectly classified packets either.

> 
>>>
>>>> +static int ltb_init(struct Qdisc *sch, struct nlattr *opt,
>>> ...
>>>> +       ltb->default_cls = ltb->shadow_cls; /* Default hasn't been created */
>>>> +       tasklet_init(&ltb->fanout_tasklet, ltb_fanout_tasklet,
>>>> +                    (unsigned long)ltb);
>>>> +
>>>> +       /* Bandwidth balancer, this logic can be implemented in user-land. */
>>>> +       init_waitqueue_head(&ltb->bwbalancer_wq);
>>>> +       ltb->bwbalancer_task =
>>>> +           kthread_create(ltb_bw_balancer_kthread, ltb, "ltb-balancer");
>>>> +       wake_up_process(ltb->bwbalancer_task);
>>>
>>> Creating a kthread for each qdisc doesn't look good. Why do you
>>> need a per-qdisc kthread or even a kernel thread at all?
>>>
>>
>> We moved the bandwidth sharing out of the critical data path; that's why we
>> use a kernel thread to periodically rebalance the maximum bandwidth used by
>> each class.
>>
>> This part could be implemented as a timer. What's your suggestion?
> 
> I doubt you can use a timer, as you call rtnl_trylock() there.
> Why not use a delayed work?
> 

Thank you for the suggestion. I will replace it with delayed work and send out 
the new patch.
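
For reference, this is roughly the delayed-work version I have in mind. It's a
sketch under a couple of assumptions: ltb_bw_balance() is a hypothetical helper
holding the old kthread loop body, and LTB_BW_BALANCE_PERIOD is the rebalance
interval in jiffies:

static void ltb_bw_balancer_work(struct work_struct *work)
{
        struct ltb_sched *ltb = container_of(to_delayed_work(work),
                                             struct ltb_sched, balancer_work);

        /* Workqueue context may sleep, and if the RTNL lock is contended
         * we simply skip this round and try again next period.
         */
        if (rtnl_trylock()) {
                ltb_bw_balance(ltb);    /* hypothetical: old kthread loop body */
                rtnl_unlock();
        }

        schedule_delayed_work(&ltb->balancer_work, LTB_BW_BALANCE_PERIOD);
}

/* In ltb_init():
 *         INIT_DELAYED_WORK(&ltb->balancer_work, ltb_bw_balancer_work);
 *         schedule_delayed_work(&ltb->balancer_work, LTB_BW_BALANCE_PERIOD);
 * and in ltb_destroy():
 *         cancel_delayed_work_sync(&ltb->balancer_work);
 */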

- Xiangning

> Thanks.
> 
