On Sun, Jul 16, 2017 at 11:29 PM, Tonghao Zhang <xiangxia.m....@gmail.com> wrote:
> When calling flow_free() to free a flow, we call cpumask_next() many
> times (over cpu_possible_mask, e.g. 128 CPUs by default). That eats
> CPU time if flow_free() is called frequently, e.g. when all packets
> go to userspace via upcall and OvS sends them back via netlink to
> ovs_packet_cmd_execute() (which calls flow_free()).
>
> The test topology is shown below. VM01 sends TCP packets to VM02,
> and OvS forwards the packets. During the test we use perf to report
> system performance.
>
> VM01 --- OvS-VM --- VM02
>
> Without this patch, perf-top shows flow_free() at 3.02% CPU usage:
>
>   4.23%  [kernel]       [k] _raw_spin_unlock_irqrestore
>   3.62%  [kernel]       [k] __do_softirq
>   3.16%  [kernel]       [k] __memcpy
>   3.02%  [kernel]       [k] flow_free
>   2.42%  libc-2.17.so   [.] __memcpy_ssse3_back
>   2.18%  [kernel]       [k] copy_user_generic_unrolled
>   2.17%  [kernel]       [k] find_next_bit
>
> With this patch applied, flow_free() no longer appears in the list:
>
>   4.11%  [kernel]       [k] _raw_spin_unlock_irqrestore
>   3.79%  [kernel]       [k] __do_softirq
>   3.46%  [kernel]       [k] __memcpy
>   2.73%  libc-2.17.so   [.] __memcpy_ssse3_back
>   2.25%  [kernel]       [k] copy_user_generic_unrolled
>   1.89%  libc-2.17.so   [.] _int_malloc
>   1.53%  ovs-vswitchd   [.] xlate_actions
>
> With this patch, the TCP throughput between the VMs (we don't use the
> Megaflow Cache + Microflow Cache) goes from 1.18 Gbps up to 1.30 Gbps
> (roughly a 10% performance improvement).
>
> This patch adds a cpumask struct: cpu_used_mask stores the ids of the
> CPUs that have touched the flow. We then only check the flow_stats on
> the CPUs actually used, so it is unnecessary to walk every possible
> CPU when getting, cleaning, and updating the flow_stats. Adding
> cpu_used_mask to struct sw_flow does not increase the number of
> cachelines used.
>
> Signed-off-by: Tonghao Zhang <xiangxia.m....@gmail.com>
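The hunks quoted below only touch the alloc and free paths; the piece that actually populates cpu_used_mask is the per-CPU stats update in flow.c. A rough sketch of that side, just to make the idea concrete (illustrative only, not a hunk from this patch; it only assumes the cpu_used_mask field the patch adds to struct sw_flow):

        /* Sketch: when a CPU first installs its per-cpu stats node for
         * this flow, record that CPU in cpu_used_mask, so later walks
         * (get/clear/free) can skip CPUs that never touched the flow.
         */
        unsigned int cpu = smp_processor_id();

        if (unlikely(!flow->stats[cpu])) {
                /* ... allocate and install the per-cpu flow_stats node ... */
                cpumask_set_cpu(cpu, &flow->cpu_used_mask);
        }
        /* ... update the packet/byte counters under stats->lock ... */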
Patch looks good to me. I have one comment.

> diff --git a/net/openvswitch/flow_table.c b/net/openvswitch/flow_table.c
> index ea7a807..324a246 100644
> --- a/net/openvswitch/flow_table.c
> +++ b/net/openvswitch/flow_table.c
> @@ -98,6 +98,9 @@ struct sw_flow *ovs_flow_alloc(void)
>
>         RCU_INIT_POINTER(flow->stats[0], stats);
>
> +       cpumask_clear(&flow->cpu_used_mask);

I do not see a need to clear the cpumask here when the flow is already zeroed.

> +       cpumask_set_cpu(0, &flow->cpu_used_mask);
> +
>         return flow;
>  err:
>         kmem_cache_free(flow_cache, flow);
> @@ -141,7 +144,7 @@ static void flow_free(struct sw_flow *flow)
>         if (flow->sf_acts)
>                 ovs_nla_free_flow_actions((struct sw_flow_actions __force *)flow->sf_acts);
>         /* We open code this to make sure cpu 0 is always considered */
> -       for (cpu = 0; cpu < nr_cpu_ids; cpu = cpumask_next(cpu, cpu_possible_mask))
> +       for (cpu = 0; cpu < nr_cpu_ids; cpu = cpumask_next(cpu, &flow->cpu_used_mask))
>                 if (flow->stats[cpu])
>                         kmem_cache_free(flow_stats_cache,
>                                         (struct flow_stats __force *)flow->stats[cpu]);
> --
> 1.8.3.1
>
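To spell out the comment above: assuming the flow object comes from a zeroing allocator (e.g. kmem_cache_zalloc(), which is what "already zeroed" refers to), cpu_used_mask starts out all-clear, so only the set for cpu 0 should be needed. Roughly:

        flow = kmem_cache_zalloc(flow_cache, GFP_KERNEL);
                /* whole flow, including cpu_used_mask, starts out zeroed */
        ...
        cpumask_set_cpu(0, &flow->cpu_used_mask);
                /* only marking cpu 0 (the default stats node) is needed */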