Sat, Jan 06, 2018 at 06:41:18PM CET, dsah...@gmail.com wrote:
>On 1/6/18 1:07 AM, Jiri Pirko wrote:
>> Sat, Jan 06, 2018 at 04:57:21AM CET, dsah...@gmail.com wrote:
>>> On 1/5/18 4:09 PM, Jiri Pirko wrote:
>>>> From: Jiri Pirko <j...@mellanox.com>
>>>>
>>>> Currently the filters added to qdiscs are independent. So for example if
>>>> you have 2 netdevices and you create ingress qdisc on both and you want
>>>> to add identical filter rules to both, you need to add them twice. This
>>>> patchset makes this easier and mainly saves resources allowing to share
>>>> all filters within a qdisc - I call it a "filter block". Also this helps
>>>> to save resources when we do offload to hw for example to expensive TCAM.
>>>>
>>>> So back to the example. First, we create 2 qdiscs. Both will share
>>>> block number 22. "22" is just an identification. If we don't pass any
>>>> block number, a new one will be generated by kernel:
>>>>
>>>> $ tc qdisc add dev ens7 ingress block 22
>>>>                                 ^^^^^^^^
>>>> $ tc qdisc add dev ens8 ingress block 22
>>>>                                 ^^^^^^^^
>>>>
>>>> Now if we list the qdiscs, we will see the block index in the output:
>>>>
>>>> $ tc qdisc
>>>> qdisc ingress ffff: dev ens7 parent ffff:fff1 block 22
>>>> qdisc ingress ffff: dev ens8 parent ffff:fff1 block 22
>>>>
>>>>
>>>> To make it more visual, the situation looks like this:
>>>>
>>>>    ens7 ingress qdisc                 ens8 ingress qdisc
>>>>           |                                  |
>>>>           |                                  |
>>>>           +---------->  block 22  <----------+
>>>>
>>>> Unlimited number of qdiscs may share the same block.
>>>>
>>>> Now we can add filter using the block index:
>>>>
>>>> $ tc filter add block 22 protocol ip pref 25 flower dst_ip 192.168.0.0/16
>>>> action drop
>>>>
>>>>
>>>> Note we cannot use the qdisc for filter manipulations for shared blocks:
>>>>
>>>> $ tc filter add dev ens8 ingress protocol ip pref 1 flower dst_ip
>>>> 192.168.100.2 action drop
>>>> Error: Cannot work with shared block, please use block index.
>>>>
>>>>
>>>> We will see the same output if we list filters for ingress qdisc of
>>>> ens7 and ens8, also for the block 22:
>>>>
>>>> $ tc filter show block 22
>>>> filter block 22 protocol ip pref 25 flower chain 0
>>>> filter block 22 protocol ip pref 25 flower chain 0 handle 0x1
>>>> ...
>>>>
>>>> $ tc filter show dev ens7 ingress
>>>> filter block 22 protocol ip pref 25 flower chain 0
>>>> filter block 22 protocol ip pref 25 flower chain 0 handle 0x1
>>>> ...
>>>>
>>>> $ tc filter show dev ens8 ingress
>>>> filter block 22 protocol ip pref 25 flower chain 0
>>>> filter block 22 protocol ip pref 25 flower chain 0 handle 0x1
>>>> ...
>>>
>>> I like the API and output shown here, but I am not getting that with the
>>> patches.
>>>
>>> In this example, I am using 42 for the block id:
>>>
>>> $ tc qdisc show dev eth2
>>> qdisc mq 0: root
>>> qdisc pfifo_fast 0: parent :2 bands 3 priomap 1 2 2 2 1 2 0 0 1 1 1 1 1
>>> 1 1 1
>>> qdisc pfifo_fast 0: parent :1 bands 3 priomap 1 2 2 2 1 2 0 0 1 1 1 1 1
>>> 1 1 1
>>> qdisc ingress ffff: parent ffff:fff1 block 42
>>>
>>> It allows me to add a filter using the device:
>>> $ tc filter add dev eth2 ingress protocol ip pref 1 flower dst_ip
>>> 192.168.101.2 action drop
>>> $ echo $?
>>> 0
>>
>> Yes, because the block is not shared yet. You have it only for one
>> qdisc. As long as you have that, the "filter add dev" api still works.
>> It stops working when you add another qdisc to that block.
>
>Interesting.
>
>Once I add the block to another qdisc I do get an error:
>
>$ tc filter add dev eth2 ingress protocol ip pref 1 flower dst_ip
>192.168.100.2 action drop
>Error: Cannot work with shared block, please use block index.
>
>Can you change that to something like: "This filter block is shared.
>Please use the block index to make changes."

Ok. Sounds reasonable.

>
>
>>
>>
>>>
>>> And it modifies the shared block:
>>> $ tc filter show block 42
>>> filter pref 1 flower chain 0
>>> filter pref 1 flower chain 0 handle 0x1
>>>   eth_type ipv4
>>>   dst_ip 192.168.100.2
>>>   not_in_hw
>>>         action order 1: gact action drop
>>>          random type none pass val 0
>>>          index 2 ref 1 bind 1
>>>
>>> filter pref 1 flower chain 0 handle 0x2
>>>   eth_type ipv4
>>>   dst_ip 192.168.101.2
>>>   not_in_hw
>>>         action order 1: gact action drop
>>>          random type none pass val 0
>>>          index 3 ref 1 bind 1
>>>
>>> filter pref 25 flower chain 0
>>> filter pref 25 flower chain 0 handle 0x1
>>>   eth_type ipv4
>>>   dst_ip 192.168.0.0/16
>>>   not_in_hw
>>>         action order 1: gact action drop
>>>          random type none pass val 0
>>>          index 1 ref 1 bind 1
>>>
>>> Notice the header does not give the 'filter block N protocol' part. I
>>> don't get that using the device either (tc filter show dev eth2 ingress).
>>
>> That is correct. Check the print_filter function in tc/tc_filter.c. It
>> works with "filter_ifindex" and with my patch with "filter_block_index".
>> That means that if the value for the filter dumped actually differs from
>> what you passed on the command line, it prints it.
>>
>> Once you actually share the block with another qdisc, you will see
>> "block N"
>>
>>
>>>
>>> Something else I noticed is that I do not get an error message if I pass
>>> an invalid block id:
>>>
>>> $ tc filter show block 22
>>> $ echo $?
>>> 0
>>> $ tc qdisc show | grep block
>>> qdisc ingress ffff: dev eth2 parent ffff:fff1 block 42
>>
>> Yeah, I will try to fix this. The thing is, this is not error by kernel
>> but by the userspace. Kernel is perfectly ok with invalid device or
>> block index, it just does not dump anything and I would leave it like
>> that. I have to somehow check the validity of block_index in userspace.
>> Not sure how now.
>
>Ok. I saw a response about idr_alloc_ext.

I have an idea: add a new type of message:
rtnl_register(PF_UNSPEC, RTM_GETBLOCK, tc_getblock_doit, tc_getblock_dumpit, 0);

Also, block creation would send RTM_NEWBLOCK and block deletion would send
RTM_DELBLOCK. Not needed now, but I guess it would be nice to have
notifications.

>
>Here's another one: adding a filter to an unknown block id:
>
>$ tc filter add block 66 ingress protocol ip pref 1 flower dst_ip
>192.168.100.2 action drop
>RTNETLINK answers: Invalid argument
>We have an error talking to the kernel
>
>
>Can you add a proper extack message for that case?

Yeah, I will. There are a couple of other places in the neighbouring code
that need extack (like qdisc lookup failure). But those I will address in a
separate patchset.

>
>Thanks,
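
To illustrate the print_filter behaviour discussed earlier in this thread
(the header mentions "block N" only when the dumped value differs from what
was passed on the command line), here is a rough standalone sketch. It is
not the real code from tc/tc_filter.c: the function name
format_filter_header, its parameters, and the convention that 0 means "no
block index" are all assumptions made for illustration.

```c
#include <stdio.h>

/*
 * Hypothetical, simplified model of the header logic described above.
 * This is NOT the real print_filter() from tc/tc_filter.c.
 *
 * requested_block: block index given on the command line (0 = none given)
 * dumped_block:    block index carried by the dumped filter (0 = none)
 *
 * Writes the header into buf and returns the snprintf() result.
 */
static int format_filter_header(char *buf, size_t len,
				unsigned int requested_block,
				unsigned int dumped_block,
				unsigned int pref)
{
	/*
	 * Mention the block only when it adds information, i.e. when it
	 * differs from what the user already typed.
	 */
	if (dumped_block != 0 && dumped_block != requested_block)
		return snprintf(buf, len, "filter block %u pref %u",
				dumped_block, pref);

	return snprintf(buf, len, "filter pref %u", pref);
}
```

Under these assumptions, asking for block 42 and getting block 42 back
yields a header without "block", while a dump requested via the device that
comes back carrying a block index does include it.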