On 10/10/2018 10:14 AM, Eric Dumazet wrote:
>
>
> On 10/10/2018 09:18 AM, Shannon Nelson wrote:
>> On 10/9/2018 7:17 PM, Eric Dumazet wrote:
>>>
>>>
>>> On 10/09/2018 07:11 PM, Shannon Nelson wrote:
>>>>
>>>> Hence the reason we sent this as an RFC a couple of weeks ago. We got
>>>> no response, so followed up with this patch in order to get some
>>>> input. Do you have any suggestions for how we might accomplish this
>>>> in a less ugly way?
>>>
>>> I dunno, maybe a modern way for all these very specific needs would be
>>> to use an eBPF hook to implement whatever combination of
>>> RPS/RFS/what_have_you.
>>>
>>> Then, we no longer have to review what various strategies are used by
>>> users.
>>
>> We're trying to make use of an existing useful feature that was
>> designed for exactly this kind of problem. It is already there and no
>> new user training is needed. We're actually fixing what could arguably
>> be called a bug, since the /sys/class/net/<dev>/queues/rx-0/rps_cpus
>> entry exists for vlan devices but currently doesn't do anything. We're
>> also addressing a security concern related to the recent L1TF
>> excitement.
>>
>> For this case, we want to target the network stack processing to happen
>> on a certain subset of CPUs. With admittedly only a cursory look
>> through eBPF, I don't see an obvious way to target the packet
>> processing to an alternative CPU, unless we add yet another field to
>> the skb that eBPF/XDP could fill and then query that field at the same
>> point where we currently check get_rps_cpu(). But adding to the skb is
>> usually frowned upon unless absolutely necessary, and this seems like a
>> duplication of what we already have with RPS, so why add a competing
>> feature?
>>
>> Back to my earlier question: are there any suggestions for how we might
>> accomplish this in a less ugly way?
>
>
> What if you want to have efficient multiqueue processing? The VLAN
> device could have multiple RX queues, but you forced queue_mapping=0.
>
> Honestly, RPS & RFS show their age and complexity (look at
> net/core/net-sysfs.c ...)
>
> We should not expand them; we should put in place a new, fully
> expandable infrastructure. With socket lookups, we can even avoid
> having a hashtable for flow information, removing one cache miss and
> removing flow collisions.
>
> eBPF seems perfect to me.
>
The latest tree has a sk_lookup() helper supported at the 'tc' layer now to
look up the socket, and XDP has support for a "cpumap" object that allows
redirecting to remote CPUs. Neither was specifically designed for this, but
I suspect that with some extra work they might be what is needed. I would
start by looking at bpf_sk_lookup() in filter.c and the cpumap type in
./kernel/bpf/cpumap.c; in general, sk_lookup from the XDP layer will likely
be needed shortly anyways.

> It is time that we stop adding core infra that most users do not
> need/use. (RPS and RFS are default off)
>
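For concreteness, here is a rough, untested sketch of the tc side using
bpf_sk_lookup_tcp(); the program name, the IPv4-only parsing, and the empty
policy are all placeholders of mine, not a worked-out design:

/* Rough sketch, untested: look up the established IPv4/TCP socket
 * for a packet from a tc (sched_cls) program.  Assumes the usual
 * samples/bpf build environment for bpf_helpers.h/bpf_endian.h.
 */
#include <linux/bpf.h>
#include <linux/if_ether.h>
#include <linux/in.h>
#include <linux/ip.h>
#include <linux/tcp.h>
#include <linux/pkt_cls.h>
#include "bpf_helpers.h"
#include "bpf_endian.h"

SEC("classifier")
int flow_steer(struct __sk_buff *skb)
{
	void *data = (void *)(long)skb->data;
	void *data_end = (void *)(long)skb->data_end;
	struct ethhdr *eth = data;
	struct iphdr *iph = data + sizeof(*eth);
	struct tcphdr *tcp = data + sizeof(*eth) + sizeof(*iph);
	struct bpf_sock_tuple tuple = {};
	struct bpf_sock *sk;

	/* One bounds check covering all three fixed-size headers;
	 * assumes no IP options.
	 */
	if (data + sizeof(*eth) + sizeof(*iph) + sizeof(*tcp) > data_end)
		return TC_ACT_OK;
	if (eth->h_proto != bpf_htons(ETH_P_IP) ||
	    iph->protocol != IPPROTO_TCP)
		return TC_ACT_OK;

	tuple.ipv4.saddr = iph->saddr;
	tuple.ipv4.daddr = iph->daddr;
	tuple.ipv4.sport = tcp->source;
	tuple.ipv4.dport = tcp->dest;

	sk = bpf_sk_lookup_tcp(skb, &tuple, sizeof(tuple.ipv4),
			       BPF_F_CURRENT_NETNS, 0);
	if (sk) {
		/* A steering decision could be derived from the
		 * socket here; the reference must be released on
		 * all paths or the verifier rejects the program.
		 */
		bpf_sk_release(sk);
	}
	return TC_ACT_OK;
}

char _license[] SEC("license") = "GPL";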
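And a similar untested sketch of the cpumap side from XDP; the map sizing
and the hard-coded target CPU are again just illustrative:

/* Rough sketch, untested: queue frames for network stack processing
 * on a chosen remote CPU via a cpumap.  A real program would pick
 * the CPU from a flow hash or a policy map rather than hard-coding
 * it.
 */
#include <linux/bpf.h>
#include "bpf_helpers.h"

struct bpf_map_def SEC("maps") cpu_map = {
	.type = BPF_MAP_TYPE_CPUMAP,
	.key_size = sizeof(__u32),
	.value_size = sizeof(__u32),	/* per-CPU queue size */
	.max_entries = 64,
};

SEC("xdp")
int xdp_cpu_steer(struct xdp_md *ctx)
{
	__u32 target_cpu = 2;	/* illustrative only */

	/* Returns XDP_REDIRECT on success; if user space has not
	 * populated this cpumap slot the redirect fails and the
	 * frame is dropped, so entries (value = queue size) must
	 * be set up beforehand.
	 */
	return bpf_redirect_map(&cpu_map, target_cpu, 0);
}

char _license[] SEC("license") = "GPL";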