On 10/10/2018 10:14 AM, Eric Dumazet wrote:
> 
> 
> On 10/10/2018 09:18 AM, Shannon Nelson wrote:
>> On 10/9/2018 7:17 PM, Eric Dumazet wrote:
>>>
>>>
>>> On 10/09/2018 07:11 PM, Shannon Nelson wrote:
>>>>
>>>> That's why we sent this as an RFC a couple of weeks ago.  We got no
>>>> response, so we followed up with this patch in order to get some input.  Do
>>>> you have any suggestions for how we might accomplish this in a less ugly
>>>> way?
>>>
>>> I dunno, maybe a modern way to handle all these very specific needs would be
>>> to use an eBPF hook to implement whatever combination of
>>> RPS/RFS/what_have_you.
>>>
>>> Then we no longer have to review the various strategies users come up with.
>>
>> We're trying to make use of an existing useful feature that was designed for 
>> exactly this kind of problem.  It is already there and no new user training 
>> is needed.  We're actually fixing what could arguably be called a bug since 
>> the /sys/class/net/<dev>/queues/rx-0/rps_cpus entry exists for vlan devices 
>> but currently doesn't do anything.  We're also addressing a security concern 
>> related to the recent L1TF excitement.
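
For context, the knob in question takes a hex CPU bitmask. A minimal
sketch of driving it, where the device name eth0.100 and the mask are
made-up values (today the write succeeds on a vlan device but has no
effect, which is the bug being fixed):

/* Minimal sketch: enable RPS on rx queue 0 of a vlan device by
 * writing a hex CPU bitmask to its rps_cpus file.  The device
 * name (eth0.100) and mask (0xf = CPUs 0-3) are made-up values.
 */
#include <stdio.h>

int main(void)
{
	const char *path = "/sys/class/net/eth0.100/queues/rx-0/rps_cpus";
	FILE *f = fopen(path, "w");

	if (!f) {
		perror("fopen");
		return 1;
	}
	fprintf(f, "f\n");	/* hex bitmask: CPUs 0-3 */
	fclose(f);
	return 0;
}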
>>
>> For this case, we want to target the network stack processing to happen on a
>> certain subset of CPUs.  With admittedly only a cursory look through eBPF, I
>> don't see an obvious way to steer the packet processing to an alternative
>> CPU, unless we add yet another field to the skb that eBPF/XDP could fill in
>> and then query at the same point where we currently call get_rps_cpu().
>> But adding to the skb is usually frowned upon unless absolutely necessary,
>> and this seems like a duplication of what we already have with RPS, so why
>> add a competing feature?
>>
>> Back to my earlier question: are there any suggestions for how we might 
>> accomplish this in a less ugly way?
> 
> 
> What if you want efficient multiqueue processing?
> The vlan device could have multiple RX queues, but you forced queue_mapping=0.
> 
> Honestly, RPS & RFS show their age and complexity (look at
> net/core/net-sysfs.c ...).
> 
> We should not expand them; we should put in place a new, fully
> extensible infrastructure.
> With socket lookups, we can even avoid having a hashtable for flow
> information, removing one cache miss and eliminating flow collisions.
> 
> eBPF seems perfect to me.
> 

Latest tree now has a sk_lookup() helper supported at the 'tc' layer
to look up the socket. And XDP has support for a "cpumap" object
that allows redirecting to remote CPUs. Neither was specifically
designed for this, but I suspect that with some extra work they
might be what is needed.
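
Roughly, the cpumap side could look like this (untested sketch in
current libbpf style; TARGET_CPU, the vlan id 100, and the map sizing
are made-up, and it assumes rx vlan offload is disabled so the tag is
still in the frame at XDP time):

/* Untested sketch: redirect one vlan's traffic to a chosen CPU
 * through a BPF_MAP_TYPE_CPUMAP, moving the rest of the stack
 * processing onto that CPU.
 */
#include <linux/bpf.h>
#include <linux/if_ether.h>
#include <bpf/bpf_helpers.h>
#include <bpf/bpf_endian.h>

#define TARGET_CPU	2	/* made-up CPU to steer the vlan to */

/* 802.1Q header, defined locally since the kernel's struct vlan_hdr
 * is not exported through uapi. */
struct vlan_hdr {
	__be16	h_vlan_TCI;
	__be16	h_vlan_encapsulated_proto;
};

struct {
	__uint(type, BPF_MAP_TYPE_CPUMAP);
	__uint(key_size, sizeof(__u32));
	__uint(value_size, sizeof(__u32));	/* per-CPU queue size */
	__uint(max_entries, 64);
} cpu_map SEC(".maps");

SEC("xdp")
int steer_vlan(struct xdp_md *ctx)
{
	void *data = (void *)(long)ctx->data;
	void *data_end = (void *)(long)ctx->data_end;
	struct ethhdr *eth = data;
	struct vlan_hdr *vh;

	if ((void *)(eth + 1) > data_end)
		return XDP_PASS;
	if (eth->h_proto != bpf_htons(ETH_P_8021Q))
		return XDP_PASS;

	vh = (void *)(eth + 1);
	if ((void *)(vh + 1) > data_end)
		return XDP_PASS;

	/* vlan id 100 is a made-up example */
	if ((bpf_ntohs(vh->h_vlan_TCI) & 0x0fff) == 100)
		return bpf_redirect_map(&cpu_map, TARGET_CPU, 0);

	return XDP_PASS;
}

char _license[] SEC("license") = "GPL";

Userspace still has to populate cpu_map (the value is the per-CPU
queue size) before the redirect takes effect;
samples/bpf/xdp_redirect_cpu shows the full plumbing.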

I would start by looking at bpf_sk_lookup() in filter.c and the
cpumap type in ./kernel/bpf/cpumap.c; in general, sk_lookup from
the XDP layer will likely be needed shortly anyway.
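
And the lookup side, again as an untested sketch in current libbpf
style (IPv4/TCP only, no IP options; what you do with the socket,
e.g. derive a target CPU from it, would be the new infrastructure):

/* Untested sketch: at the tc layer, find the owning socket of an
 * IPv4/TCP packet directly, instead of consulting an RFS-style flow
 * hashtable.  Only the lookup and the mandatory release are shown.
 */
#include <linux/bpf.h>
#include <linux/if_ether.h>
#include <linux/in.h>
#include <linux/ip.h>
#include <linux/tcp.h>
#include <linux/pkt_cls.h>
#include <bpf/bpf_helpers.h>
#include <bpf/bpf_endian.h>

SEC("tc")
int flow_lookup(struct __sk_buff *skb)
{
	void *data = (void *)(long)skb->data;
	void *data_end = (void *)(long)skb->data_end;
	struct ethhdr *eth = data;
	struct iphdr *iph;
	struct tcphdr *tcph;
	struct bpf_sock_tuple tuple = {};
	struct bpf_sock *sk;

	if ((void *)(eth + 1) > data_end ||
	    eth->h_proto != bpf_htons(ETH_P_IP))
		return TC_ACT_OK;

	iph = (void *)(eth + 1);
	/* assumes no IP options (ihl == 5) for brevity */
	if ((void *)(iph + 1) > data_end || iph->protocol != IPPROTO_TCP)
		return TC_ACT_OK;

	tcph = (void *)(iph + 1);
	if ((void *)(tcph + 1) > data_end)
		return TC_ACT_OK;

	tuple.ipv4.saddr = iph->saddr;
	tuple.ipv4.daddr = iph->daddr;
	tuple.ipv4.sport = tcph->source;
	tuple.ipv4.dport = tcph->dest;

	sk = bpf_sk_lookup_tcp(skb, &tuple, sizeof(tuple.ipv4),
			       BPF_F_CURRENT_NETNS, 0);
	if (sk) {
		/* flow state lives in the socket, no extra hashtable */
		bpf_sk_release(sk);
	}
	return TC_ACT_OK;
}

char _license[] SEC("license") = "GPL";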

> It is time that we stop adding core infra that most users do not need or use.
> (RPS and RFS are off by default.)
> 
