On 03/28/2018 07:41 PM, Nikita Shirokov wrote:
>>> On 03/26/2018 05:36 PM, Nikita V. Shirokov wrote:
>>>      bpf: Add sock_ops R/W access to ipv4 tos
>>>
>>>      Sample usage for tos:
>>>
>>>        bpf_getsockopt(skops, SOL_IP, IP_TOS, &v, sizeof(v))
>>>
>>>      where skops is a pointer to the ctx (struct bpf_sock_ops).
>>>
>>> Signed-off-by: Nikita V. Shirokov <tehn...@fb.com>
>>> ---
>>>   net/core/filter.c | 35 +++++++++++++++++++++++++++++++++++
>>>   1 file changed, 35 insertions(+)
>>>
>>> diff --git a/net/core/filter.c b/net/core/filter.c
>>> index 00c711c..afd8255 100644
>>> --- a/net/core/filter.c
>>> +++ b/net/core/filter.c
>>> @@ -3462,6 +3462,27 @@ BPF_CALL_5(bpf_setsockopt, struct bpf_sock_ops_kern *, bpf_sock,
>>>                         ret = -EINVAL;
>>>                 }
>>>   #ifdef CONFIG_INET
>>> +     } else if (level == SOL_IP) {
>>> +             if (optlen != sizeof(int) || sk->sk_family != AF_INET)
>>> +                     return -EINVAL;
>>> +
>>> +             val = *((int *)optval);
>>> +             /* Only some options are supported */
>>> +             switch (optname) {
>>> +             case IP_TOS:
>>> +                     if (val < -1 || val > 0xff) {
>>> +                             ret = -EINVAL;
>>> +                     } else {
>>> +                             struct inet_sock *inet = inet_sk(sk);
>>> +
>>> +                             if (val == -1)
>>> +                                     val = 0;
>>> +                             inet->tos = val;
>>
>> Should this not have the exact same semantics given the helper resembles
>> the normal setsockopt? do_ip_setsockopt() does the following when setting
>> IP_TOS:
>>
>>         case IP_TOS:    /* This sets both TOS and Precedence */
>>                 if (sk->sk_type == SOCK_STREAM) {
>>                         val &= ~INET_ECN_MASK;
>>                         val |= inet->tos & INET_ECN_MASK;
>>                 }
>>                 if (inet->tos != val) {
>>                         inet->tos = val;
>>                         sk->sk_priority = rt_tos2priority(val);
>>                         sk_dst_reset(sk);
>>                 }
>>                 break;
>>
>> E.g. why don't we need to set sk->sk_priority as well, or reset the dst
>> entry here?
> 
> It feels like the original IP_TOS commit in ip_sockglue.c had a specific
> use case in mind, where the TOS had to be reflected into the priority and
> some policy-based routing was involved (that's why the dst is reset).
> But for IPv6 (IPV6_TCLASS, the IPv6 equivalent of TOS), for example, we do
> just this: set the new tclass value and call it a day. In my opinion this
> approach is more flexible (e.g. we have a separate bpf_setsockopt for
> SO_PRIORITY), as it does only what we want. I can imagine a few use cases
> where we want to change just the TOS without changing the priority.

Ok, fair point, that way the behavior is exactly the same as in the v6 case.

Applied to bpf-next, thanks Nikita!
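
For anyone reading the thread later, the difference discussed above can be
sketched in plain user-space C. This is only a model, not kernel code: the
names model_sock, model_ip_set_tos, model_bpf_set_tos, model_tos2priority and
MODEL_ECN_MASK are hypothetical, the priority mapping is a placeholder rather
than the kernel's rt_tos2priority() table, and only the part of the patch
quoted in this thread is modeled.

```c
#include <assert.h>
#include <errno.h>

/* Low two TOS bits carry ECN; stands in for the kernel's INET_ECN_MASK. */
#define MODEL_ECN_MASK 0x3

/* Hypothetical stand-in for the few socket fields discussed in the thread. */
struct model_sock {
	int is_stream;   /* nonzero models sk->sk_type == SOCK_STREAM */
	int tos;         /* models inet->tos */
	int priority;    /* models sk->sk_priority */
	int dst_resets;  /* counts modeled sk_dst_reset() calls */
};

/* Placeholder only (assumption): the kernel's rt_tos2priority() uses a
 * lookup table that is not reproduced here. */
static int model_tos2priority(int tos)
{
	return tos >> 5;
}

/* Models the do_ip_setsockopt() IP_TOS branch quoted in the review. */
void model_ip_set_tos(struct model_sock *sk, int val)
{
	if (sk->is_stream) {
		/* Stream sockets keep their current ECN bits. */
		val &= ~MODEL_ECN_MASK;
		val |= sk->tos & MODEL_ECN_MASK;
	}
	if (sk->tos != val) {
		sk->tos = val;
		sk->priority = model_tos2priority(val);
		sk->dst_resets++;
	}
}

/* Models the IP_TOS branch of the patch as quoted: range check, map -1
 * to 0, store the value. Priority and dst are left untouched. */
int model_bpf_set_tos(struct model_sock *sk, int val)
{
	if (val < -1 || val > 0xff)
		return -EINVAL;
	if (val == -1)
		val = 0;
	sk->tos = val;
	return 0;
}

/* Usage: same starting socket, same requested value, two outcomes. */
void model_example(void)
{
	struct model_sock a = { 1, 0x02, 0, 0 };
	struct model_sock b = a;

	model_ip_set_tos(&a, 0x10);
	assert(a.tos == 0x12 && a.dst_resets == 1);

	assert(model_bpf_set_tos(&b, 0x10) == 0);
	assert(b.tos == 0x10 && b.priority == 0 && b.dst_resets == 0);
}
```

In this model the setsockopt-style path preserves the ECN bits on a stream
socket and also updates priority and dst, while the BPF-style path writes
exactly the value it was given and leaves priority to a separate call.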
