[PATCH net 7/7] mlxsw: spectrum_buffers: Add a multicast pool for Spectrum-2

2019-04-09 Thread Ido Schimmel
In Spectrum-1, when a multicast packet is admitted to the shared buffer it increases the quotas of all the ports and {port, TC} to which it is forwarded. The above means that multicast packets are accounted multiple times in the shared buffer and can therefore cause the associated shared buffer

[PATCH net 5/7] mlxsw: spectrum_router: Do not check VRF MAC address

2019-04-09 Thread Ido Schimmel
Commit 74bc99397438 ("mlxsw: spectrum_router: Veto unsupported RIF MAC addresses") enabled the driver to veto router interface (RIF) MAC addresses that it cannot support. This check should only be performed for interfaces for which the driver actually configures a RIF. A VRF upper is not one of th

[PATCH net 4/7] mlxsw: core: Do not use WQ_MEM_RECLAIM for mlxsw workqueue

2019-04-09 Thread Ido Schimmel
The workqueue is used to periodically update the networking stack about activity / statistics of various objects such as neighbours and TC actions. It should not be called as part of memory reclaim path, so remove the WQ_MEM_RECLAIM flag. Fixes: 3d5479e92087 ("mlxsw: core: Remove deprecated creat

[PATCH net 2/7] mlxsw: core: Do not use WQ_MEM_RECLAIM for EMAD workqueue

2019-04-09 Thread Ido Schimmel
The EMAD workqueue is used to handle retransmission of EMAD packets that contain configuration data for the device's firmware. Given the workers need to allocate these packets and that the code is not called as part of memory reclaim path, remove the WQ_MEM_RECLAIM flag. Fixes: d965465b60ba ("mlx

[PATCH net 6/7] selftests: mlxsw: Test VRF MAC vetoing

2019-04-09 Thread Ido Schimmel
Test that it is possible to set an IP address on a VRF and that it is not vetoed. Signed-off-by: Ido Schimmel Acked-by: Jiri Pirko --- .../selftests/drivers/net/mlxsw/rtnetlink.sh | 20 +++ 1 file changed, 20 insertions(+) diff --git a/tools/testing/selftests/drivers/net/mlxsw

[PATCH net 0/7] mlxsw: Various fixes

2019-04-09 Thread Ido Schimmel
This patchset contains various small fixes for mlxsw. Patch #1 fixes a warning generated by switchdev core when the driver fails to insert an MDB entry in the commit phase. Patches #2-#4 fix a warning in check_flush_dependency() that can be triggered when a work item in a WQ_MEM_RECLAIM workqueue

[PATCH net 1/7] mlxsw: spectrum_switchdev: Add MDB entries in prepare phase

2019-04-09 Thread Ido Schimmel
The driver cannot guarantee in the prepare phase that it will be able to write an MDB entry to the device. In case the driver returned success during the prepare phase, but then failed to add the entry in the commit phase, a WARNING [1] will be generated by the switchdev core. Fix this by doing th
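
The two-phase model described in this message can be illustrated with a small userspace sketch. It is not the mlxsw code: `struct mdb_table`, `mdb_prepare()` and `mdb_commit()` are hypothetical names, and the "device" is just a fixed-size array. The idea shown is that every step that can fail runs in the prepare phase, so the commit phase has nothing left that could fail.

```c
#include <stddef.h>

#define MDB_TABLE_SIZE 2 /* hypothetical device capacity */

/* Hypothetical stand-in for the device's MDB table. */
struct mdb_table {
	unsigned int entries[MDB_TABLE_SIZE];
	size_t used;
};

/* Prepare phase: do the fallible write here and report failure early. */
static int mdb_prepare(struct mdb_table *t, unsigned int group)
{
	if (t->used >= MDB_TABLE_SIZE)
		return -1; /* the caller can still abort cleanly */
	t->entries[t->used++] = group;
	return 0;
}

/* Commit phase: the entry is already written, so nothing can fail. */
static void mdb_commit(struct mdb_table *t)
{
	(void)t;
}
```

With this split, a full table is reported by `mdb_prepare()` and the caller never reaches a commit step that might fail.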

[PATCH net 3/7] mlxsw: core: Do not use WQ_MEM_RECLAIM for mlxsw ordered workqueue

2019-04-09 Thread Ido Schimmel
The ordered workqueue is used to offload various objects such as routes and neighbours in the order they are notified. It should not be called as part of memory reclaim path, so remove the WQ_MEM_RECLAIM flag. This can also result in a warning [1], if a worker tries to flush a non-WQ_MEM_RECLAIM w

[PATCH bpf-next v2] libbpf: fix crash in XDP socket part with new larger BPF_LOG_BUF_SIZE

2019-04-09 Thread Magnus Karlsson
In commit da11b417583e ("libbpf: teach libbpf about log_level bit 2"), the BPF_LOG_BUF_SIZE was increased to 16M. The XDP socket part of libbpf allocated the log_buf on the stack, but for the new 16M buffer size this is not going to work. Change the code so it uses a 16K buffer instead. Signed-off

Re: [PATCH rdma-next 00/12] Move IB representors to single IB device multiple ports

2019-04-09 Thread Leon Romanovsky
On Thu, Apr 04, 2019 at 08:42:38PM +0300, Leon Romanovsky wrote: > On Thu, Apr 04, 2019 at 10:02:21AM -0300, Jason Gunthorpe wrote: > > On Thu, Mar 28, 2019 at 03:27:30PM +0200, Leon Romanovsky wrote: > > > From: Leon Romanovsky > > > > > > From Mark, > > > > > > Hi, > > > > > > This series start

Re: [PATCH v2 net-next 08/18] ipv4: Refactor fib_check_nh

2019-04-09 Thread Govindarajulu Varadarajan
On Tue, Apr 9, 2019 at 7:13 PM David Ahern wrote: > > On 4/9/19 5:08 PM, Govindarajulu Varadarajan wrote: > > On Fri, Apr 5, 2019 at 4:32 PM David Ahern wrote: > >> > >> From: David Ahern > >> > >> fib_check_nh is currently huge covering multiple use cases - device only, > >> device + gateway,

[PATCH bpf-next] tools/bpftool: show btf id in program information

2019-04-09 Thread Prashant Bhole
Let's add a way to know whether a program has btf context. Patch adds 'btf_id' in the output of program listing. When btf_id is present, it means program has btf context. Sample output: user@test# bpftool prog list 25: xdp name xdp_prog1 tag 539ec6ce11b52f98 gpl loaded_at 2019-04-10T11:

Re: [PATCH v2 net-next 08/18] ipv4: Refactor fib_check_nh

2019-04-09 Thread David Ahern
On 4/9/19 5:08 PM, Govindarajulu Varadarajan wrote: > On Fri, Apr 5, 2019 at 4:32 PM David Ahern wrote: >> >> From: David Ahern >> >> fib_check_nh is currently huge covering multiple use cases - device only, >> device + gateway, and device + gateway with ONLINK. The next patch adds >> validation

[PATCH bpf-next] [tools/bpf] fix a few ubsan warning

2019-04-09 Thread Yonghong Song
The issue is reported at https://github.com/libbpf/libbpf/issues/28. Basically, per C standard, for void *memcpy(void *dest, const void *src, size_t n) if "dest" or "src" is NULL, regardless of whether "n" is 0 or not, the result of memcpy is undefined. clang ubsan reported three such instances
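
A minimal sketch of one way to avoid the undefined behaviour described in this message: skip the call when there is nothing to copy. The helper name `copy_if_nonempty()` is hypothetical, and this is not necessarily how the patch itself addresses the three reported instances.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical wrapper: memcpy() is only reached when n > 0, so a
 * NULL dest/src paired with a zero length never hits the library call.
 * A NULL pointer with n > 0 is still a caller bug and is not masked here.
 */
static void *copy_if_nonempty(void *dest, const void *src, size_t n)
{
	if (n == 0)
		return dest;
	return memcpy(dest, src, n);
}
```

A call such as `copy_if_nonempty(NULL, NULL, 0)` then returns without invoking `memcpy()` at all.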

[RFC net-next v1 5/6] taprio: Add support for frame-preemption

2019-04-09 Thread Vinicius Costa Gomes
Frame preemption can be used to further reduce the latency of network communications, so some kinds of traffic can be preempted by higher priorities ones. This is a hardware only feature. Frame-preemption is in relation to transmission queues, if the nth bit of the frame-preemption mask is enabled
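
The per-queue mask mentioned in this message can be sketched with plain bit operations. The snippet is cut off before it says what a set bit means, so the helpers below only test and set bit n and attach no meaning to it. The names and the 32-bit mask width are assumptions made for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: is bit n (transmission queue n) set in the mask? */
static bool queue_bit_set(uint32_t mask, unsigned int n)
{
	if (n >= 32)
		return false; /* out of range for the assumed 32-bit mask */
	return (mask >> n) & 1u;
}

/* Hypothetical helper: return the mask with bit n set. */
static uint32_t queue_bit_enable(uint32_t mask, unsigned int n)
{
	if (n >= 32)
		return mask;
	return mask | (UINT32_C(1) << n);
}
```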

[RFC net-next v1 3/6] taprio: Add support for setting the cycle-time manually

2019-04-09 Thread Vinicius Costa Gomes
IEEE 802.1Q-2018 defines that the cycle-time of a schedule may be overridden, so the schedule is truncated to a determined "width". Signed-off-by: Vinicius Costa Gomes --- include/uapi/linux/pkt_sched.h | 1 + net/sched/sch_taprio.c | 56 -- 2 files cha

[RFC net-next v1 6/6] taprio: Add support for hardware offloading

2019-04-09 Thread Vinicius Costa Gomes
This allows taprio to offload the schedule enforcement to capable network cards, resulting in more precise windows and less CPU usage. The important detail here is the difference between the gate_mask in taprio and gate_mask for the network driver. For the driver, each bit in gate_mask references

[RFC net-next v1 4/6] taprio: Add support for cycle-time-extension

2019-04-09 Thread Vinicius Costa Gomes
IEEE 802.1Q-2018 defines the concept of a cycle-time-extension, so the last entry of a schedule before the start of a new schedule can be extended, so "too-short" entries can be avoided. Signed-off-by: Vinicius Costa Gomes --- include/uapi/linux/pkt_sched.h | 1 + net/sched/sch_taprio.c

[RFC net-next v1 2/6] taprio: Add support adding an admin schedule

2019-04-09 Thread Vinicius Costa Gomes
The IEEE 802.1Q-2018 defines two "types" of schedules, the "Oper" (from operational?) and "Admin" ones. Up until now, 'taprio' only had support for the "Oper" one, added when the qdisc is created. This adds support for the "Admin" one, which allows the .change() operation to be supported. Just for

[RFC net-next v1 1/6] taprio: Fix potencial use of invalid memory during dequeue()

2019-04-09 Thread Vinicius Costa Gomes
Right now, this isn't a problem, but the next commit allows schedules to be added during runtime. When a new schedule transitions from the inactive to the active state ("admin" -> "oper") the previous one is free'd, if it's free'd just after the RCU read lock is released, we may access an invalid '

[RFC net-next v1 0/6] RFC: taprio change schedules + offload

2019-04-09 Thread Vinicius Costa Gomes
Hi, Overview This RFC has two objectives, it adds support for changing the running schedules during "runtime", explained in more detail later, and proposes an interface between taprio and the drivers for hardware offloading. These two different features are presented together so it's c

Re: [PATCH bpf-next v6 00/16] BPF support for global data

2019-04-09 Thread Alexei Starovoitov
On Tue, Apr 9, 2019 at 2:09 PM Daniel Borkmann wrote: > > This series is a major rework of previously submitted libbpf > patches [0] in order to add global data support for BPF. The > kernel has been extended to add proper infrastructure that allows > for full .bss/.data/.rodata sections on BPF lo

Re: [PATCH v2 net-next 08/18] ipv4: Refactor fib_check_nh

2019-04-09 Thread Govindarajulu Varadarajan
On Fri, Apr 5, 2019 at 4:32 PM David Ahern wrote: > > From: David Ahern > > fib_check_nh is currently huge covering multiple use cases - device only, > device + gateway, and device + gateway with ONLINK. The next patch adds > validation checks for IPv6 which only further complicates it. So, brea

Re: [PATCH bpf 2/2] libbpf: remove dependency on barrier.h in xsk.h

2019-04-09 Thread Daniel Borkmann
On 04/10/2019 12:28 AM, Georg Müller wrote: > On 09.04.19 at 13:29, Magnus Karlsson wrote: >> On Tue, Apr 9, 2019 at 11:11 AM Daniel Borkmann wrote: >>> On 04/09/2019 08:44 AM, Magnus Karlsson wrote: The use of smp_rmb() and smp_wmb() creates a Linux header dependency on barrier.h that

Re: [PATCH bpf 2/2] libbpf: remove dependency on barrier.h in xsk.h

2019-04-09 Thread Georg Müller
On 09.04.19 at 13:29, Magnus Karlsson wrote: > On Tue, Apr 9, 2019 at 11:11 AM Daniel Borkmann wrote: >> >> On 04/09/2019 08:44 AM, Magnus Karlsson wrote: >>> The use of smp_rmb() and smp_wmb() creates a Linux header dependency >>> on barrier.h that is unnecessary in most parts. This patch implem

[RFC net-next 1/1] tdc.py: Introduce required plugins

2019-04-09 Thread Lucas Bates
Some of the testcases (for example, all of the fw tests) in tdc require activating the nsplugin. This RFC introduces a feature which tags one such test with the keyword "requires". Anyone running a test that requires nsplugin will now get a warning if they are missing the plugin. After compiling t

Re: [PATCH bpf-next] libbpf: fix crash in XDP socket part with new larger BPF_LOG_BUF_SIZE

2019-04-09 Thread Daniel Borkmann
On 04/09/2019 02:49 PM, Magnus Karlsson wrote: > In commit da11b417583e ("libbpf: teach libbpf about log_level bit 2"), > the BPF_LOG_BUF_SIZE was increased to 16M. The XDP socket part of > libbpf allocated the log_buf on the stack, but for the new 16M buffer > size this is not going to work. Chang

[PATCH net-next 03/10] ipv6: Change rt6_probe to take a fib6_nh

2019-04-09 Thread David Ahern
From: David Ahern rt6_probe sends probes for gateways in a nexthop. As such it really depends on a fib6_nh, not a fib entry. Move last_probe to fib6_nh and update rt6_probe to a fib6_nh struct. Signed-off-by: David Ahern --- include/net/ip6_fib.h | 8 net/ipv6/route.c | 16 +

[PATCH net-next 01/10] ipv6: Only call rt6_check_neigh for nexthop with gateway

2019-04-09 Thread David Ahern
From: David Ahern Change rt6_check_neigh to take a fib6_nh instead of a fib entry. Move the check on fib_flags and whether the nexthop has a gateway up to the one caller. Remove the inline from the definition as well. Not necessary. Signed-off-by: David Ahern --- net/ipv6/route.c | 16 +++

[PATCH net-next 05/10] ipv6: Refactor find_match

2019-04-09 Thread David Ahern
From: David Ahern find_match primarily needs a fib6_nh (and fib6_flags which it passes through to rt6_score_route). Move fib6_check_expired up to the call sites so find_match is only called for relevant entries. Remove the match argument which is mostly a pass through and use the return boolean t

[PATCH net-next 07/10] ipv6: Be smarter with null_entry handling in ip6_pol_route_lookup

2019-04-09 Thread David Ahern
From: David Ahern Clean up the fib6_null_entry handling in ip6_pol_route_lookup. rt6_device_match can return fib6_null_entry, but fib6_multipath_select can not. Consolidate the fib6_null_entry handling and on the final null_entry check set rt and goto out - no need to defer to a second check afte

[PATCH net-next 06/10] ipv6: Refactor find_rr_leaf

2019-04-09 Thread David Ahern
From: David Ahern find_rr_leaf has 3 loops over fib_entries calling find_match. The loops are very similar with differences in start point and whether the metric is evaluated: 1. start at rr_head, no extra loop compare, check fib metric 2. start at leaf, compare rt against rr_head, check

[PATCH net-next 08/10] ipv6: Move fib6_multipath_select down in ip6_pol_route

2019-04-09 Thread David Ahern
From: David Ahern Move the siblings and fib6_multipath_select after the null entry check since a null entry can not have siblings. Signed-off-by: David Ahern --- net/ipv6/route.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/net/ipv6/route.c b/net/ipv6/route.c index

[PATCH net-next 04/10] ipv6: Pass fib6_nh and flags to rt6_score_route

2019-04-09 Thread David Ahern
From: David Ahern rt6_score_route only needs the fib6_flags and nexthop data. Change it accordingly. Allows re-use later for nexthop based fib6_nh. Signed-off-by: David Ahern --- net/ipv6/route.c | 18 ++ 1 file changed, 10 insertions(+), 8 deletions(-) diff --git a/net/ipv6/r

[PATCH net-next 09/10] ipv6: Refactor rt6_device_match

2019-04-09 Thread David Ahern
From: David Ahern Move the device and gateway checks in the fib6_next loop to a helper that can be called per fib6_nh entry. Signed-off-by: David Ahern --- net/ipv6/route.c | 38 +- 1 file changed, 25 insertions(+), 13 deletions(-) diff --git a/net/ipv6/rou

[PATCH net-next 02/10] ipv6: Remove rt6_check_dev

2019-04-09 Thread David Ahern
From: David Ahern rt6_check_dev is a simpler helper with only 1 caller. Fold the code into rt6_score_route. Signed-off-by: David Ahern --- net/ipv6/route.c | 15 --- 1 file changed, 4 insertions(+), 11 deletions(-) diff --git a/net/ipv6/route.c b/net/ipv6/route.c index b515fa8f787

[PATCH net-next 00/10] ipv6: Refactor nexthop selection helpers during a fib lookup

2019-04-09 Thread David Ahern
From: David Ahern IPv6 has a fib6_nh embedded within each fib6_info and a separate fib6_info for each path in a multipath route. A side effect is that a fib6_info is passed all the way down the stack when selecting a path on a fib lookup. Refactor the fib lookup functions and associated helper fu

[PATCH net-next 10/10] ipv6: Refactor __ip6_route_redirect

2019-04-09 Thread David Ahern
From: David Ahern Move the nexthop evaluation of a fib entry to a helper that can be leveraged for each fib6_nh in a multipath nexthop object. In the move, 'continue' statements means the helper returns false (loop should continue) and 'break' means return true (found the entry of interest). Si
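
The refactoring pattern described in this message (a loop body becomes a helper, `continue` becomes `return false`, `break` becomes `return true`) can be sketched generically. This is not the kernel code: `struct nexthop` here is a hypothetical two-field type and the match criteria are invented for the example.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified nexthop used only for this illustration. */
struct nexthop {
	bool dead;
	uint32_t gateway;
};

/* Former loop body: 'continue' became 'return false', 'break' became 'return true'. */
static bool nexthop_matches(const struct nexthop *nh, uint32_t gateway)
{
	if (nh->dead)
		return false; /* was: continue */
	if (nh->gateway != gateway)
		return false; /* was: continue */
	return true; /* was: break */
}

/* The loop now only iterates; the helper can be reused per nexthop elsewhere. */
static const struct nexthop *find_nexthop(const struct nexthop *nhs,
					  size_t count, uint32_t gateway)
{
	size_t i;

	for (i = 0; i < count; i++) {
		if (nexthop_matches(&nhs[i], gateway))
			return &nhs[i];
	}
	return NULL;
}
```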

[PATCH net] selftests: fib_tests: Fix 'Command line is not complete' errors

2019-04-09 Thread David Ahern
From: David Ahern A couple of tests are verifying a route has been removed. The helper expects the prefix as the first part of the expected output. When checking that a route has been deleted the prefix is empty leading to an invalid ip command: $ ip ro ls match Command line is not complete.

[PATCH bpf-next v6 05/16] bpf: allow . char as part of the object name

2019-04-09 Thread Daniel Borkmann
Trivial addition to allow '.' aside from '_' as "special" characters in the object name. Used to allow for substrings in maps from loader side such as ".bss", ".data", ".rodata", but could also be useful for other purposes. Signed-off-by: Daniel Borkmann Acked-by: Andrii Nakryiko Acked-by: Marti

[PATCH bpf-next v6 01/16] bpf: implement lookup-free direct value access for maps

2019-04-09 Thread Daniel Borkmann
This generic extension to BPF maps allows for directly loading an address residing inside a BPF map value as a single BPF ldimm64 instruction! The idea is similar to what BPF_PSEUDO_MAP_FD does today, which is a special src_reg flag for ldimm64 instruction that indicates that inside the first part

[PATCH bpf-next v6 13/16] bpf: bpftool support for dumping data/bss/rodata sections

2019-04-09 Thread Daniel Borkmann
Add the ability to bpftool to handle BTF Var and DataSec kinds in order to dump them out of btf_dumper_type(). The value has a single object with the section name, which itself holds an array of variables it dumps. A single variable is an object by itself printed along with its name. From there fur

[PATCH bpf-next v6 02/16] bpf: do not retain flags that are not tied to map lifetime

2019-04-09 Thread Daniel Borkmann
Both BPF_F_WRONLY / BPF_F_RDONLY flags are tied to the map file descriptor, but not to the map object itself! Meaning, at map creation time BPF_F_RDONLY can be set to make the map read-only from syscall side, but this holds only for the returned fd, so any other fd either retrieved via bpf file sys

[PATCH bpf-next v6 07/16] bpf: kernel side support for BTF Var and DataSec

2019-04-09 Thread Daniel Borkmann
This work adds kernel-side verification, logging and seq_show dumping of BTF Var and DataSec kinds which are emitted with latest LLVM. The following constraints apply: BTF Var must have: - Its kind_flag is 0 - Its vlen is 0 - Must point to a valid type - Type must not resolve to a forward type -

[PATCH bpf-next v6 10/16] bpf, libbpf: refactor relocation handling

2019-04-09 Thread Daniel Borkmann
From: Joe Stringer Adjust the code for relocations slightly with no functional changes, so that upcoming patches that will introduce support for relocations into the .data, .rodata and .bss sections can be added independent of these changes. Signed-off-by: Joe Stringer Signed-off-by: Daniel Bor

[PATCH bpf-next v6 06/16] bpf: add specification for BTF Var and DataSec kinds

2019-04-09 Thread Daniel Borkmann
This adds the BTF specification and UAPI bits for supporting BTF Var and DataSec kinds. This is following LLVM upstream commit ac4082b77e07 ("[BPF] Add BTF Var and DataSec Support") which has been merged recently. Var itself is for describing a global variable and DataSec to describe ELF sections e

[PATCH bpf-next v6 11/16] bpf, libbpf: support global data/bss/rodata sections

2019-04-09 Thread Daniel Borkmann
This work adds BPF loader support for global data sections to libbpf. This allows to write BPF programs in more natural C-like way by being able to define global variables and const data. Back at LPC 2018 [0] we presented a first prototype which implemented support for global data sections by exte

[PATCH bpf-next v6 09/16] bpf: sync {btf, bpf}.h uapi header from tools infrastructure

2019-04-09 Thread Daniel Borkmann
Pull in latest changes from both headers, so we can make use of them in libbpf. Signed-off-by: Daniel Borkmann Acked-by: Martin KaFai Lau --- tools/include/uapi/linux/bpf.h | 20 ++-- tools/include/uapi/linux/btf.h | 32 2 files changed, 46 inser

[PATCH bpf-next v6 08/16] bpf: allow for key-less BTF in array map

2019-04-09 Thread Daniel Borkmann
Given we'll be reusing BPF array maps for global data/bss/rodata sections, we need a way to associate BTF DataSec type as its map value type. In usual cases we have this ugly BPF_ANNOTATE_KV_PAIR() macro hack e.g. via 38d5d3b3d5db ("bpf: Introduce BPF_ANNOTATE_KV_PAIR") to get initial map to type a

[PATCH bpf-next v6 12/16] bpf, libbpf: add support for BTF Var and DataSec

2019-04-09 Thread Daniel Borkmann
This adds libbpf support for BTF Var and DataSec kinds. Main point here is that libbpf needs to do some preparatory work before the whole BTF object can be loaded into the kernel, that is, fixing up of DataSec size taken from the ELF section size and non-static variable offset which needs to be tak

[PATCH bpf-next v6 16/16] bpf, selftest: add test cases for BTF Var and DataSec

2019-04-09 Thread Daniel Borkmann
Extend test_btf with various positive and negative tests around BTF verification of kind Var and DataSec. All passing as well: # ./test_btf [...] BTF raw test[4] (global data test #1): OK BTF raw test[5] (global data test #2): OK BTF raw test[6] (global data test #3): OK BTF raw test[7

[PATCH bpf-next v6 15/16] bpf, selftest: test global data/bss/rodata sections

2019-04-09 Thread Daniel Borkmann
From: Joe Stringer Add tests for libbpf relocation of static variable references into the .data, .rodata and .bss sections of the ELF, also add read-only test for .rodata. All passing: # ./test_progs [...] test_global_data:PASS:load program 0 nsec test_global_data:PASS:pass global data r

[PATCH bpf-next v6 03/16] bpf: add program side {rd, wr}only support for maps

2019-04-09 Thread Daniel Borkmann
This work adds two new map creation flags BPF_F_RDONLY_PROG and BPF_F_WRONLY_PROG in order to allow for read-only or write-only BPF maps from a BPF program side. Today we have BPF_F_RDONLY and BPF_F_WRONLY, but this only applies to system call side, meaning the BPF program has full read/write acce

[PATCH bpf-next v6 04/16] bpf: add syscall side map freeze support

2019-04-09 Thread Daniel Borkmann
This patch adds a new BPF_MAP_FREEZE command which allows to "freeze" the map globally as read-only / immutable from syscall side. Map permission handling has been refactored into map_get_sys_perms() and drops FMODE_CAN_WRITE in case of locked map. Main use case is to allow for setting up .rodata

[PATCH bpf-next v6 14/16] bpf, selftest: test {rd, wr}only flags and direct value access

2019-04-09 Thread Daniel Borkmann
Extend test_verifier with various test cases around the two kernel extensions, that is, {rd,wr}only map support as well as direct map value access. All passing, one skipped due to xskmap not present on test machine: # ./test_verifier [...] #948/p XDP pkt read, pkt_meta' <= pkt_data, bad acce

[PATCH bpf-next v6 00/16] BPF support for global data

2019-04-09 Thread Daniel Borkmann
This series is a major rework of previously submitted libbpf patches [0] in order to add global data support for BPF. The kernel has been extended to add proper infrastructure that allows for full .bss/.data/.rodata sections on BPF loader side based upon feedback from LPC discussions [1]. Latter su

Re: [PATCH net-next] net: phy: switch drivers to use dynamic feature detection

2019-04-09 Thread David Miller
From: Heiner Kallweit Date: Sun, 7 Apr 2019 11:57:13 +0200 > Recently genphy_read_abilities() has been added that dynamically detects > clause 22 PHY abilities. I *think* this detection should work with all > supported PHY's, at least for the ones with basic features sets, i.e. > PHY_BASIC_FEATUR

Re: [PATCH bpf-next v4 1/3] bpf: support input __sk_buff context in BPF_PROG_TEST_RUN

2019-04-09 Thread Martin Lau
On Tue, Apr 09, 2019 at 11:49:09AM -0700, Stanislav Fomichev wrote: > Add new set of arguments to bpf_attr for BPF_PROG_TEST_RUN: > * ctx_in/ctx_size_in - input context > * ctx_out/ctx_size_out - output context Acked-by: Martin KaFai Lau

[PATCH] ip6_tunnel: Match to ARPHRD_TUNNEL6 for dev type

2019-04-09 Thread Sheena Mira-ato
The device type for ip6 tunnels is set to ARPHRD_TUNNEL6. However, the ip4ip6_err function is expecting the device type of the tunnel to be ARPHRD_TUNNEL. Since the device types do not match, the function exits and the ICMP error packet is not sent to the originating host. Note that the device typ

Re: [PATCH net-next] net: phy: remove unnecessary callback settings in C45 drivers

2019-04-09 Thread David Miller
From: Heiner Kallweit Date: Sun, 7 Apr 2019 12:11:35 +0200 > genphy_c45_aneg_done() is used by phylib as fallback for c45 PHY's if > callback aneg_done isn't defined. So we don't have to set this > explicitly. Same for genphy_c45_pma_read_abilities(). > > Signed-off-by: Heiner Kallweit This pa

Re: [PATCH v3 1/2] net: phy: mscc: add support for VSC8514 PHY

2019-04-09 Thread Heiner Kallweit
On 08.04.2019 14:12, kavyasree.kotag...@microchip.com wrote: > From: Kavya Sree Kotagiri > > The VSC8514 PHY is a 4-ports PHY that is 10/100/1000BASE-T, 100BASE-FX, > 1000BASE-X, can communicate with the MAC via QSGMII. > The MAC interface protocol for each port within QSGMII can > be either 1000

[PATCH] ip: xfrm if_id -ve value is error

2019-04-09 Thread Antony Antony
if_id is u32, error on -ve values instead of setting to 0 after : ip link add ipsec0 type xfrm dev enp0s5 if_id -10 Error: argument "-10" is wrong: if_id value is invalid before : ip link add ipsec0 type xfrm dev enp0s5 if_id -10 ip -d link show dev ipsec0 67: ipsec0@enp0s5: mtu 1500 qdisc noop
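
A sketch of the kind of strict parsing this message asks for. It is a standalone illustration, not iproute2's own parsing helper: `parse_u32_strict()` is a hypothetical name. The explicit sign check is needed because `strtoul()` accepts a leading minus sign and returns the negated value instead of reporting an error.

```c
#include <ctype.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical parser: returns 0 and stores the value, or -1 on any error. */
static int parse_u32_strict(const char *arg, uint32_t *out)
{
	char *end;
	unsigned long val;

	if (arg == NULL)
		return -1;
	while (isspace((unsigned char)*arg))
		arg++;
	if (*arg == '-')
		return -1; /* strtoul() would silently negate the value */

	errno = 0;
	val = strtoul(arg, &end, 0);
	if (errno != 0 || end == arg || *end != '\0' || val > UINT32_MAX)
		return -1;

	*out = (uint32_t)val;
	return 0;
}
```

With this helper, an argument such as "-10" is rejected instead of being turned into another number.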

[net 09/10] Revert "net/mlx5e: Enable reporting checksum unnecessary also for L3 packets"

2019-04-09 Thread Saeed Mahameed
From: Or Gerlitz This reverts commit b820e6fb0978f9c2ac438c199d2bb2f35950e9c9. Prior to the commit we are reverting, checksum unnecessary was only set when both the L3 OK and L4 OK bits are set on the CQE. This caused packets of IP protocols such as SCTP which are not dealt with by the current HW L4 par

[net 10/10] net/mlx5e: Switch to Toeplitz RSS hash by default

2019-04-09 Thread Saeed Mahameed
From: Konstantin Khlebnikov Although XOR hash function can perform very well on some special use cases, to align with all drivers, mlx5 driver should use Toeplitz hash by default. Toeplitz is more stable for the general use case and it is more standard and reliable. On top of that, since XOR (ML
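
For readers unfamiliar with the Toeplitz hash named in this message, here is a bit-by-bit sketch of the commonly described RSS form. It is an illustration only, not the mlx5 driver or firmware implementation. The function name is hypothetical, and returning 0 for a too-short key is a simplification (0 is also a valid hash value).

```c
#include <stddef.h>
#include <stdint.h>

/*
 * For every set bit of the input (most significant bit first), XOR in the
 * 32-bit window of the key that starts at that bit position.
 * Requires key_len >= data_len + 4.
 */
static uint32_t toeplitz_hash(const uint8_t *key, size_t key_len,
			      const uint8_t *data, size_t data_len)
{
	uint32_t hash = 0;
	uint32_t window;
	size_t next_key_bit = 32; /* next key bit to shift into the window */
	size_t i;
	int b;

	if (key_len < 4 || key_len - 4 < data_len)
		return 0; /* sketch-level handling of a too-short key */

	window = ((uint32_t)key[0] << 24) | ((uint32_t)key[1] << 16) |
		 ((uint32_t)key[2] << 8) | key[3];

	for (i = 0; i < data_len; i++) {
		for (b = 7; b >= 0; b--) {
			if (data[i] & (1u << b))
				hash ^= window;
			/* Slide the window one bit along the key. */
			window <<= 1;
			if (key[next_key_bit / 8] & (0x80u >> (next_key_bit % 8)))
				window |= 1u;
			next_key_bit++;
		}
	}
	return hash;
}
```

Because the result is an XOR of key windows, the hash of `a ^ b` equals the hash of `a` XORed with the hash of `b` for equal-length inputs.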

[net 08/10] net/mlx5e: Protect against non-uplink representor for encap

2019-04-09 Thread Saeed Mahameed
From: Dmytro Linkin TC encap offload is supported only for the physical uplink representor. Fail for non uplink representor. Fixes: 3e621b19b0bb ("net/mlx5e: Support TC encapsulation offloads with upper devices") Signed-off-by: Dmytro Linkin Reviewed-by: Eli Britstein Reviewed-by: Vlad Buslov

[net 06/10] net/mlx5e: Rx, Fixup skb checksum for packets with tail padding

2019-04-09 Thread Saeed Mahameed
When an ethernet frame with ip payload is padded, the padding octets are not covered by the hardware checksum. Prior to the cited commit, skb checksum was forced to be CHECKSUM_NONE when padding is detected. After it, the kernel will try to trim the padding bytes and subtract their checksum from s
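
The "subtract their checksum" step mentioned in this message relies on ones' complement arithmetic. The sketch below shows that arithmetic in userspace with hypothetical helper names. It does not use the kernel's checksum API, and it assumes the padding starts at an even byte offset.

```c
#include <stddef.h>
#include <stdint.h>

/* Fold a 32-bit accumulator into 16 bits with end-around carry. */
static uint16_t csum_fold16(uint32_t sum)
{
	while (sum >> 16)
		sum = (sum & 0xffffu) + (sum >> 16);
	return (uint16_t)sum;
}

/* Ones' complement sum of big-endian 16-bit words (frame-sized input only). */
static uint16_t csum_words(const uint8_t *p, size_t len)
{
	uint32_t sum = 0;
	size_t i;

	for (i = 0; i + 1 < len; i += 2)
		sum += ((uint32_t)p[i] << 8) | p[i + 1];
	if (len & 1)
		sum += (uint32_t)p[len - 1] << 8;
	return csum_fold16(sum);
}

/* Remove 'part' from 'sum': in ones' complement, subtracting is adding ~part. */
static uint16_t csum_sub16(uint16_t sum, uint16_t part)
{
	return csum_fold16((uint32_t)sum + (uint16_t)~part);
}
```

Given a sum over payload plus padding, `csum_sub16(total, csum_words(padding, padding_len))` yields the sum over the payload alone. Note that 0x0000 and 0xffff both represent zero in this arithmetic.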

[net 07/10] net/mlx5e: Rx, Check ip headers sanity

2019-04-09 Thread Saeed Mahameed
In the two places is_last_ethertype_ip is being called, the caller will be looking inside the ip header, to be safe, add ip{4,6} header sanity check. And return true only on valid ip headers, i.e: the whole header is contained in the linear part of the skb. Note: Such situation is very rare and ha
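
A userspace sketch of the kind of sanity check this message describes for IPv4: accept the header only if all of it lies within the linear part of the buffer. The function name, the byte-array view of the packet and the extra version/IHL checks are assumptions for illustration. The driver operates on skbs and may check different fields.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define IPV4_MIN_HDR_LEN 20u

/*
 * Hypothetical check: 'linear' holds the first 'linear_len' bytes of the
 * packet and the IPv4 header is expected to start at offset 'l3_off'.
 */
static bool ipv4_header_in_linear(const uint8_t *linear, size_t linear_len,
				  size_t l3_off)
{
	size_t avail;
	size_t ihl_bytes;

	if (l3_off > linear_len)
		return false;
	avail = linear_len - l3_off;
	if (avail < IPV4_MIN_HDR_LEN)
		return false; /* not even a minimal header is present */
	if ((linear[l3_off] >> 4) != 4)
		return false; /* version field is not IPv4 */

	ihl_bytes = (size_t)(linear[l3_off] & 0x0f) * 4;
	if (ihl_bytes < IPV4_MIN_HDR_LEN)
		return false; /* malformed header length */
	return avail >= ihl_bytes; /* options must fit as well */
}
```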

[net 05/10] net/mlx5e: XDP, Avoid checksum complete when XDP prog is loaded

2019-04-09 Thread Saeed Mahameed
XDP programs might change packets data contents which will make the reported skb checksum (checksum complete) invalid. When XDP programs are loaded/unloaded set/clear rx RQs MLX5E_RQ_STATE_NO_CSUM_COMPLETE flag. Fixes: 86994156c736 ("net/mlx5e: XDP fast RX drop bpf programs support") Reviewed-by:

[net 04/10] net/mlx5e: Use fail-safe channels reopen in tx reporter recover

2019-04-09 Thread Saeed Mahameed
From: Eran Ben Elisha When requested to recover from error, the tx reporter might open new channels and close the existing ones. Use safe channels switch flow in order to guarantee opened channels at the end of the recover flow. For this purpose, define mlx5e_safe_reopen_channels function and use

[net 03/10] net/mlx5e: Skip un-needed tx recover if interface state is down

2019-04-09 Thread Saeed Mahameed
From: Eran Ben Elisha Skip recover operation if interface is in down state as TX objects are not open. This fixes a bug where the recover flow re-opened TX objects which were not opened before, leading to a possible memory leak at driver unload. Fixes: de8650a82071 ("net/mlx5e: Add tx reporter su

[pull request][net 00/10] Mellanox, mlx5 fixes 2019-04-09

2019-04-09 Thread Saeed Mahameed
Hi Dave, This series provides some fixes to mlx5 driver. I've cc'ed some of the checksum fixes to Eric Dumazet and I would like to get his feedback before you pull. For -stable v4.19 ('net/mlx5: FPGA, tls, idr remove on flow delete') ('net/mlx5: FPGA, tls, hold rcu read lock a bit longer') For

[net 02/10] net/mlx5: FPGA, tls, idr remove on flow delete

2019-04-09 Thread Saeed Mahameed
Flow is kfreed on mlx5_fpga_tls_del_flow but kept in the idr data structure, this is risky and can cause use-after-free, since the idr_remove is delayed until tls_send_teardown_cmd completion. Instead of delaying idr_remove, in this patch we do it on mlx5_fpga_tls_del_flow, before actually kfree(f

[net 01/10] net/mlx5: FPGA, tls, hold rcu read lock a bit longer

2019-04-09 Thread Saeed Mahameed
To avoid use-after-free, hold the rcu read lock until we are done copying flow data into the command buffer. Fixes: ab412e1dd7db ("net/mlx5: Accel, add TLS rx offload routines") Reported-by: Eric Dumazet Signed-off-by: Saeed Mahameed --- .../net/ethernet/mellanox/mlx5/core/fpga/tls.c | 18 +

Re: [PATCH bpf-next v5 04/16] bpf: add syscall side map freeze support

2019-04-09 Thread Daniel Borkmann
On 04/09/2019 08:46 PM, Jann Horn wrote: > On Mon, Apr 8, 2019 at 3:54 PM Daniel Borkmann wrote: >> This patch adds a new BPF_MAP_FREEZE command which allows to >> "freeze" the map globally as read-only / immutable from syscall >> side. >> >> Map permission handling has been refactored into map_ge

[PATCH iproute2-next] ip fou: Support binding FOU ports

2019-04-09 Thread Kristian Evensen
This patch adds support for binding FOU ports using iproute2. Kernel-support was added in 1713cb37bf67 ("fou: Support binding FoU socket"). The parse function now handles new arguments for setting the binding-related attributes, while the print function writes the new attributes if they are set. A

[PATCH v2 net-next 5/6] ip6tlvs: Add netlink interface

2019-04-09 Thread Tom Herbert
Add a netlink interface to manage the TX TLV parameters. Managed parameters include those for validating and sending TLVs being sent such as alignment, TLV ordering, length limits, etc. --- include/net/ipv6.h | 9 +- include/uapi/linux/in6.h | 31 net/ipv6/exthdrs_core.c| 362

[PATCH v2 net-next 6/6] ip6tlvs: Validation of TX Destination and Hop-by-Hop options

2019-04-09 Thread Tom Herbert
Validate Destination and Hop-by-Hop options. This uses the information in the TLV parameters table to validate various aspects of both individual TLVs as well as a list of TLVs in an extension header. There are two levels of validation that can be performed: simple checks and deep checks. Simple c

[PATCH v2 net-next 2/6] exthdrs: Move generic EH functions to exthdrs_core.c

2019-04-09 Thread Tom Herbert
Move generic functions in exthdrs.c to exthdrs_core.c so that exthdrs.c only contains functions that are specific to IPv6 processing, and exthdrs_core.c contains functions that are generic. --- net/ipv6/exthdrs.c | 138 --- net/ipv6/exthdrs_core.c |

[PATCH v2 net-next 3/6] exthdrs: Registration of TLV handlers and parameters

2019-04-09 Thread Tom Herbert
Create a single TLV parameter table that holds meta information for IPv6 Hop-by-Hop and Destination TLVs. The data structure is composed of a 256 element array of u8's (one entry for each TLV type to allow O(1) lookup). Each entry provides an offset into an array of TLV proc data structures which f
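The two-level layout described in this preview (a 256-entry byte array indexing into a smaller array of per-TLV records) can be sketched as follows. This is a user-space illustration only: the class and field names (TlvProc, TlvParamTable, max_len) are hypothetical, and reserving slot 0 as a default entry for unregistered types is an assumption made here, not something the preview states.

```python
# Illustrative sketch of a 256-entry index table giving O(1) lookup by TLV type.
# All names are hypothetical; slot 0 as the "default" record is an assumption.
from dataclasses import dataclass

@dataclass
class TlvProc:
    name: str
    max_len: int

class TlvParamTable:
    def __init__(self) -> None:
        # One u8 per possible TLV type; the value is an offset into self.procs.
        self.index = bytearray(256)
        # Offset 0 is shared by every type that has no registered record.
        self.procs = [TlvProc("default", 0)]

    def register(self, tlv_type: int, proc: TlvProc) -> None:
        if len(self.procs) > 255:
            raise ValueError("offset would not fit in a u8")
        self.procs.append(proc)
        self.index[tlv_type] = len(self.procs) - 1

    def lookup(self, tlv_type: int) -> TlvProc:
        # Two array reads, independent of how many types are registered.
        return self.procs[self.index[tlv_type]]

table = TlvParamTable()
table.register(0x05, TlvProc("router_alert", 2))  # IPv6 Router Alert option type
```

With this layout, table.lookup(0x05) returns the registered record and any unregistered type falls back to the default record, while the index costs only 256 bytes regardless of record size.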

[PATCH v2 net-next 4/6] exthdrs: Add TX parameters

2019-04-09 Thread Tom Herbert
Define a number of transmit parameters for TLV Parameter table definitions. These will be used for validating TLVs that are set on a socket. --- include/net/ipv6.h | 26 - include/uapi/linux/in6.h | 8 +++ net/ipv6/exthdrs.c | 2 +- net/ipv6/exthdrs_core

[PATCH v2 net-next 1/6] exthdrs: Create exthdrs_options.c

2019-04-09 Thread Tom Herbert
Create exthdrs_options.c to hold code related to specific Hop-by-Hop and Destination extension header options. Move related functions in exthdrs.c to the new file. --- include/net/ipv6.h | 15 net/ipv6/Makefile | 2 +- net/ipv6/exthdrs.c | 204 -

[PATCH v2 net-next 0/6] exthdrs: Make ext. headers & options useful - Part I

2019-04-09 Thread Tom Herbert
Extension headers are the mechanism of extensibility for the IPv6 protocol, however to date they have only seen limited deployment. The reasons for that are because intermediate devices don't handle them well, and there haven't really been any useful extension headers defined. In particular, Destinat

[PATCH bpf-next v4 2/3] libbpf: add support for ctx_{size,}_{in,out} in BPF_PROG_TEST_RUN

2019-04-09 Thread Stanislav Fomichev
Support recently introduced input/output context for test runs. We extend only bpf_prog_test_run_xattr. bpf_prog_test_run is unextendable and left as is. Cc: Martin Lau Acked-by: Martin KaFai Lau Signed-off-by: Stanislav Fomichev --- tools/include/uapi/linux/bpf.h | 7 +++ tools/lib/bpf/bp

[PATCH bpf-next v4 1/3] bpf: support input __sk_buff context in BPF_PROG_TEST_RUN

2019-04-09 Thread Stanislav Fomichev
Add new set of arguments to bpf_attr for BPF_PROG_TEST_RUN: * ctx_in/ctx_size_in - input context * ctx_out/ctx_size_out - output context The intended use case is to pass some meta data to the test runs that operate on skb (this has been brought up on recent LPC). For programs that use bpf_prog_t

[PATCH bpf-next v4 3/3] selftests: bpf: add selftest for __sk_buff context in BPF_PROG_TEST_RUN

2019-04-09 Thread Stanislav Fomichev
Simple test that sets cb to {1,2,3,4,5} and priority to 6, runs bpf program that fails if cb is not what we expect and increments cb[i] and priority. When the test finishes, we check that cb is now {2,3,4,5,6} and priority is 7. We also test the sanity checks: * ctx_in is provided, but ctx_size_in
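The expected input and output of the selftest described above can be modelled in plain Python. This is only a simulation of the behaviour the preview states, not the BPF program or the libbpf API: the function name run_prog, the dictionary layout, and the 0/1 return convention are all hypothetical.

```python
# Illustrative model of the test flow: check the input context, then
# increment every cb[i] and the priority. Names and return codes are made up.

def run_prog(ctx: dict) -> tuple[int, dict]:
    """Return (status, output context); status 1 means the check failed."""
    if ctx["cb"] != [1, 2, 3, 4, 5] or ctx["priority"] != 6:
        return 1, ctx
    out = {
        "cb": [value + 1 for value in ctx["cb"]],
        "priority": ctx["priority"] + 1,
    }
    return 0, out

status, ctx_out = run_prog({"cb": [1, 2, 3, 4, 5], "priority": 6})
bad_status, _ = run_prog({"cb": [0, 0, 0, 0, 0], "priority": 6})
```

A passing run leaves cb as [2, 3, 4, 5, 6] and priority as 7, matching the values the preview says the test checks for.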

Re: [PATCH bpf-next v5 04/16] bpf: add syscall side map freeze support

2019-04-09 Thread Jann Horn
On Mon, Apr 8, 2019 at 3:54 PM Daniel Borkmann wrote: > This patch adds a new BPF_MAP_FREEZE command which allows to > "freeze" the map globally as read-only / immutable from syscall > side. > > Map permission handling has been refactored into map_get_sys_perms() > and drops FMODE_CAN_WRITE in cas

Re: [PATCH v2 0/4] ethtool: add support for new PHY tunable Fast Link Down

2019-04-09 Thread John W. Linville
On Fri, Mar 29, 2019 at 08:13:56PM +0100, Heiner Kallweit wrote: > This series adds support for Fast Link Down as new PHY tunable. > See [0] for the kernel part incl. the Marvell PHY driver as > first user. > > [0] https://marc.info/?t=15535390081&r=1&w=2 > > v2: > - improve man page wording

Re: [PATCH net v8] failover: allow name change on IFF_UP slave interfaces

2019-04-09 Thread si-wei liu
On 4/9/2019 9:13 AM, Michael S. Tsirkin wrote: On Mon, Apr 08, 2019 at 07:45:27PM -0400, Si-Wei Liu wrote: When a netdev appears through hot plug then gets enslaved by a failover master that is already up and running, the slave will be opened right away after getting enslaved. Today there's a

Re: Add a 'start N' option when specifying the Rx flow hash indirection table.

2019-04-09 Thread John W. Linville
On Mon, Apr 01, 2019 at 11:42:45AM -0700, Jonathan Lemon wrote: > When using more than one RSS table, specifying a starting queue for flow > distribution > makes it easier to specify the set of queues attached to the table. An > example: > > ethtool -X eth0 context 0 equal 14
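As a rough sketch of what a starting queue means for an evenly spread indirection table: each table slot maps to a queue, an "equal N" spread cycles through N queues, and a start offset shifts that range. The function below is an assumption about the intended semantics, not ethtool's code; the name build_indir and its parameters are hypothetical.

```python
# Illustrative sketch: fill an RSS indirection table evenly across `equal`
# queues, beginning at queue `start`. Semantics are assumed, names are made up.

def build_indir(table_size: int, equal: int, start: int = 0) -> list[int]:
    if equal <= 0 or start < 0:
        raise ValueError("equal must be positive and start non-negative")
    return [start + (slot % equal) for slot in range(table_size)]

first_table = build_indir(8, equal=3)            # queues 0..2
second_table = build_indir(8, equal=3, start=3)  # queues 3..5
```

Under this reading, two RSS contexts can be given disjoint queue ranges by choosing different start values rather than listing every weight by hand.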

Re: [PATCH bpf-next v3 1/3] bpf: support input __sk_buff context in BPF_PROG_TEST_RUN

2019-04-09 Thread Stanislav Fomichev
On 04/09, Martin Lau wrote: > On Tue, Apr 09, 2019 at 10:27:37AM -0700, Stanislav Fomichev wrote: > > Add new set of arguments to bpf_attr for BPF_PROG_TEST_RUN: > > * ctx_in/ctx_size_in - input context > > * ctx_out/ctx_size_out - output context > > > > The intended use case is to pass some meta
