Mon, May 20, 2019 at 06:04:05PM CEST, step...@networkplumber.org wrote:
>On Mon, 20 May 2019 11:11:05 +0200
>Jiri Pirko <j...@resnulli.us> wrote:
>
>> Sun, May 19, 2019 at 05:10:46AM CEST, step...@networkplumber.org wrote:
>> >When a device is stacked (as with team, bonding, failsafe or netvsc), the
>> >generic XDP program for the parent device is not called.  In these
>> >cases, the rx handler changes skb->dev to its own device and returns
>> >RX_HANDLER_ANOTHER.  Fix this by calling do_xdp_generic if necessary
>> >before starting another round.
>> >
>> >A review of all the places where RX_HANDLER_ANOTHER is returned
>> >shows that the current devices do correctly change skb->dev.
>> >
>> >An older patch that did the same thing was abandoned;
>> >this is just a rewrite.
>> >
>> >Suggested-by: Jason Wang <jasow...@redhat.com>
>> >Fixes: d445516966dc ("net: xdp: support xdp generic on virtual devices")
>> >Signed-off-by: Stephen Hemminger <sthem...@microsoft.com>
>> >Acked-by: Jason Wang <jasow...@redhat.com>
>> >---
>> > net/core/dev.c | 10 ++++++++++
>> > 1 file changed, 10 insertions(+)
>> >
>> >diff --git a/net/core/dev.c b/net/core/dev.c
>> >index b6b8505cfb3e..240d0b2de1a8 100644
>> >--- a/net/core/dev.c
>> >+++ b/net/core/dev.c
>> >@@ -4921,6 +4921,16 @@ static int __netif_receive_skb_core(struct sk_buff *skb, bool pfmemalloc,
>> >                    ret = NET_RX_SUCCESS;
>> >                    goto out;
>> >            case RX_HANDLER_ANOTHER:
>> >+                   if (static_branch_unlikely(&generic_xdp_needed_key)) {
>> >+                           struct bpf_prog *xdp_prog;
>> >+
>> >+                           xdp_prog = rcu_dereference(skb->dev->xdp_prog);
>> >+                           ret = do_xdp_generic(xdp_prog, skb);
>> >+                           if (ret != XDP_PASS) {
>> >+                                   ret = NET_RX_SUCCESS;
>> >+                                   goto out;
>> >+                           }
>> >+                   }  
>> 
>> I'm always scared of changes like this. History tells us that these
>> codepaths are very fragile. It took us non-trivial effort to fix bonding
>> here, not to mention vlans (that was a pain).
>> 
>> The trouble often came from the fact that different flows were treated
>> differently (vlan accel/non-accel).
>> 
>> This patch calls do_xdp_generic() for the master device at a different
>> point in the receive path than for the lower device. Would it be possible
>> to unify this? E.g. by moving the do_xdp_generic() call from
>> netif_rx_internal()/netif_receive_skb_internal() here,
>> to the beginning of __netif_receive_skb_core()?
>> 
>
>I am trying that now. But one problem is that it would break the case
>where XDP is being run on one leg of a bridge. For example, if eth1 is
>part of br0, then it would no longer be possible to run XDP on eth1.

I don't see why not. XDP is still run in __netif_receive_skb_core()
before the goto another_round.

I was thinking about a patch similar to this:

diff --git a/net/core/dev.c b/net/core/dev.c
index b6b8505cfb3e..4c3fdda85544 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -4502,23 +4502,6 @@ static int netif_rx_internal(struct sk_buff *skb)
 
        trace_netif_rx(skb);
 
-       if (static_branch_unlikely(&generic_xdp_needed_key)) {
-               int ret;
-
-               preempt_disable();
-               rcu_read_lock();
-               ret = do_xdp_generic(rcu_dereference(skb->dev->xdp_prog), skb);
-               rcu_read_unlock();
-               preempt_enable();
-
-               /* Consider XDP consuming the packet a success from
-                * the netdev point of view we do not want to count
-                * this as an error.
-                */
-               if (ret != XDP_PASS)
-                       return NET_RX_SUCCESS;
-       }
-
 #ifdef CONFIG_RPS
        if (static_branch_unlikely(&rps_needed)) {
                struct rps_dev_flow voidflow, *rflow = &voidflow;
@@ -4858,6 +4841,19 @@ static int __netif_receive_skb_core(struct sk_buff *skb, bool pfmemalloc,
 
        __this_cpu_inc(softnet_data.processed);
 
+       if (static_branch_unlikely(&generic_xdp_needed_key)) {
+               int ret2;
+
+               preempt_disable();
+               rcu_read_lock();
+               ret2 = do_xdp_generic(rcu_dereference(skb->dev->xdp_prog), skb);
+               rcu_read_unlock();
+               preempt_enable();
+
+               if (ret2 != XDP_PASS)
+                       return NET_RX_DROP;
+       }
+
        if (skb->protocol == cpu_to_be16(ETH_P_8021Q) ||
            skb->protocol == cpu_to_be16(ETH_P_8021AD)) {
                skb = skb_vlan_untag(skb);
@@ -5178,19 +5174,6 @@ static int netif_receive_skb_internal(struct sk_buff *skb)
        if (skb_defer_rx_timestamp(skb))
                return NET_RX_SUCCESS;
 
-       if (static_branch_unlikely(&generic_xdp_needed_key)) {
-               int ret;
-
-               preempt_disable();
-               rcu_read_lock();
-               ret = do_xdp_generic(rcu_dereference(skb->dev->xdp_prog), skb);
-               rcu_read_unlock();
-               preempt_enable();
-
-               if (ret != XDP_PASS)
-                       return NET_RX_DROP;
-       }
-
        rcu_read_lock();
 #ifdef CONFIG_RPS
        if (static_branch_unlikely(&rps_needed)) {
@@ -5224,21 +5207,6 @@ static void netif_receive_skb_list_internal(struct list_head *head)
        }
        list_splice_init(&sublist, head);
 
-       if (static_branch_unlikely(&generic_xdp_needed_key)) {
-               preempt_disable();
-               rcu_read_lock();
-               list_for_each_entry_safe(skb, next, head, list) {
-                       xdp_prog = rcu_dereference(skb->dev->xdp_prog);
-                       skb_list_del_init(skb);
-                       if (do_xdp_generic(xdp_prog, skb) == XDP_PASS)
-                               list_add_tail(&skb->list, &sublist);
-               }
-               rcu_read_unlock();
-               preempt_enable();
-               /* Put passed packets back on main list */
-               list_splice_init(&sublist, head);
-       }
-
        rcu_read_lock();
 #ifdef CONFIG_RPS
        if (static_branch_unlikely(&rps_needed)) {
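
For the bridge case raised above, a small userspace model may help show why
XDP on eth1 should keep working, assuming the hook ends up inside the
another_round loop. This is only a sketch, not kernel code: every toy_* name,
the struct layouts and the TOY_* constants are made up for illustration, and
the real rx handlers and do_xdp_generic() do far more than this.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins; none of these are the kernel's real definitions. */
enum { TOY_XDP_DROP, TOY_XDP_PASS };
enum { TOY_HANDLER_PASS, TOY_HANDLER_ANOTHER };

struct toy_dev {
	const char *name;
	int has_xdp;            /* is a generic XDP program attached? */
	int xdp_verdict;        /* verdict that program would return */
	struct toy_dev *master; /* upper device (bridge, bond, ...) or NULL */
	int xdp_runs;           /* how often XDP ran on this device */
};

struct toy_skb {
	struct toy_dev *dev;
};

/* Model of the XDP hook: run the program of whatever skb->dev is now. */
static int toy_xdp_generic(struct toy_skb *skb)
{
	if (!skb->dev->has_xdp)
		return TOY_XDP_PASS;
	skb->dev->xdp_runs++;
	return skb->dev->xdp_verdict;
}

/* Model of a stacking rx handler: retarget the skb at the master device. */
static int toy_rx_handler(struct toy_skb *skb)
{
	if (!skb->dev->master)
		return TOY_HANDLER_PASS;
	skb->dev = skb->dev->master;
	return TOY_HANDLER_ANOTHER;
}

/* Returns 1 if the skb reaches the stack, 0 if XDP consumed it. */
static int toy_receive_core(struct toy_skb *skb)
{
	assert(skb && skb->dev);

another_round:
	/* The hook runs once per round, so the lower device is seen first. */
	if (toy_xdp_generic(skb) != TOY_XDP_PASS)
		return 0;

	if (toy_rx_handler(skb) == TOY_HANDLER_ANOTHER)
		goto another_round;

	return 1;
}
```

In this model, with eth1 enslaved to br0 and a passing program on both,
toy_receive_core() runs the eth1 program in the first round and the br0
program in the second. A non-pass verdict on eth1 ends processing before br0
is ever reached.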
