seeking community input on adapting DPDK to P4Runtime backend

Zhang, Qi Z Sun, 07 May 2023 23:40:17 -0700

Hi:

Our team is currently developing a DPDK PMD for a P4-programmed 
network controller, based on customer feedback asking us to integrate DPDK into 
the P4Runtime backend [1].


However, we are facing challenges in adapting DPDK's rte_flow API to the 
P4Runtime API, primarily because it means bridging a table-based API, whose 
fields have arbitrary bit widths at arbitrary offsets, and a protocol-based API 
(more details are given in the postscript).

We are seeking suggestions and best practices from the open-source community to 
help us with this integration. Specifically, we are interested in learning:

(*) If anyone has previously attempted to map rte_flow to P4-based devices.
(*) Thoughts on how to map from table-based matching to protocol-based matching 
like in rte_flow.
(*) Any ideas on how to extend or expand the rte_flow APIs to better 
accommodate P4-based or other table-matching based devices. 

Your insights and feedback would be greatly appreciated!        

======================= Post-Script ============================

More details on the problem follow, for anyone interested.

In P4, flow offloading can be implemented using the P4Runtime API, which 
provides a standard interface for controlling and configuring the data plane 
behavior of network devices. P4Runtime allows network operators to dynamically 
add, modify, and remove flow rules in the hardware forwarding tables of 
P4-enabled devices.

The P4Runtime API is a table-based API: it assumes the packet processing 
pipeline consists of one or more key/action units (tables). In P4Runtime, each table 
defines the fields to be matched and the actions to be taken on incoming 
packets. During compilation, the P4 compiler assigns a unique uint32 ID to each 
table, action, and field, which is associated with its corresponding string 
name. These IDs have no inherent relationship to any network protocol but 
instead serve as a means to identify different components of a P4 program 
within the P4Runtime API. 
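
As a minimal sketch of this name-to-ID association, the lookup below uses the
IDs from the example compiler hint later in this post. The dictionary layout
and the helper resolve() are hypothetical and are not part of P4Runtime or of
any compiler output format.

```python
# Hypothetical name -> ID tables. The numbers are copied from the example
# compiler hint later in this post; another compiler may assign other IDs.
TABLE_IDS = {"decap_vxlan_tcp_table": 8454144}
ACTION_IDS = {
    "decap_vxlan_fwd": 8519716,
    "decap_vxlan_dnat_fwd": 8519718,
    "decap_vxlan_snat_fwd": 8519720,
    "set_exception": 8519695,
}
FIELD_IDS = {
    "decap_vxlan_tcp_table": {
        "tun_ip_src": 1, "tun_ip_dst": 2, "vni": 3,
        "ipv4_src": 4, "ipv4_dst": 5, "src_port": 6, "dst_port": 7,
    }
}


def resolve(table, field=None):
    """Return the table ID, or (table ID, field ID) if a field name is given."""
    table_id = TABLE_IDS[table]
    if field is None:
        return table_id
    return table_id, FIELD_IDS[table][field]


print(resolve("decap_vxlan_tcp_table", "vni"))  # (8454144, 3)
```

Note that nothing in the IDs says "VXLAN" or "IPv4"; only the string names
chosen by the P4 programmer hint at a protocol.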

If we choose to use rte_flow as the low-level API for P4Runtime, a translation 
layer is needed in the application to map the P4 tables and actions to the 
corresponding rte_flow rules. However, this translation layer can be 
problematic as it is not easily scalable. When the P4 pipeline is refined or 
updated, the translation rules may also need to be updated, which can result in 
errors and reduced efficiency.

On the other hand, a hardware vendor that provides a P4-enabled device is 
required to implement an rte_flow interface in their DPDK PMD. Typically, the 
P4 compiler generates hints for the driver on how to map P4 tables to hardware 
resources, and how to convert table entry add/modify/delete actions into 
low-level hardware configurations. However, because rte_flow is protocol-based, 
it poses an additional challenge for driver developers, who must create another 
translation layer to convert rte_flow tokens into P4 object identifiers. This 
translation layer must be carefully designed and implemented to ensure optimal 
performance and scalability, and to ensure that the driver can efficiently 
handle the dynamic nature of P4 programs.
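
To make the shape of such a translation layer concrete, here is a rough sketch
in plain Python (it does not call DPDK). It assumes a protocol-based pattern is
an ordered list of (protocol, fields) pairs, outermost header first, in the
spirit of an rte_flow pattern, and maps it to the key field IDs of the example
table decap_vxlan_tcp_table shown later in this post. KEY_MAP, translate() and
the field names "src", "dst", "src_port" and "dst_port" are hypothetical.

```python
from collections import Counter

# Hypothetical mapping for ONE table of ONE P4 program:
# (protocol, occurrence of that protocol in the pattern, protocol field)
#   -> P4 key field ID of decap_vxlan_tcp_table.
KEY_MAP = {
    ("ipv4", 0, "src"): 1,       # tun_ip_src
    ("ipv4", 0, "dst"): 2,       # tun_ip_dst
    ("vxlan", 0, "vni"): 3,      # vni
    ("ipv4", 1, "src"): 4,       # ipv4_src
    ("ipv4", 1, "dst"): 5,       # ipv4_dst
    ("tcp", 0, "src_port"): 6,   # src_port
    ("tcp", 0, "dst_port"): 7,   # dst_port
}


def translate(pattern):
    """Map an ordered protocol pattern to {field_id: value}.

    Raises KeyError if the pattern matches a field the table has no key for.
    """
    seen = Counter()  # how many times each protocol has appeared so far
    match = {}
    for proto, fields in pattern:
        index = seen[proto]
        seen[proto] += 1
        for name, value in fields.items():
            match[KEY_MAP[(proto, index, name)]] = value
    return match


pattern = [
    ("eth", {}),
    ("ipv4", {"src": "10.0.0.1", "dst": "10.0.0.2"}),
    ("udp", {}),
    ("vxlan", {"vni": 10}),
    ("eth", {}),
    ("ipv4", {"src": "192.0.0.1", "dst": "192.0.0.2"}),
    ("tcp", {"src_port": 200, "dst_port": 201}),
]
print(translate(pattern))
```

KEY_MAP is tied to a single table of a single P4 program, so it has to be
regenerated or rewritten whenever the pipeline changes, which is the
maintenance burden described above.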

To better understand the problem, let's consider the following example that 
demonstrates how to use the P4Runtime API to program a rule for processing a 
VXLAN packet. The rule matches a VXLAN packet, decapsulates the tunnel header, 
and forwards it to a specific port.

The P4 source code below describes the VXLAN decap table decap_vxlan_tcp_table, 
which matches the outer IP address, VNI, inner IP address, and inner TCP port. 
For each rule, one of four action specifications can be selected. We will focus 
on one of them, decap_vxlan_fwd, which performs decapsulation and forwards the 
packet to a specific port.

table decap_vxlan_tcp_table {
    key = {
        hdrs.ipv4[meta.depth-1].src_ip: exact @name("tun_ip_src");
        hdrs.ipv4[meta.depth-1].dst_ip: exact @name("tun_ip_dst");
        hdrs.vxlan[meta.depth-1].vni  : exact @name("vni");
        hdrs.ipv4[meta.depth].src_ip  : exact @name("ipv4_src");
        hdrs.ipv4[meta.depth].dst_ip  : exact @name("ipv4_dst");
        hdrs.tcp.sport                : exact @name("src_port");
        hdrs.tcp.dport                : exact @name("dst_port");
    }
    actions = {
        @tableonly decap_vxlan_fwd;
        @tableonly decap_vxlan_dnat_fwd;
        @tableonly decap_vxlan_snat_fwd;
        @defaultonly set_exception;
    }    
}
...

action decap_vxlan_fwd(PortId_t port_id) {
    meta.mod_action = (bit<11>)VXLAN_DECAP_OUTER_IPV4;
    send_to_port(port_id);
}

Below is an example of the hint that the compiler will generate for the 
decap_vxlan_tcp_table:

Table ID:      8454144
Name:          decap_vxlan_tcp_table
Field ID   Name         Match Type   Bit Width   Byte Width   Byte Order
1          tun_ip_src   exact        32          4            network
2          tun_ip_dst   exact        32          4            network
3          vni          exact        24          3            network
4          ipv4_src     exact        32          4            network
5          ipv4_dst     exact        32          4            network
6          src_port     exact        16          2            network
7          dst_port     exact        16          2            network
Spec ID        Name
8519716        decap_vxlan_fwd
8519718        decap_vxlan_dnat_fwd
8519720        decap_vxlan_snat_fwd
8519695        set_exception

The hint for the action spec "decap_vxlan_fwd" is as follows:

Spec ID:       8519716
Name:          decap_vxlan_fwd
Field ID   Name      Bit Width   Byte Width   Byte Order
1          port_id   32          4            host

Please note that different compilers may assign different IDs.
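
The byte width and byte order columns determine how a value is laid out on the
wire. A minimal sketch of that encoding follows; the helper encode() is
hypothetical, and "host" byte order is assumed to mean little-endian, which
holds on x86 but not on every platform.

```python
import ipaddress


def encode(value, byte_width, byte_order):
    """Encode an integer as a list of byte values of the given width.

    "network" is big-endian; "host" is ASSUMED to be little-endian here.
    """
    endian = "big" if byte_order == "network" else "little"
    return list(int(value).to_bytes(byte_width, endian))


# VNI 10: 24 bits, 3 bytes, network order
print(encode(10, 3, "network"))                                      # [0, 0, 10]
# IPv4 address 10.0.0.1: 32 bits, 4 bytes, network order
print(encode(int(ipaddress.IPv4Address("10.0.0.1")), 4, "network"))  # [10, 0, 0, 1]
# port_id 5: 32 bits, 4 bytes, host order
print(encode(5, 4, "host"))                                          # [5, 0, 0, 0]
```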

Below is an example of how to program a rule using the P4Runtime API, shown in 
JSON format. This rule matches all seven key fields of the table and directs 
matching packets to port 5.

{
    "type": 1,  // INSERT
    "entity": {
        "table_entry": {
            "table_id": 8454144,
            "match": [
                { "field_id": 1, "exact": { "value": [10, 0, 0, 1] } },   // outer src IP = 10.0.0.1
                { "field_id": 2, "exact": { "value": [10, 0, 0, 2] } },   // outer dst IP = 10.0.0.2
                { "field_id": 3, "exact": { "value": [0, 0, 10] } },      // vni = 10
                { "field_id": 4, "exact": { "value": [192, 0, 0, 1] } },  // inner src IP = 192.0.0.1
                { "field_id": 5, "exact": { "value": [192, 0, 0, 2] } },  // inner dst IP = 192.0.0.2
                { "field_id": 6, "exact": { "value": [0, 200] } },        // tcp src port = 200
                { "field_id": 7, "exact": { "value": [0, 201] } }         // tcp dst port = 201
            ],
            "action": {
                "action": {
                    "action_id": 8519716,
                    "params": [
                        { "param_id": 1, "value": [5, 0, 0, 0] }  // port_id = 5 (host byte order)
                    ]
                }
            },
            ...
        }
    }    ...
}

Please note that this is only a part of the full command. For more information, 
please refer to p4runtime.proto [2].
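
Because the rule carries only numeric IDs and raw bytes, a simple consistency
check against the compiler hint can catch mistakes early. The sketch below is
hypothetical (KEY_WIDTHS and check_matches() are illustrative names) and
assumes that every exact-match key of the table must be present exactly once.

```python
# Byte widths copied from the hint for decap_vxlan_tcp_table
# (field ID -> byte width).
KEY_WIDTHS = {1: 4, 2: 4, 3: 3, 4: 4, 5: 4, 6: 2, 7: 2}


def check_matches(matches, key_widths=KEY_WIDTHS):
    """Return a list of problems found in a match list; empty means OK."""
    problems = []
    seen = set()
    for m in matches:
        fid = m["field_id"]
        value = m["exact"]["value"]
        if fid not in key_widths:
            problems.append(f"unknown field_id {fid}")
            continue
        if fid in seen:
            problems.append(f"duplicate field_id {fid}")
        seen.add(fid)
        if len(value) != key_widths[fid]:
            problems.append(
                f"field_id {fid}: expected {key_widths[fid]} bytes, got {len(value)}"
            )
        if any(not 0 <= b <= 255 for b in value):
            problems.append(f"field_id {fid}: byte value out of range")
    missing = sorted(set(key_widths) - seen)
    if missing:
        problems.append(f"missing field_ids {missing}")
    return problems


# The match list from the rule above.
rule_matches = [
    {"field_id": 1, "exact": {"value": [10, 0, 0, 1]}},
    {"field_id": 2, "exact": {"value": [10, 0, 0, 2]}},
    {"field_id": 3, "exact": {"value": [0, 0, 10]}},
    {"field_id": 4, "exact": {"value": [192, 0, 0, 1]}},
    {"field_id": 5, "exact": {"value": [192, 0, 0, 2]}},
    {"field_id": 6, "exact": {"value": [0, 200]}},
    {"field_id": 7, "exact": {"value": [0, 201]}},
]
print(check_matches(rule_matches))  # []
```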

1. https://p4.org/p4-spec/p4runtime/main/P4Runtime-Spec.html
2. https://github.com/p4lang/p4runtime/blob/main/proto/p4/v1/p4runtime.proto

Thank you for your attention to this matter.

Regards
Qi
