From: Jiri Pirko <j...@mellanox.com>

ASICs implement a shared buffer for packet forwarding and allow it to be
flexibly partitioned among different flows and ports. This enables
non-blocking progress of different flows, as well as separation of lossy
traffic from lossless traffic when Per-Priority Flow Control (PFC) is used.
Sharing the buffer improves buffer utilization and so allows better
absorption of packet bursts.

This patchset implements an API based on the model that SAI uses. That model
is aligned with multiple ASIC vendors, so the API should be vendor neutral.

The userspace counterpart patchset for the iproute2 devlink tool can be
found here:
https://github.com/jpirko/iproute2_mlxsw/tree/devlink_sb

A couple of usage examples:

switch$ devlink sb help
Usage: devlink sb show [ DEV [ sb SB_INDEX ] ]
       devlink sb pool show [ DEV [ sb SB_INDEX ] pool POOL_INDEX ]
       devlink sb pool set DEV [ sb SB_INDEX ] pool POOL_INDEX
                           size POOL_SIZE thtype { static | dynamic }
       devlink sb port pool show [ DEV/PORT_INDEX [ sb SB_INDEX ]
                                   pool POOL_INDEX ]
       devlink sb port pool set DEV/PORT_INDEX [ sb SB_INDEX ]
                                pool POOL_INDEX th THRESHOLD
       devlink sb tc bind show [ DEV/PORT_INDEX [ sb SB_INDEX ] tc TC_INDEX ]
       devlink sb tc bind set DEV/PORT_INDEX [ sb SB_INDEX ] tc TC_INDEX
                              type { ingress | egress } pool POOL_INDEX
                              th THRESHOLD
       devlink sb occupancy show { DEV | DEV/PORT_INDEX } [ sb SB_INDEX ]
       devlink sb occupancy snapshot DEV [ sb SB_INDEX ]
       devlink sb occupancy clearmax DEV [ sb SB_INDEX ]

# list available shared buffers
switch$ devlink sb show
pci/0000:03:00.0: sb 0 size 16777216 ing_pools 4 eg_pools 4 ing_tcs 8 eg_tcs 8

# list available pools and their config
switch$ devlink sb pool show
pci/0000:03:00.0: sb 0 pool 0 type ingress size 12400032 thtype dynamic
pci/0000:03:00.0: sb 0 pool 1 type ingress size 0 thtype dynamic
pci/0000:03:00.0: sb 0 pool 2 type ingress size 0 thtype dynamic
pci/0000:03:00.0: sb 0 pool 3 type ingress size 200064 thtype dynamic
pci/0000:03:00.0: sb 0 pool 4 type egress size 13220064 thtype dynamic
pci/0000:03:00.0: sb 0 pool 5 type egress size 0 thtype dynamic
pci/0000:03:00.0: sb 0 pool 6 type egress size 0 thtype dynamic
pci/0000:03:00.0: sb 0 pool 7 type egress size 0 thtype dynamic
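
For scripting, the "key value" layout of the output above is easy to parse.
The following is a minimal Python sketch, not part of the patchset; the helper
names parse_devlink_line and pool_size_by_type are hypothetical, and the
parser assumes that every line has the form "HANDLE: key value key value ..."
as in the sample output.

```python
# Sample lines copied from the "devlink sb show" and "devlink sb pool show"
# output above.
SB_SHOW = (
    "pci/0000:03:00.0: sb 0 size 16777216 ing_pools 4 eg_pools 4 "
    "ing_tcs 8 eg_tcs 8"
)
POOL_SHOW = """\
pci/0000:03:00.0: sb 0 pool 0 type ingress size 12400032 thtype dynamic
pci/0000:03:00.0: sb 0 pool 1 type ingress size 0 thtype dynamic
pci/0000:03:00.0: sb 0 pool 2 type ingress size 0 thtype dynamic
pci/0000:03:00.0: sb 0 pool 3 type ingress size 200064 thtype dynamic
pci/0000:03:00.0: sb 0 pool 4 type egress size 13220064 thtype dynamic
pci/0000:03:00.0: sb 0 pool 5 type egress size 0 thtype dynamic
pci/0000:03:00.0: sb 0 pool 6 type egress size 0 thtype dynamic
pci/0000:03:00.0: sb 0 pool 7 type egress size 0 thtype dynamic
"""


def parse_devlink_line(line):
    """Split 'HANDLE: key value ...' into (handle, {key: value})."""
    # The handle itself contains ':' (PCI address), so split on the
    # first ': ' (colon followed by a space) instead of the first ':'.
    handle, _, rest = line.partition(": ")
    tokens = rest.split()
    fields = {}
    for key, value in zip(tokens[0::2], tokens[1::2]):
        fields[key] = int(value) if value.isdigit() else value
    return handle, fields


def pool_size_by_type(text):
    """Sum the configured pool sizes per pool type (ingress/egress)."""
    totals = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        _, fields = parse_devlink_line(line)
        totals[fields["type"]] = totals.get(fields["type"], 0) + fields["size"]
    return totals


if __name__ == "__main__":
    print(parse_devlink_line(SB_SHOW))
    print(pool_size_by_type(POOL_SHOW))
```

The per-type sums can then be compared against the "size" field reported by
"devlink sb show"; how a given ASIC accounts ingress and egress pools against
that total is device specific and is not assumed here.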

# show port-pool setup for port sw0p7
switch$ devlink sb port pool show sw0p7 pool 0
sw0p7: sb 0 pool 0 threshold 16

# change threshold for port sw0p7
switch$ sudo devlink sb port pool set sw0p7 pool 0 th 15

# show the changed port-pool setup for port sw0p7
switch$ devlink sb port pool show sw0p7 pool 0
sw0p7: sb 0 pool 0 threshold 15

# show TC binding setup for port sw0p7 ingress TC 0
switch$ devlink sb tc bind show sw0p7 tc 0 type ingress
sw0p7: sb 0 tc 0 type ingress pool 0 threshold 10

# change threshold of the TC binding for port sw0p7 ingress TC 0
switch$ sudo devlink sb tc bind set sw0p7 tc 0 type ingress pool 0 th 9

# show the changed TC binding setup for port sw0p7 ingress TC 0
switch$ devlink sb tc bind show sw0p7 tc 0 type ingress
sw0p7: sb 0 tc 0 type ingress pool 0 threshold 9
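
When the "set" commands above are driven from a script, the argument vector
can be built following the usage text printed by "devlink sb help". A minimal
Python sketch, assuming exactly that syntax; the function names are
hypothetical and nothing is executed here:

```python
def port_pool_set_argv(port, pool, threshold, sb=None):
    """Build argv for: devlink sb port pool set DEV/PORT_INDEX
    [ sb SB_INDEX ] pool POOL_INDEX th THRESHOLD"""
    argv = ["devlink", "sb", "port", "pool", "set", port]
    if sb is not None:
        argv += ["sb", str(sb)]
    argv += ["pool", str(pool), "th", str(threshold)]
    return argv


def tc_bind_set_argv(port, tc, direction, pool, threshold, sb=None):
    """Build argv for: devlink sb tc bind set DEV/PORT_INDEX
    [ sb SB_INDEX ] tc TC_INDEX type { ingress | egress }
    pool POOL_INDEX th THRESHOLD"""
    if direction not in ("ingress", "egress"):
        raise ValueError("direction must be 'ingress' or 'egress'")
    argv = ["devlink", "sb", "tc", "bind", "set", port]
    if sb is not None:
        argv += ["sb", str(sb)]
    argv += ["tc", str(tc), "type", direction,
             "pool", str(pool), "th", str(threshold)]
    return argv


if __name__ == "__main__":
    # Reproduces the two "set" commands used in the examples above.
    print(" ".join(port_pool_set_argv("sw0p7", 0, 15)))
    print(" ".join(tc_bind_set_argv("sw0p7", 0, "ingress", 0, 9)))
```

The resulting list can be handed to subprocess.run() on a system where the
devlink tool with "sb" support is installed; the "set" commands in the
examples are run with sudo.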

# make a snapshot of occupancy of shared buffer for device pci/0000:03:00.0
switch$ sudo devlink sb occupancy snapshot pci/0000:03:00.0

# show occupancy for port sw0p7 from the snapshot (current/watermark)
switch$ devlink sb occupancy show sw0p7
sw0p7:
  pool: 0:      82944/3217344 1:          0/0       2:          0/0       3:          0/0
        4:          0/384     5:          0/0       6:          0/0       7:          0/0
  itc:  0(0):   96768/3217344 1(0):       0/0       2(0):       0/0       3(0):       0/0
        4(0):       0/0       5(0):       0/0       6(0):       0/0       7(0):       0/0
  etc:  0(4):       0/384     1(4):       0/0       2(4):       0/0       3(4):       0/0
        4(4):       0/0       5(4):       0/0       6(4):       0/0       7(4):       0/0
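
The occupancy output above can be turned into a data structure with two
regular expressions. This is a minimal Python sketch, not part of the
patchset; parse_occupancy is a hypothetical name. It assumes that each entry
reads "INDEX: current/watermark" and that, for the itc and etc rows, the
number in parentheses is the pool the TC is bound to (this matches the TC
binding example above, but it is an inference from the sample output).

```python
import re

# Sample copied from the "devlink sb occupancy show sw0p7" output above.
SAMPLE = """\
sw0p7:
  pool: 0:      82944/3217344 1:          0/0       2:          0/0       3:          0/0
        4:          0/384     5:          0/0       6:          0/0       7:          0/0
  itc:  0(0):   96768/3217344 1(0):       0/0       2(0):       0/0       3(0):       0/0
        4(0):       0/0       5(0):       0/0       6(0):       0/0       7(0):       0/0
  etc:  0(4):       0/384     1(4):       0/0       2(4):       0/0       3(4):       0/0
        4(4):       0/0       5(4):       0/0       6(4):       0/0       7(4):       0/0
"""

# Section labels: pools, ingress TCs, egress TCs.
SECTION_RE = re.compile(r"\b(pool|itc|etc):")
# One entry: INDEX, optional "(POOL)", then "current/watermark".
# \s+ also matches newlines, so line-wrapped output is handled too.
ENTRY_RE = re.compile(r"(\d+)(?:\((\d+)\))?:\s+(\d+)/(\d+)")


def parse_occupancy(text):
    """Return {section: {index: {bound_pool, current, watermark}}}."""
    result = {}
    # With a capturing group, split() yields
    # [prefix, name1, body1, name2, body2, ...].
    parts = SECTION_RE.split(text)
    for name, body in zip(parts[1::2], parts[2::2]):
        entries = {}
        for index, pool, current, watermark in ENTRY_RE.findall(body):
            entries[int(index)] = {
                "bound_pool": int(pool) if pool else None,
                "current": int(current),
                "watermark": int(watermark),
            }
        result[name] = entries
    return result


if __name__ == "__main__":
    occupancy = parse_occupancy(SAMPLE)
    for section, entries in occupancy.items():
        for index, entry in entries.items():
            if entry["watermark"]:
                print(section, index, entry)
```

Run on the sample, this prints only the entries with a non-zero watermark,
which is usually what one looks at before issuing "occupancy clearmax".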

# clear watermarks for shared buffer of device pci/0000:03:00.0
switch$ sudo devlink sb occupancy clearmax pci/0000:03:00.0

Jiri Pirko (18):
  devlink: add shared buffer configuration
  devlink: implement shared buffer occupancy monitoring interface
  mlxsw: core: Add devlink shared buffer callbacks
  mlxsw: spectrum_buffers: Push out shared buffer register writes
  mlxsw: spectrum_buffers: Push out indexes and direction out of SB
    structs
  mlxsw: spectrum_buffers: Rename "pool" to "pr" in initialization
  mlxsw: spectrum_buffers: Cache shared buffer configuration
  mlxsw: spectrum_buffers: Remove eg pool 3 default init and CPU port TC
    binding to it
  mlxsw: spectrum_buffers: Change initialization of PG 9
  mlxsw: spectrum_buffers: Get max_buff defaults into limits exposed to
    user
  mlxsw: core: Add mlxsw_core_port_driver_priv helper
  mlxsw: spectrum_buffers: Implement shared buffer configuration
  mlxsw: core: Add devlink shared buffer occupancy callbacks
  mlxsw: reg: Add Shared Buffer Status register definition
  mlxsw: reg: Extend SBPM register for occupancy control
  mlxsw: core: Add mlxsw specific workqueue and use it for FDB notif.
    processing
  mlxsw: core: Introduce support for asynchronous EMAD register access
  mlxsw: spectrum_buffers: Implement occupancy monitoring

 drivers/net/ethernet/mellanox/mlxsw/core.c         |  682 ++++++++---
 drivers/net/ethernet/mellanox/mlxsw/core.h         |   56 +
 drivers/net/ethernet/mellanox/mlxsw/reg.h          |  135 ++-
 drivers/net/ethernet/mellanox/mlxsw/spectrum.c     |   32 +-
 drivers/net/ethernet/mellanox/mlxsw/spectrum.h     |   68 ++
 .../net/ethernet/mellanox/mlxsw/spectrum_buffers.c |  974 +++++++++++----
 .../ethernet/mellanox/mlxsw/spectrum_switchdev.c   |    4 +-
 include/net/devlink.h                              |   59 +
 include/uapi/linux/devlink.h                       |   63 +
 net/core/devlink.c                                 | 1236 ++++++++++++++++++--
 10 files changed, 2787 insertions(+), 522 deletions(-)

-- 
2.5.5
