Resource Management

Amit Cohen edited this page Feb 23, 2022 · 12 revisions

Many of the ASIC's internal resources are limited and are shared between several hardware procedures. In many cases the user can provide a partitioning scheme for such a resource in order to fine-tune it for their application. In such cases a driver reload is required for the changes to take effect.

This resource abstraction can be coupled with devlink's dpipe interface, which models the ASIC's pipeline as a graph of match/action tables. Modeling the hardware resource as an object and coupling it with several dpipe tables provides further visibility, which helps when debugging ASIC-wide issues.

Table of Contents
  1. Spectrum's Resources
  2. Spectrum-2's Resources
  3. Resource
    1. Resource Parameters
    2. TCAM
  4. Dpipe Resource Interaction
  5. Suggested Spectrum Resource Profiles
  6. Resource Monitoring

Spectrum's Resources

The main database in the Spectrum ASIC is the centralized KVD, which is shared between different entry types such as FDB entries, routes, neighbours and more. The KVD is divided into three regions: linear, hash single and hash double.

                         +---------------+
                         |  KVD  memory  |
                         |               |
                         |  Size: 252K   |
                         +---------------+
                        /        |        \
                       /         |         \
                      /          |          \
                     /           |           \
                    /            |            \
         +--------------+ +--------------+  +--------------+
         |    Linear    | | Hash single  |  | Hash double  |
         |              | |              |  |              |
         |   Size: 96K  | | Size: 92K    |  |  Size: 63K   |
         +--------------+ +--------------+  +--------------+
        /        |       \
       /         |        \
  +--------+ +--------+ +--------+
  | Single | | Chunks | | Large  |
  |        | |        | | Chunks |
  +--------+ +--------+ +--------+

The following table maps resources to hardware entry types:

Resource Name        Entry Type
Linear-Single        Nexthops (Adjacency entries)
Linear-Chunks        Multiple nexthops for multipath routes requiring up to 32 entries each
Linear-Large-Chunks  Multiple nexthops for multipath routes requiring up to 512 entries each
Hash Single          FDB, IPv4 routes, IPv4 neighbours, IPv6 routes with prefix length <= 64
Hash Double          IPv6 routes with prefix length > 64, IPv6 neighbours
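
The hash-region rows of this mapping can be expressed as a small lookup. The following Python sketch is illustrative only: the function name and the entry-type strings are hypothetical labels chosen for this example, not driver or devlink identifiers.

```python
from typing import Optional

# Entry types that always land in one hash region, per the table above.
# The keys are hypothetical labels used only in this sketch.
_FIXED_REGION = {
    "fdb": "hash_single",
    "ipv4_route": "hash_single",
    "ipv4_neigh": "hash_single",
    "ipv6_neigh": "hash_double",
}


def kvd_hash_region(entry_type: str, prefix_len: Optional[int] = None) -> str:
    """Return "hash_single" or "hash_double" for the given entry type."""
    if entry_type == "ipv6_route":
        if prefix_len is None or not 0 <= prefix_len <= 128:
            raise ValueError("ipv6_route needs a prefix length in 0..128")
        # IPv6 routes up to /64 are stored in hash single, longer ones in hash double.
        return "hash_single" if prefix_len <= 64 else "hash_double"
    return _FIXED_REGION[entry_type]


print(kvd_hash_region("ipv6_route", 64))   # hash_single
print(kvd_hash_region("ipv6_route", 128))  # hash_double
```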

Spectrum-2's Resources

On the Spectrum-2 ASIC, the KVD memory is not partitioned as it is on Spectrum. There is a single KVD hash pool of 512K entries, shared by all tables.

Resource

The ASIC's resources can be observed per device:

$ devlink -v resource show pci/0000:03:00.0
pci/0000:03:00.0:
  name kvd resource_path /kvd size 258048 unit entry dpipe_tables none
    resources:
      name linear resource_path /kvd/linear size 98304 occ 0 unit entry size_min 0 size_max 159744 size_gran 128
        dpipe_tables:
          table_name mlxsw_adj
        resources:
          name singles resource_path /kvd/linear/singles size 16384 occ 0 unit entry size_min 0 size_max 159744 size_gran 1 dpipe_tables none
          name chunks resource_path /kvd/linear/chunks size 49152 occ 0 unit entry size_min 0 size_max 159744 size_gran 32 dpipe_tables none
          name large_chunks resource_path /kvd/linear/large_chunks size 32768 occ 0 unit entry size_min 0 size_max 159744 size_gran 512 dpipe_tables none
      name hash_double resource_path /kvd/hash_double size 65408 unit entry size_min 32768 size_max 192512 size_gran 128
        dpipe_tables:
          table_name mlxsw_host6
      name hash_single resource_path /kvd/hash_single size 94336 unit entry size_min 65536 size_max 225280 size_gran 128
        dpipe_tables:
          table_name mlxsw_host4

Note: Resource show and set commands are not supported on Spectrum-2 as it only supports a single KVD memory pool.

Resource Parameters

  • name - Resource's name
  • resource_path - The full path of the resource
  • size - The size of the resource
  • size_new - The pending size. A reload is required for the change to take effect
  • size_valid - For nested resources only. Reported as false when the children's combined size exceeds the size of the parent
  • unit - The units in which the size is represented
  • occ - Real time occupancy, represented in units. May not be available for all resources
  • size_min - Minimum size. Not available for constant size resources
  • size_max - Maximum size. Not available for constant size resources
  • size_gran - Required granularity of the size. Not available for constant size resources
  • dpipe_tables - The list of tables which use this resource
  • resources - The list of child resources

The resources are presented in a tree-based structure that reflects their nesting, and each resource is identified by a file-system-like path.

Resources whose size can be changed expose three size properties: minimum, maximum and granularity.
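
A requested size can be checked against these properties before it is passed to devlink. The sketch below is a minimal illustration and assumes that a valid size lies between size_min and size_max and is a multiple of size_gran. The function name is hypothetical, and the driver may apply further checks that are not modeled here.

```python
def check_resource_size(size: int, size_min: int, size_max: int, size_gran: int) -> list:
    """Return a list of problems with a requested size; an empty list means none were found."""
    problems = []
    if size < size_min:
        problems.append(f"{size} is below size_min {size_min}")
    if size > size_max:
        problems.append(f"{size} is above size_max {size_max}")
    # Assumption: the size has to be a whole multiple of the granularity.
    if size % size_gran != 0:
        problems.append(f"{size} is not a multiple of size_gran {size_gran}")
    return problems


# Limits of /kvd/hash_double as reported in the devlink output shown earlier:
# size_min 32768, size_max 192512, size_gran 128.
print(check_resource_size(65536, 32768, 192512, 128))  # []
print(check_resource_size(65500, 32768, 192512, 128))  # one granularity problem
```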

Example of changing a resource's size:

$ devlink resource set pci/0000:03:00.0 path /kvd/hash_double size 65536

For the changes to take effect, a reload is required:

$ devlink dev reload pci/0000:03:00.0

The reload process performs a hot reload of the driver, so it is recommended to make the desired changes before any other configuration. The reload will fail if one of the nested resources in the hierarchy is marked as having an invalid size:

$ devlink resource show pci/0000:03:00.0
pci/0000:03:00.0:
  name kvd size 245760 unit entry dpipe_tables none size_valid false
  resources:
    name linear size 98304 size_new 147456 occ 0 unit entry size_min 0 size_max 147456 size_gran 128
    dpipe_tables:
      table_name mlxsw_adj

    name hash_double size 60416 unit entry size_min 32768 size_max 180224 size_gran 128
    dpipe_tables:
      table_name mlxsw_host6

    name hash_single size 87040 unit entry size_min 65536 size_max 212992 size_gran 128
    dpipe_tables:
      table_name mlxsw_host4

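In this case the size of linear was changed, but the overall configuration is invalid: the pending size of linear plus the sizes of hash_double and hash_single (147456 + 60416 + 87040 = 294912) exceeds the size of kvd (245760), so kvd is marked as having an invalid size.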
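
The check behind size_valid can be sketched as follows. This is a minimal illustration in Python that assumes a parent is valid when the sum of its children's sizes, with size_new taking precedence over size where present, does not exceed the parent's size. The function name and the data layout are hypothetical.

```python
def children_fit(parent_size: int, children: dict) -> bool:
    """children maps a resource name to (size, size_new), with size_new None if not pending."""
    total = sum(new if new is not None else cur for cur, new in children.values())
    return total <= parent_size


# Values taken from the devlink output above.
children = {
    "linear": (98304, 147456),    # pending resize
    "hash_double": (60416, None),
    "hash_single": (87040, None),
}
print(children_fit(245760, children))  # False
```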

TCAM

The device supports two types of TCAM (Ternary Content-Addressable Memory): ATCAM (Algorithmic TCAM) and CTCAM (Circuit TCAM). CTCAM is not part of the KVD and is therefore not displayed in resmon. Spectrum-1 has no ATCAM, only CTCAM, so the ATCAM resource is not relevant for Spectrum-1. In addition, when identical TC rules are inserted, they are pushed to the CTCAM, so resmon cannot monitor them.

Dpipe Resource Interaction

Each dpipe table can be mapped to a single resource and can specify the number of resource units that a single table entry consumes. For example, the mlxsw_host6 table uses the /kvd/hash_double resource and consumes two units per table entry:

$ devlink dpipe table show pci/0000:03:00.0
pci/0000:03:00.0:
  name mlxsw_erif size 1000 counters_enabled false
  match:
    type field_exact header mlxsw_meta field erif_port mapping ifindex
  action:
    type field_modify header mlxsw_meta field l3_forward
    type field_modify header mlxsw_meta field l3_drop

  name mlxsw_host4 size 0 counters_enabled false resource_path /kvd/hash_single resource_units 1
  match:
    type field_exact header mlxsw_meta field erif_port mapping ifindex
    type field_exact header ipv4 field destination ip
  action:
    type field_modify header ethernet field destination mac

  name mlxsw_host6 size 0 counters_enabled false resource_path /kvd/hash_double resource_units 2
  match:
    type field_exact header mlxsw_meta field erif_port mapping ifindex
    type field_exact header ipv6 field destination ip
  action:
    type field_modify header ethernet field destination mac

  name mlxsw_adj size 0 counters_enabled false resource_path /kvd/linear resource_units 1
  match:
    type field_exact header mlxsw_meta field adj_index
    type field_exact header mlxsw_meta field adj_size
    type field_exact header mlxsw_meta field adj_hash_index
  action:
    type field_modify header ethernet field destination mac
    type field_modify header mlxsw_meta field erif_port mapping ifindex
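
The resource_units value gives a rough upper bound on how many entries of a table fit into its resource. The Python sketch below only illustrates the arithmetic. The function name is hypothetical, and the bound is optimistic because a resource may also hold entries of other types (for example, hash double also holds IPv6 routes with a prefix length > 64).

```python
def max_table_entries(resource_size: int, resource_units: int) -> int:
    """Upper bound on table entries if the table had the whole resource to itself."""
    return resource_size // resource_units


# Sizes from the earlier devlink resource output, units from the dpipe output above.
print(max_table_entries(65408, 2))  # mlxsw_host6 on /kvd/hash_double -> 32704
print(max_table_entries(94336, 1))  # mlxsw_host4 on /kvd/hash_single -> 94336
```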

Suggested Spectrum Resource Profiles

As the KVD can be re-partitioned using the devlink tool, here are several recommended sets of partition sizes that can be configured:

Profile   Linear  Linear Singles  Linear Chunks  Linear Large Chunks  Hash Single  Hash Double
Default   98304   16384           49152          32768                94208        64512
Scale     64512   16128           32000          16384                137216       56320
IPv4 Max  64512   16128           32000          16384                160768       32768

Note: This is not relevant for Spectrum-2 as it only supports a single KVD memory pool.
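
A profile can be turned into the corresponding devlink commands with a short script. The sketch below is illustrative: it only builds command strings, using the resource set and reload syntax and the resource paths shown earlier on this page. The function and dictionary names are hypothetical, the device address is the example one used on this page, and the order in which the sizes are set is an assumption.

```python
# Sizes copied from the table of suggested profiles.
PROFILES = {
    "default": {"linear": 98304, "singles": 16384, "chunks": 49152,
                "large_chunks": 32768, "hash_single": 94208, "hash_double": 64512},
    "scale": {"linear": 64512, "singles": 16128, "chunks": 32000,
              "large_chunks": 16384, "hash_single": 137216, "hash_double": 56320},
    "ipv4_max": {"linear": 64512, "singles": 16128, "chunks": 32000,
                 "large_chunks": 16384, "hash_single": 160768, "hash_double": 32768},
}

# Resource paths as reported by "devlink resource show".
PATHS = {
    "linear": "/kvd/linear",
    "singles": "/kvd/linear/singles",
    "chunks": "/kvd/linear/chunks",
    "large_chunks": "/kvd/linear/large_chunks",
    "hash_single": "/kvd/hash_single",
    "hash_double": "/kvd/hash_double",
}


def profile_commands(profile: str, dev: str = "pci/0000:03:00.0") -> list:
    """Build the devlink commands that set every size of a profile, then reload."""
    cmds = [
        f"devlink resource set {dev} path {PATHS[name]} size {size}"
        for name, size in PROFILES[profile].items()
    ]
    # The new sizes stay pending until the driver is reloaded.
    cmds.append(f"devlink dev reload {dev}")
    return cmds


for cmd in profile_commands("scale"):
    print(cmd)
```

Printing the commands instead of running them makes it possible to review them first, which is useful because the reload resets the driver.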

Resource Monitoring

Resource consumption in Spectrum switches can be monitored with the resmon tool. The tool works by observing EMAD traffic between the driver and the firmware, and making note of messages that correspond to resource allocations and deallocations. It therefore maintains a model of KVD occupancy, and can show current consumption broken down by resource type:

$ systemctl start resmon
$ devlink dev reload pci/0000:06:00.0
$ resmon stats
Resource                      Usage
IPv4 LPM                      29 / 524288 (0%)
IPv6 LPM                      35 / 524288 (0%)
ATCAM                         12 / 524288 (0%)
ACL Action Set                1008 / 524288 (0%)
IPv4 Host Table               6 / 524288 (0%)
IPv6 Host Table               0 / 524288 (0%)
Adjacency Table               0 / 524288 (0%)
FDB Entry                     74 / 524288 (0%)
Total                         1164 / 524288 (0%)

resmon is supported on Spectrum-2 and above. On Spectrum-1, KVD linear (KVDL) is managed by the driver itself, so resmon never sees the releases. resmon will therefore run on Spectrum-1, but the KVDL data will be misleading. One option is to exclude KVDL-based resources from tracking:

$ resmon start exclude resources kvdl

(Alternatively, use systemctl edit resmon.service to create an override to the same effect.)

In addition, on Spectrum-1, the KVD hash is split between the KVD hash single and KVD hash double. Therefore, the maximum capacity of each resource is not the size of the KVD as in Spectrum-2 and above, but the size of the KVD partition where the resource is stored.

See the README, resmon(8), resmon-start(8) and resmon-stats(8) for further information.
