MQPRIO(8) Linux MQPRIO(8)

MQPRIO - Multiqueue Priority Qdisc (Offloaded Hardware QOS)

tc qdisc ... dev dev ( parent classid | root) [ handle major: ] mqprio [ num_tc tcs ] [ map P0 P1 P2... ] [ queues count1@offset1 count2@offset2 ... ] [ hw 1|0 ] [ mode dcb|channel ] [ shaper dcb|bw_rlimit ] [ min_rate min_rate1 min_rate2 ... ] [ max_rate max_rate1 max_rate2 ... ] [ fp FP0 FP1 FP2 ... ]

The MQPRIO qdisc is a simple queuing discipline that allows mapping traffic flows to hardware queue ranges using priorities and a configurable priority to traffic class mapping. A traffic class in this context is a set of contiguous qdisc classes which map 1:1 to a set of hardware exposed queues.

By default the qdisc allocates a pfifo qdisc (packet limited first in, first out queue) per TX queue exposed by the lower layer device. Other queuing disciplines may be added subsequently. Packets are mapped to a traffic class using the map parameter and then hashed across the queues given by that class's offset and count. By default these parameters are configured by the hardware driver to match the hardware QOS structures.
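
The enqueue path described above can be modelled in a few lines of Python. This is only a sketch: the function name select_queue, the flow_hash argument and the use of a plain modulo are assumptions made for illustration, and the kernel's actual hash and scaling may differ.

```python
# Simplified model of mqprio queue selection; not the kernel implementation.

def select_queue(priority, prio_tc_map, queues, flow_hash):
    """Return the TX queue index for a packet.

    priority:    SKB priority, assumed here to be in the range 0..15
    prio_tc_map: list of 16 traffic class indices, one per priority
    queues:      list of (count, offset) tuples, one per traffic class
    flow_hash:   hypothetical per-flow hash value (assumption of this sketch)
    """
    tc = prio_tc_map[priority]           # priority -> traffic class ("map")
    count, offset = queues[tc]           # traffic class -> queue range ("queues")
    return offset + flow_hash % count    # spread flows across the range


# Layout: tc0 -> queue 0, tc1 -> queue 1, tc2 -> queues 2 and 3
PRIO_TC_MAP = [0, 0, 0, 0, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2]
QUEUES = [(1, 0), (1, 1), (2, 2)]

queue_for_prio5 = select_queue(5, PRIO_TC_MAP, QUEUES, flow_hash=7)
queue_for_prio9 = select_queue(9, PRIO_TC_MAP, QUEUES, flow_hash=5)
```

A class with a single queue always yields that queue; only classes that span several queues depend on the hash.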

Channel mode supports full offload of the mqprio options, the traffic classes, the queue configurations and QOS attributes to the hardware. Enabled hardware can provide hardware QOS with the ability to steer traffic flows to designated traffic classes provided by this qdisc. Hardware based QOS is configured using the shaper parameter. The bw_rlimit shaper, with minimum and maximum bandwidth rates, can be used to set transmission rates on each traffic class. Further qdiscs may also be added to the classes of MQPRIO to create more complex configurations.

On creation with 'tc qdisc add', eight traffic classes are created mapping priorities 0..7 to traffic classes 0..7 and priorities greater than 7 to traffic class 0. This requires base driver support and the creation will fail on devices that do not support hardware QOS schemes.

These defaults can be overridden using the qdisc parameters. Providing the 'hw 0' flag allows software to run without hardware coordination.

If hardware coordination is being used and arguments are provided that the hardware cannot support, then an error is returned. For many users hardware defaults should work reasonably well.

As one specific example numerous Ethernet cards support the 802.1Q link strict priority transmission selection algorithm (TSA). MQPRIO enabled hardware in conjunction with the classification methods below can provide hardware offloaded support for this TSA.

Multiple methods are available to set the SKB priority, which MQPRIO uses to select the traffic class in which to enqueue the packet.

A process with sufficient privileges can encode the destination class directly with SO_PRIORITY, see socket(7).
An iptables/nftables rule can be created to match traffic flows and set the priority. See iptables(8).
The net_prio cgroup can be used to set the priority of all sockets belonging to an application. See kernel and cgroup documentation for details.
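
As an illustration of the first method, a process can set SO_PRIORITY on a socket before sending. The sketch below uses Python's standard socket module on Linux; the helper name and the priority value 4 are arbitrary choices for the example. According to socket(7), priorities outside the range 0 to 6 require CAP_NET_ADMIN.

```python
import socket


def open_udp_socket_with_priority(priority):
    """Create a UDP socket whose packets carry the given SKB priority."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    # SO_PRIORITY sets the priority that mqprio later looks up in its map.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_PRIORITY, priority)
    return sock


sock = open_udp_socket_with_priority(4)
# Read the value back to confirm the kernel accepted it.
effective = sock.getsockopt(socket.SOL_SOCKET, socket.SO_PRIORITY)
sock.close()
```

With a map such as "0 0 0 0 1 1 1 1 ...", packets sent on this socket would be enqueued in traffic class 1.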

num_tc - Number of traffic classes to use. Up to 16 classes supported. You cannot have more classes than queues.
map - The priority to traffic class map. Maps priorities 0..15 to a specified traffic class.
queues - Provide count and offset of queue range for each traffic class. In the format, count@offset. Queue ranges for each traffic class cannot overlap and must be a contiguous range of queues.
hw - Set to 1 to support hardware offload. Set to 0 to configure user specified values in software only. The default value of this parameter is 1.
mode - Set to channel for full use of the mqprio options. Use dcb to offload only TC values and use hardware QOS defaults. Supported with 'hw' set to 1 only.
shaper - Use bw_rlimit to set bandwidth rate limits for a traffic class. Use dcb for hardware QOS defaults. Supported with 'hw' set to 1 only.
min_rate - Minimum value of bandwidth rate limit for a traffic class. Supported only when the 'shaper' argument is set to 'bw_rlimit'.
max_rate - Maximum value of bandwidth rate limit for a traffic class. Supported only when the 'shaper' argument is set to 'bw_rlimit'.
fp - Selects whether traffic classes are express (deliver packets via the eMAC) or preemptible (deliver packets via the pMAC), according to IEEE 802.1Q-2018 clause 6.7.2 Frame preemption. Takes the form of an array (one element per traffic class) with values being 'E' (for express) or 'P' (for preemptible).

Multiple priorities which map to the same traffic class, as well as multiple TXQs which map to the same traffic class, must have the same FP attributes. To interpret the FP as an attribute per priority, the 'map' argument can be used for translation. To interpret FP as an attribute per TXQ, the 'queues' argument can be used for translation.

Traffic classes are express by default. The argument is supported only with 'hw' set to 1. Preemptible traffic classes are accepted only if the device has a MAC Merge layer configurable through ethtool(8).
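
The translation of FP through 'map' and 'queues' can be sketched as follows. The function names and the sample FP array are hypothetical and serve only to show how one per-class array expands into a per-priority or a per-TXQ view.

```python
def fp_per_priority(fp, prio_tc_map):
    """Expand a per-traffic-class FP array to one entry per priority."""
    return [fp[tc] for tc in prio_tc_map]


def fp_per_txq(fp, queues):
    """Expand a per-traffic-class FP array to a {txq: 'E' or 'P'} dict.

    queues is a list of (count, offset) tuples, one per traffic class.
    """
    result = {}
    for tc, (count, offset) in enumerate(queues):
        for txq in range(offset, offset + count):
            result[txq] = fp[tc]
    return result


# Sample values: tc0 and tc1 express, tc2 preemptible
FP = ["E", "E", "P"]
PRIO_TC_MAP = [0, 0, 0, 0, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2]
QUEUES = [(1, 0), (1, 1), (2, 2)]

by_priority = fp_per_priority(FP, PRIO_TC_MAP)
by_txq = fp_per_txq(FP, QUEUES)
```

Because the expansion always goes through the traffic class, two priorities or two TXQs that share a class can never end up with different FP values.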

ethtool(8)

The following example shows how to attach priorities to 4 traffic classes ("num_tc 4"), and then how to pair these traffic classes with 4 hardware queues, with hardware coordination ("hw 1", which may be omitted because 1 is the default value). Traffic class 0 (tc0) is mapped to hardware queue 0 (q0), tc1 is mapped to q1, tc2 is mapped to q2, and tc3 is mapped to q3.

# tc qdisc add dev eth0 root mqprio \
              num_tc 4 \
              map 0 0 0 0 1 1 1 1 2 2 2 2 3 3 3 3 \
              queues 1@0 1@1 1@2 1@3 \
              hw 1

The next example shows how to attach priorities to 3 traffic classes ("num_tc 3"), and how to pair these traffic classes with 4 queues, without hardware coordination ("hw 0"). Traffic class 0 (tc0) is mapped to hardware queue 0 (q0), tc1 is mapped to q1, and tc2 is mapped to q2 and q3, where the choice between these two queues is made by hashing and therefore appears somewhat random.

# tc qdisc add dev eth0 root mqprio \
              num_tc 3 \
              map 0 0 0 0 1 1 1 1 2 2 2 2 2 2 2 2 \
              queues 1@0 1@1 2@2 \
              hw 0

In both examples above, the priority values from 0 to 3 (prio0-3) are mapped to tc0, prio4-7 are mapped to tc1, and prio8-11 are mapped to tc2 ("map" attribute). The last four priority values (prio12-15) are mapped differently in the two examples: to tc3 in the first example and to tc2 in the second. The resulting mappings for the two examples are the following:


┌────┬────┬───────┐  ┌────┬────┬────────┐
│Prio│ tc │ queue │  │Prio│ tc │ queue  │
├────┼────┼───────┤  ├────┼────┼────────┤
│  0 │  0 │     0 │  │  0 │  0 │      0 │
│  1 │  0 │     0 │  │  1 │  0 │      0 │
│  2 │  0 │     0 │  │  2 │  0 │      0 │
│  3 │  0 │     0 │  │  3 │  0 │      0 │
│  4 │  1 │     1 │  │  4 │  1 │      1 │
│  5 │  1 │     1 │  │  5 │  1 │      1 │
│  6 │  1 │     1 │  │  6 │  1 │      1 │
│  7 │  1 │     1 │  │  7 │  1 │      1 │
│  8 │  2 │     2 │  │  8 │  2 │ 2 or 3 │
│  9 │  2 │     2 │  │  9 │  2 │ 2 or 3 │
│ 10 │  2 │     2 │  │ 10 │  2 │ 2 or 3 │
│ 11 │  2 │     2 │  │ 11 │  2 │ 2 or 3 │
│ 12 │  3 │     3 │  │ 12 │  2 │ 2 or 3 │
│ 13 │  3 │     3 │  │ 13 │  2 │ 2 or 3 │
│ 14 │  3 │     3 │  │ 14 │  2 │ 2 or 3 │
│ 15 │  3 │     3 │  │ 15 │  2 │ 2 or 3 │
└────┴────┴───────┘  └────┴────┴────────┘
     example1              example2
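
The two tables can be derived mechanically from the 'map' and 'queues' arguments. The following Python sketch does so; the helper names parse_queues and mapping_rows are made up for this illustration and are not part of tc.

```python
def parse_queues(spec):
    """Parse 'count@offset ...' into a list of (count, offset) tuples."""
    pairs = []
    for item in spec.split():
        count, offset = item.split("@")
        pairs.append((int(count), int(offset)))
    return pairs


def mapping_rows(map_spec, queues_spec):
    """Return one (priority, tc, [candidate queues]) row per priority."""
    queues = parse_queues(queues_spec)
    rows = []
    for prio, tc in enumerate(int(t) for t in map_spec.split()):
        count, offset = queues[tc]
        rows.append((prio, tc, list(range(offset, offset + count))))
    return rows


example1 = mapping_rows("0 0 0 0 1 1 1 1 2 2 2 2 3 3 3 3", "1@0 1@1 1@2 1@3")
example2 = mapping_rows("0 0 0 0 1 1 1 1 2 2 2 2 2 2 2 2", "1@0 1@1 2@2")

# Print the second table in a compact form
for prio, tc, queue_list in example2:
    print(f"{prio:>4} {tc:>2}  {' or '.join(str(q) for q in queue_list)}")
```

Each row lists every queue a priority may land on; where a class spans more than one queue, the actual queue is chosen per flow by hashing.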

Another example of queue mapping is the following. There are 5 traffic classes, and there are 8 hardware queues.

# tc qdisc add dev eth0 root mqprio \
              num_tc 5 \
              map 0 0 0 1 1 1 1 2 2 3 3 4 4 4 4 4 \
              queues 1@0 2@1 1@3 1@4 3@5

The value mapping is the following for this example:


       ┌───────┐
tc0────┤Queue 0│◄────1@0
       ├───────┤
     ┌─┤Queue 1│◄────2@1
tc1──┤ ├───────┤
     └─┤Queue 2│
       ├───────┤
tc2────┤Queue 3│◄────1@3
       ├───────┤
tc3────┤Queue 4│◄────1@4
       ├───────┤
     ┌─┤Queue 5│◄────3@5
     │ ├───────┤
tc4──┼─┤Queue 6│
     │ ├───────┤
     └─┤Queue 7│
       └───────┘
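
Before handing such a layout to tc, it can be checked against the constraints stated for 'num_tc' and 'queues' (at most 16 classes, no more classes than queues, non-overlapping ranges). The sketch below is a hypothetical pre-flight check written for this illustration; it does not reproduce the validation performed by the kernel or any driver.

```python
def check_mqprio(num_tc, queues_spec, num_tx_queues):
    """Return a list of problems found; an empty list means none detected."""
    problems = []
    ranges = []
    for item in queues_spec.split():
        count, offset = (int(x) for x in item.split("@"))
        ranges.append((offset, offset + count))   # half-open [start, end)

    if num_tc > 16:
        problems.append("more than 16 traffic classes")
    if num_tc > num_tx_queues:
        problems.append("more traffic classes than queues")
    if len(ranges) != num_tc:
        problems.append("need exactly one count@offset entry per traffic class")

    used = set()
    for tc, (start, end) in enumerate(ranges):
        if end <= start:
            problems.append(f"tc{tc} has no queues")
        if end > num_tx_queues:
            problems.append(f"tc{tc} exceeds the {num_tx_queues} available queues")
        wanted = set(range(start, end))
        if used & wanted:
            problems.append(f"tc{tc} overlaps an earlier traffic class")
        used |= wanted
    return problems


# The 5-class, 8-queue example from above passes
ok = check_mqprio(5, "1@0 2@1 1@3 1@4 3@5", 8)
# Queue 1 is claimed by both classes here
bad = check_mqprio(2, "2@0 2@1", 4)
```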

John Fastabend, <john.r.fastabend@intel.com>

24 Sept 2013 iproute2