Vector PMD (vPMD) uses Intel® SIMD instructions to optimize packet I/O. It improves the load/store bandwidth efficiency of the L1 data cache by using wider SSE/AVX registers. A wider register can hold multiple packet buffers, which reduces the number of instructions needed to process packets in bulk.
There is no change to the PMD API. The RX/TX handlers are the only two entry points for vPMD packet I/O. They are transparently registered at runtime for RX/TX execution if all condition checks pass.
Some constraints apply as pre-conditions for specific optimizations on bulk packet transfers. The following sections explain RX and TX constraints in the vPMD.
The following prerequisites apply:
Ensure that the following pre-conditions are satisfied:
These conditions are checked in the code.
Scattered packets are not supported in this mode. If an incoming packet is larger than the maximum acceptable data size of one “mbuf” (2 KB by default), vPMD for RX is disabled.
By default, IXGBE_MAX_RING_DESC is set to 4096 and RTE_PMD_IXGBE_RX_MAX_BURST is set to 32.
Some features are not supported when trying to increase the throughput in vPMD. They are:
Other features are supported using optional MACRO configuration. They include:
To guarantee the constraint, capabilities in dev_conf.rxmode.offloads will be checked:
fdir_conf->mode will also be checked.
As vPMD is focused on high throughput, it assumes that the RX burst size is at least 32 packets per burst. The receive handler returns zero if it is called with nb_pkt < 32 as the expected packet number.
The only prerequisite is related to tx_rs_thresh. The tx_rs_thresh value must be greater than or equal to RTE_PMD_IXGBE_TX_MAX_BURST, but less than or equal to RTE_IXGBE_TX_MAX_FREE_BUF_SZ. Consequently, by default the tx_rs_thresh value is in the range 32 to 64.
TX vPMD only works when offloads is set to 0.
This means that it does not support any TX offload.
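The two TX conditions above can be sketched as a single check. This is a minimal illustration, not the driver's own code: the helper name tx_vpmd_eligible is hypothetical, and the two constant values are assumptions taken from the default range of 32 to 64 stated above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed default values, taken from the 32-to-64 range stated above. */
#define RTE_PMD_IXGBE_TX_MAX_BURST    32
#define RTE_IXGBE_TX_MAX_FREE_BUF_SZ  64

/*
 * Hypothetical helper: returns true when a TX queue configuration meets
 * both TX vPMD conditions (no TX offloads, tx_rs_thresh within range).
 */
static bool
tx_vpmd_eligible(uint16_t tx_rs_thresh, uint64_t offloads)
{
	if (offloads != 0)
		return false;	/* any TX offload rules out the vector path */

	return tx_rs_thresh >= RTE_PMD_IXGBE_TX_MAX_BURST &&
	       tx_rs_thresh <= RTE_IXGBE_TX_MAX_FREE_BUF_SZ;
}
```

With these assumed defaults, tx_vpmd_eligible(32, 0) and tx_vpmd_eligible(64, 0) are true, while a threshold of 31 or 65, or any non-zero offloads value, gives false.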
In DPDK release v16.11 an API for ixgbe specific functions has been added to the ixgbe PMD. The declarations for the API functions are in the header rte_pmd_ixgbe.h.
When running l3fwd with vPMD, there is one thing to note. In the configuration, ensure that DEV_RX_OFFLOAD_CHECKSUM in port_conf.rxmode.offloads is NOT set. Otherwise, by default, RX vPMD is disabled.
As in the case of l3fwd, to enable vPMD, do NOT set DEV_RX_OFFLOAD_CHECKSUM in port_conf.rxmode.offloads. In addition, for improved performance, use -bsz "(32,32),(64,64),(32,32)" in load_balancer to avoid using the default burst size of 144.
The Intel x550 series NICs support a feature called MDD (Malicious Driver Detection), which checks the behavior of the VF driver. If this feature is enabled, the VF must use the advanced context descriptor correctly and set the CC (Check Context) bit. The DPDK PF does not support MDD, but the kernel PF does, so problems can arise when a kernel PF is combined with a DPDK VF. If the user enables MDD in the kernel PF, the DPDK VF will not work, because the kernel PF treats the VF as malicious. The VF is not actually malicious; it simply does not behave as MDD requires. Supporting MDD would have a significant performance impact: DPDK would have to check whether the advanced context descriptor needs to be set and then set it, and it would have to obtain the header length from the upper layer, because parsing the packet itself is not acceptable. Supporting MDD is therefore too expensive. When using a kernel PF with a DPDK VF on x550, make sure to use a kernel PF driver that disables MDD or allows MDD to be disabled.
Some kernel drivers already disable MDD by default, while others can disable it with the command insmod ixgbe.ko MDD=0,0. Each “0” in the command refers to a port. For example, if there are 6 ixgbe ports, the command should be changed to insmod ixgbe.ko MDD=0,0,0,0,0,0.
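The one-“0”-per-port rule can be sketched as a small string builder. This is only an illustration: the helper name build_mdd_disable_cmd is hypothetical, and it merely formats the command text described above without running it.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: writes "insmod ixgbe.ko MDD=0,0,...,0" into buf,
 * with one "0" per ixgbe port. Returns 0 on success, or -1 if nports is
 * zero or buf is too small to hold the command.
 */
static int
build_mdd_disable_cmd(char *buf, size_t len, unsigned int nports)
{
	static const char prefix[] = "insmod ixgbe.ko MDD=";
	size_t need;
	char *p;
	unsigned int i;

	if (buf == NULL || nports == 0)
		return -1;

	/* prefix + one '0' per port + commas between ports + final NUL */
	need = (sizeof(prefix) - 1) + (size_t)nports + ((size_t)nports - 1) + 1;
	if (len < need)
		return -1;

	strcpy(buf, prefix);
	p = buf + sizeof(prefix) - 1;
	for (i = 0; i < nports; i++) {
		if (i > 0)
			*p++ = ',';
		*p++ = '0';
	}
	*p = '\0';
	return 0;
}
```

Called with nports set to 6, the helper fills the buffer with the six-port command given above.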
The statistics of the ixgbe hardware must be polled regularly in order for them to remain consistent. Running a DPDK application without polling the statistics will cause the hardware registers to count up to the maximum value and “stick” at that value.
In order to avoid the statistics registers ever reaching the maximum value, read the statistics from the hardware using rte_eth_stats_get() or rte_eth_xstats_get().
The maximum time between statistics polls that ensures consistent results can be calculated as follows:
max_read_interval = UINT_MAX / max_packets_per_second
max_read_interval = 4294967295 / 14880952
max_read_interval = 288.6218096127183 (seconds)
max_read_interval = ~4 mins 48 sec.
In order to ensure valid results, it is recommended to poll every 4 minutes.
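The calculation above can be reproduced for other packet rates with a short sketch. The helper name max_read_interval_sec is hypothetical, and the code assumes the counters are 32 bits wide, which is what the UINT_MAX value of 4294967295 in the formula implies; UINT32_MAX is used to make that width explicit.

```c
#include <stdint.h>

/*
 * Hypothetical helper: longest interval, in seconds, between statistics
 * polls before a 32-bit packet counter can reach its maximum value at
 * the given packet rate. Returns 0.0 for a rate of zero.
 */
static double
max_read_interval_sec(uint64_t max_packets_per_second)
{
	if (max_packets_per_second == 0)
		return 0.0;

	return (double)UINT32_MAX / (double)max_packets_per_second;
}
```

For the rate used above, max_read_interval_sec(14880952) gives roughly 288.6 seconds, in line with the recommendation to poll every 4 minutes.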
Although the user can set the MTU separately on PF and VF ports, the ixgbe NIC only supports one global MTU per physical port. So when the user sets different MTUs on the PF and VF ports of one physical port, the real MTU for all these PF and VF ports is the largest value set. This behavior is based on the kernel driver behavior.
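The “largest value wins” rule can be sketched as follows. This models only the behavior described above; the helper name effective_port_mtu is hypothetical and is not part of any DPDK API.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: given the MTUs requested on the PF and VF ports
 * that share one physical port, returns the MTU that actually applies
 * to all of them, which is the largest value set. Returns 0 if n is 0.
 */
static uint16_t
effective_port_mtu(const uint16_t *requested_mtus, size_t n)
{
	uint16_t largest = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		if (requested_mtus[i] > largest)
			largest = requested_mtus[i];
	}
	return largest;
}
```

For example, if the PF requests 1500 and two VFs request 9000 and 2000, all three ports effectively run with an MTU of 9000.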
On ixgbe, the concept of “pool” can be used for different things depending on the mode. In VMDq mode, “pool” means a VMDq pool. In IOV mode, “pool” means a VF.
There is no RTE API to add a VF’s MAC address from the PF. On ixgbe, the rte_eth_dev_mac_addr_add() function can be used to add a VF’s MAC address, as a workaround.
X550 cannot get interrupts when using the uio_pci_generic module or the legacy interrupt mode of igb_uio or vfio, because the X550 errata state that the Interrupt Status bit is not implemented. This is item #22 in the X550 spec update.
When using the uio_pci_generic module or the legacy interrupt mode of igb_uio or vfio, the Interrupt Status bit is checked when an interrupt arrives. Since the bit is not implemented in X550, the IRQ cannot be handled correctly and the event fd cannot be reported to DPDK apps. The apps then cannot get interrupts, and dmesg shows messages like irq #No.: nobody cared.
Do not bind the uio_pci_generic module to X550 NICs. Do not bind igb_uio in legacy mode to X550 NICs. Before binding vfio in legacy mode to X550 NICs, load the vfio module with modprobe vfio nointxmask=1 if the INTx is not shared with other devices.
Inline IPsec processing is supported for RTE_SECURITY_ACTION_TYPE_INLINE_CRYPTO mode for ESP packets only:
IPsec Security Gateway Sample Application supports inline IPsec processing for ixgbe PMD.
For more details see the IPsec Security Gateway Sample Application and Security library documentation.
The IXGBE PF PMD supports the creation of VF port representors for the control and monitoring of IXGBE virtual function devices. Each port representor corresponds to a single virtual function of that device. Using the devargs option representor, the user can specify which virtual functions to create port representors for when the PF PMD is initialized, by passing the IDs of the required VFs:
-w DBDF,representor=[0,1,4]
Currently, hot-plugging of representor ports is not supported, so all required representors must be specified when the PF is created.