DPDK 16.04 igb: CRC length missing from byte statistics

created at 11-21-2021

Problem Description

On an i350 (igb) copper port, calling DPDK's rte_eth_stats_get to read the number of bytes sent by the interface returns a value that is short by the CRC length (4 bytes) for every packet. Any bps figure calculated from this counter is therefore too low.
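To make the size of the error concrete, here is a minimal sketch of a bps calculation from two byte-counter samples, with and without the CRC bytes added back. The helper names (bps_from_samples, bps_with_crc) and the CRC_LEN macro are made up for illustration and are not DPDK APIs; the counter values would come from rte_eth_stats_get in a real program.

```c
#include <stdint.h>

#define CRC_LEN 4u /* Ethernet CRC (FCS) length in bytes */

/* Bits per second from two byte-counter samples taken interval_s seconds apart. */
static double bps_from_samples(uint64_t bytes_prev, uint64_t bytes_now,
                               double interval_s)
{
    return (double)(bytes_now - bytes_prev) * 8.0 / interval_s;
}

/* Same rate, but with CRC_LEN bytes added back for every packet
 * counted in the same interval. */
static double bps_with_crc(uint64_t bytes_prev, uint64_t bytes_now,
                           uint64_t pkts_prev, uint64_t pkts_now,
                           double interval_s)
{
    uint64_t bytes = (bytes_now - bytes_prev) +
                     (pkts_now - pkts_prev) * CRC_LEN;
    return (double)bytes * 8.0 / interval_s;
}
```

For 1000 packets of 64 bytes in one second, the counter would report 60000 bytes, which gives 480000 bps instead of 512000 bps, an undercount of 6.25%. The relative error shrinks as packets get larger.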

Problem Analysis

How the driver obtains the sent and received byte statistics

The low-level statistics function of the igb driver is eth_igb_stats_get, which reads the statistics registers of the network card. The received and sent byte counts it returns to the upper layer are filled in as follows:

    rte_stats->ibytes   = stats->gorc;
    rte_stats->obytes   = stats->gotc;

The code that reads the registers is as follows:

    /* Workaround CRC bytes included in size, take away 4 bytes/packet */
    stats->gorc += E1000_READ_REG(hw, E1000_GORCL);
    stats->gorc += ((uint64_t)E1000_READ_REG(hw, E1000_GORCH) << 32);
    stats->gorc -= (stats->gprc - old_gprc) * ETHER_CRC_LEN;
    stats->gotc += E1000_READ_REG(hw, E1000_GOTCL);
    stats->gotc += ((uint64_t)E1000_READ_REG(hw, E1000_GOTCH) << 32);
    stats->gotc -= (stats->gptc - old_gptc) * ETHER_CRC_LEN;

In the logic above, ETHER_CRC_LEN bytes are subtracted for every packet sent and received. The code comment says this is a workaround for the registers including the CRC bytes in the size of each packet.
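The effect of that workaround can be reproduced in a small stand-alone model. This is only a sketch: struct hw_counters and accumulate are hypothetical names, and the register reads are replaced by plain deltas passed in by the caller.

```c
#include <stdint.h>

#define CRC_LEN 4u /* same value as ETHER_CRC_LEN */

/* Hypothetical software copy of one direction's counters. */
struct hw_counters {
    uint64_t good_octets;  /* models stats->gorc or stats->gotc */
    uint64_t good_packets; /* models stats->gprc or stats->gptc */
};

/* Add one read of the (clear-on-read) hardware counters, then take away
 * CRC_LEN bytes for every packet counted since the previous read. */
static void accumulate(struct hw_counters *st,
                       uint64_t octets_delta, uint64_t packets_delta)
{
    uint64_t old_packets = st->good_packets;

    st->good_packets += packets_delta;
    st->good_octets += octets_delta;
    st->good_octets -= (st->good_packets - old_packets) * CRC_LEN;
}
```

If the hardware counts three 64-byte frames as 192 octets, accumulate leaves good_octets at 180, that is 60 bytes per packet.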

DPDK has a configuration option, hw_strip_crc, that controls whether the network card strips the CRC. The default is 0, meaning the card does not strip the CRC; setting it to 1 enables CRC stripping on the card.

The impact of hw_strip_crc in igb driver

1. Impact on hardware

When an interface on an igb network card is brought up, DPDK calls the eth_igb_rx_init function. This function checks hw_strip_crc and sets the hardware state accordingly.

The relevant code is as follows:

    /* Setup the Receive Control Register. */
            if (dev->data->dev_conf.rxmode.hw_strip_crc) {
                    rctl |= E1000_RCTL_SECRC; /* Strip Ethernet CRC. */

                    /* set STRCRC bit in all queues */
                    if (hw->mac.type == e1000_i350 ||
                        hw->mac.type == e1000_i210 ||
                        hw->mac.type == e1000_i211 ||
                        hw->mac.type == e1000_i354) {
                            for (i = 0; i < dev->data->nb_rx_queues; i++) {
                                    rxq = dev->data->rx_queues[i];
                                    uint32_t dvmolr = E1000_READ_REG(hw,
                                            E1000_DVMOLR(rxq->reg_idx));
                                    dvmolr |= E1000_DVMOLR_STRCRC;
                                    E1000_WRITE_REG(hw, E1000_DVMOLR(rxq->reg_idx), dvmolr);
                            }
                    }
            } else {
                    rctl &= ~E1000_RCTL_SECRC; /* Do not Strip Ethernet CRC. */

                    /* clear STRCRC bit in all queues */
                    if (hw->mac.type == e1000_i350 ||
                        hw->mac.type == e1000_i210 ||
                        hw->mac.type == e1000_i211 ||
                        hw->mac.type == e1000_i354) {
                            for (i = 0; i < dev->data->nb_rx_queues; i++) {
                                    rxq = dev->data->rx_queues[i];
                                    uint32_t dvmolr = E1000_READ_REG(hw,
                                            E1000_DVMOLR(rxq->reg_idx));
                                    dvmolr &= ~E1000_DVMOLR_STRCRC;
                                    E1000_WRITE_REG(hw, E1000_DVMOLR(rxq->reg_idx), dvmolr);
                            }
                    }
            }

The logic above shows that the igb PMD driver uses the hw_strip_crc setting to program the receive control register (RCTL) of the network card and, on the i350, i210, i211 and i354, the DVMOLR register of each packet receiving queue.

Our program leaves hw_strip_crc off by default, so the network card does not strip the CRC, and the CRC length is subtracted for each received packet when the byte statistics are read. This behavior is consistent with the code comment. However, when hw_strip_crc is enabled, the CRC length is still subtracted for each received packet, which looks like a problem.

The preliminary explanation is that CRC stripping on the network card does not change the byte count the hardware keeps for each packet. In other words, the byte statistics do not depend on whether hw_strip_crc is enabled.

Use testpmd to test:

CRC strip disabled

   testpmd> start
     io packet forwarding - CRC stripping disabled - packets/burst=32
     nb forwarding cores=1 - nb forwarding ports=1
     RX queues=1 - RX desc=128 - RX free threshold=32
     RX threshold registers: pthresh=8 hthresh=8 wthresh=4
     TX queues=1 - TX desc=512 - TX free threshold=0
     TX threshold registers: pthresh=8 hthresh=1 wthresh=16
     TX RS bit threshold=0 - TXQ flags=0x0
   testpmd> show port stats all

     ######################## NIC statistics for port 0  ########################
     RX-packets: 0          RX-missed: 0          RX-bytes:  0
     RX-errors: 0
     RX-nombuf:  0
     TX-packets: 0          TX-errors: 0          TX-bytes:  0
     ############################################################################
   testpmd> show port stats all

     ######################## NIC statistics for port 0  ########################
     RX-packets: 3          RX-missed: 0          RX-bytes:  180
     RX-errors: 0
     RX-nombuf:  0
     TX-packets: 3          TX-errors: 0          TX-bytes:  180
     ############################################################################

The peer sends three 64-byte packets. RX-bytes reports 180, that is 3 x (64 - 4), so the CRC length has been subtracted from each packet.

CRC strip enabled

 testpmd> start
   io packet forwarding - CRC stripping enabled - packets/burst=32
   nb forwarding cores=1 - nb forwarding ports=1
   RX queues=1 - RX desc=128 - RX free threshold=32
   RX threshold registers: pthresh=8 hthresh=8 wthresh=4
   TX queues=1 - TX desc=512 - TX free threshold=0
   TX threshold registers: pthresh=8 hthresh=1 wthresh=16
   TX RS bit threshold=0 - TXQ flags=0x0

 testpmd> show port stats 0

   ######################## NIC statistics for port 0  ########################
   RX-packets: 6          RX-missed: 0          RX-bytes:  360
   RX-errors: 0
   RX-nombuf:  0
   TX-packets: 6          TX-errors: 0          TX-bytes:  360
   ############################################################################
 testpmd> show port stats 0

   ######################## NIC statistics for port 0  ########################
   RX-packets: 9          RX-missed: 0          RX-bytes:  540
   RX-errors: 0
   RX-nombuf:  0
   TX-packets: 9          TX-errors: 0          TX-bytes:  540
   ############################################################################

The peer again sends three 64-byte packets. RX-bytes grows from 360 to 540, again 180 bytes for three packets, so the CRC length is still subtracted. This matches the result with CRC strip disabled and supports the guess above.
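The comparison between the two runs comes down to the average bytes per packet between two snapshots. A minimal sketch of that check follows; struct port_sample and bytes_per_packet are hypothetical names used only for this illustration, and the values are taken from the testpmd output above.

```c
#include <stdint.h>

/* One snapshot of the RX (or TX) counters shown by "show port stats". */
struct port_sample {
    uint64_t packets;
    uint64_t bytes;
};

/* Average bytes per packet between two snapshots; 0 if no packets arrived. */
static uint64_t bytes_per_packet(struct port_sample before,
                                 struct port_sample after)
{
    uint64_t packets = after.packets - before.packets;

    return packets ? (after.bytes - before.bytes) / packets : 0;
}
```

With CRC strip disabled the snapshots are {0, 0} and {3, 180}; with CRC strip enabled they are {6, 360} and {9, 540}. Both give 60 bytes per packet for 64-byte frames.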

2. Impact on software

There is the following code in the eth_igb_rx_init function:

 rxq->crc_len = (uint8_t)(dev->data->dev_conf.rxmode.hw_strip_crc ?
                                                         0 : ETHER_CRC_LEN);

This code uses the hw_strip_crc configuration to determine whether crc_len is subtracted in the packet receiving queue:

When hw_strip_crc is turned on, rxq->crc_len is set to 0: nothing needs to be subtracted, because the network card has already stripped the CRC.
When hw_strip_crc is turned off, rxq->crc_len is set to ETHER_CRC_LEN, and that length is subtracted in the packet receiving logic. The packet length calculated this way is written to the pkt_len field of the mbuf that holds the packet.
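The two cases can be sketched as follows. This is a simplified model, not the driver code: struct rx_queue_model, rx_queue_init and rx_pkt_len are hypothetical names, and the assumption is that the descriptor length reported by the card is 60 for a 64-byte frame when stripping is on and 64 when it is off.

```c
#include <stdint.h>

#define CRC_LEN 4u /* same value as ETHER_CRC_LEN */

/* Hypothetical stand-in for the crc_len field of the receive queue. */
struct rx_queue_model {
    uint8_t crc_len;
};

/* Mirrors the assignment in eth_igb_rx_init quoted above. */
static void rx_queue_init(struct rx_queue_model *rxq, int hw_strip_crc)
{
    rxq->crc_len = (uint8_t)(hw_strip_crc ? 0 : CRC_LEN);
}

/* Packet length stored in the mbuf: descriptor length minus crc_len. */
static uint16_t rx_pkt_len(const struct rx_queue_model *rxq, uint16_t desc_len)
{
    return (uint16_t)(desc_len - rxq->crc_len);
}
```

Under that assumption, both settings end with the same pkt_len of 60 for a 64-byte frame; the only difference is whether the hardware or the software removes the 4 CRC bytes.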

Processing of crc len when sending a packet

The CRC has to be filled in when a packet is sent, and the send path has no special handling for it. In the igb PMD driver, the CRC length of each sent packet is subtracted from the byte statistics.

Solution

Modify the statistics code of the igb driver to remove the logic that subtracts the CRC length for each packet. The patch below comments it out for both the received and the sent byte counters:

Index: drivers/net/e1000/igb_ethdev.c
===================================================================
--- drivers/net/e1000/igb_ethdev.c     
+++ drivers/net/e1000/igb_ethdev.c
@@ -1729,12 +1729,13 @@
        /* Both registers clear on the read of the high dword */

        /* Workaround CRC bytes included in size, take away 4 bytes/packet */
+       /* included CRC length to fix igb netcard bps leak */
        stats->gorc += E1000_READ_REG(hw, E1000_GORCL);
        stats->gorc += ((uint64_t)E1000_READ_REG(hw, E1000_GORCH) << 32);
-        stats->gorc -= (stats->gprc - old_gprc) * ETHER_CRC_LEN;
+       /* stats->gorc -= (stats->gprc - old_gprc) * ETHER_CRC_LEN; */
        stats->gotc += E1000_READ_REG(hw, E1000_GOTCL);
        stats->gotc += ((uint64_t)E1000_READ_REG(hw, E1000_GOTCH) << 32);
-       stats->gotc -= (stats->gptc - old_gptc) * ETHER_CRC_LEN;
+       /* stats->gotc -= (stats->gptc - old_gptc) * ETHER_CRC_LEN; */

        stats->rnbc += E1000_READ_REG(hw, E1000_RNBC);
        stats->ruc += E1000_READ_REG(hw, E1000_RUC);

How do other network cards deal with the hw_strip_crc configuration?

ixgbe: consistent with igb processing, hardware + software

i40e: Only used to set rxq->crc_len, no hardware related configuration

ice: same as i40e

edited at: 11-21-2021