High Packet Drop Rate with DPDK compared to AF_PACKET in Suricata 7.0.7

I hope this message finds you well. I’m reaching out to seek assistance regarding a significant performance issue I’ve been experiencing with Suricata version 7.0.7 RELEASE running in IDS mode on my system. Specifically, I’m encountering a high packet drop rate (~45%) when operating in DPDK run mode, whereas the performance with AF_PACKET is notably better.


System Overview

  • Suricata Version: 7.0.7 installed from source
  • Operating Mode: IDS
  • Hardware Specifications:
    • CPU: 20 logical CPUs (0-19)
  $ lscpu -e
  CPU NODE SOCKET CORE L1d:L1i:L2:L3 ONLINE    MAXMHZ   MINMHZ     MHZ
    0    0      0    0 0:0:0:0           si 4900,0000 800,0000 800.000
    1    0      0    0 0:0:0:0           si 4900,0000 800,0000 800.000
    2    0      0    1 4:4:1:0           si 4900,0000 800,0000 800.000
    3    0      0    1 4:4:1:0           si 4900,0000 800,0000 800.000
    4    0      0    2 8:8:2:0           si 4900,0000 800,0000 800.000
    5    0      0    2 8:8:2:0           si 4900,0000 800,0000 800.000
    6    0      0    3 12:12:3:0         si 4900,0000 800,0000 800.000
    7    0      0    3 12:12:3:0         si 4900,0000 800,0000 800.000
    8    0      0    4 16:16:4:0         si 5000,0000 800,0000 800.000
    9    0      0    4 16:16:4:0         si 5000,0000 800,0000 800.574
   10    0      0    5 20:20:5:0         si 5000,0000 800,0000 800.000
   11    0      0    5 20:20:5:0         si 5000,0000 800,0000 800.000
   12    0      0    6 24:24:6:0         si 4900,0000 800,0000 848.286
   13    0      0    6 24:24:6:0         si 4900,0000 800,0000 800.000
   14    0      0    7 28:28:7:0         si 4900,0000 800,0000 800.000
   15    0      0    7 28:28:7:0         si 4900,0000 800,0000 800.000
   16    0      0    8 36:36:9:0         si 3800,0000 800,0000 800.000
   17    0      0    9 37:37:9:0         si 3800,0000 800,0000 800.000
   18    0      0   10 38:38:9:0         si 3800,0000 800,0000 800.000
   19    0      0   11 39:39:9:0         si 3800,0000 800,0000 800.001
  • Network Interface: Intel X540-T2 with vfio-pci driver.
  • PCI Address: 0000:05:00.0 (because it is bound to vfio-pci, it has no IP address)
  • OS: Ubuntu 22.04
  • Memory: 64 GB
    • HugePages: 4096 GB
  $  grep Huge /proc/meminfo 
AnonHugePages:         0 kB
ShmemHugePages:     8192 kB
FileHugePages:         0 kB
HugePages_Total:    4096
HugePages_Free:     4095
HugePages_Rsvd:        0
HugePages_Surp:        0
Hugepagesize:       2048 kB
Hugetlb:         8388608 kB
  • Files:
    • suricata.yaml: Attached
    • suricata.log: Attached

Suricata Configuration Highlights

Below are the key configurations from my suricata.yaml that pertain to this issue:

dpdk:
  eal-params:
    proc-type: primary
    allow: ["0000:05:00.0"]
  interfaces:
    - interface: 0000:05:00.0
      threads: 8
      promisc: true
      multicast: false
      checksum-checks: false
      checksum-checks-offload: false
      mtu: 1500
      mempool-size: 262144
      mempool-cache-size: 512
      rx-descriptors: 4096 
      tx-descriptors: 4096 
      copy-mode: none
      copy-iface: none
      rss-hash-functions: auto

threading:
  set-cpu-affinity: yes
  cpu-affinity:
    - management-cpu-set:
        cpu: [16,17]
    - receive-cpu-set:
        cpu: [18]
    - verdict-cpu-set:
        cpu: [19]
    - worker-cpu-set:
        cpu: [0,2,4,6,8,10,12,14]
        mode: exclusive
        prio:
          default: high
  detect-thread-ratio: 1.0
  stack-size: 8mb

Additional Notable Configurations:

  • Mempool Configuration:
    • mempool-size: 262144
    • mempool-cache-size: 512
  • RX and TX Descriptors:
    • Both set to 4,096 per queue
  • Threading:
    • 8 worker threads assigned to cores 0,2,4,6,8,10,12,14
    • Management threads on cores 16-17; receive thread on core 18; verdict thread on core 19 (a quick way to verify the pinning at runtime is shown after this list)
  • App-Layer Protocols:
    • Multiple protocols enabled (HTTP, TLS, SSH, etc.) with specific detection ports
  • Runmode: Workers
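
As a side sanity check that this affinity is actually applied at runtime, the psr column of ps shows which CPU each Suricata thread last ran on. A minimal sketch, assuming a procps-style ps and a running Suricata process:

# One row per Suricata thread: thread id, last CPU it ran on, thread name.
# With the config above, worker threads (named W#..) should stay on cores 0,2,4,...,14.
$ ps -T -o tid,psr,comm -p "$(pidof suricata)"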

Observed Performance Issues

When running Suricata in DPDK mode, the following metrics were observed from the logs:

  • Total Packets Received: ~17,559,078
  • Packets Dropped (rx_missed_errors): ~7,943,058
  • Packet Drop Percentage: Approximately 45.24%

Comparison with AF_PACKET Mode:

  • In AF_PACKET mode, the packet drop rate is significantly lower, and overall performance is more stable and efficient.
    • Total packets: 17,796,504
    • Drops: 3,787,138
    • Percentage: 21.28%
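
For reference, the drop percentages above are simply the drop counters divided by the total packet counters quoted for each mode; they can be reproduced with a quick one-liner:

$ awk 'BEGIN { printf "DPDK: %.2f%%  AF_PACKET: %.2f%%\n", 7943058/17559078*100, 3787138/17796504*100 }'
DPDK: 45.24%  AF_PACKET: 21.28%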

Requests for Assistance

Given the complexity of the issue and the critical nature of maintaining low packet drop rates for effective intrusion detection, I kindly request the community’s assistance with the following:

  1. Configuration Review:

    • Please review the attached suricata.yaml and suricata.log files to identify any misconfigurations or areas for optimization that I might have overlooked.
  2. Additional Optimization Tips:

    • Any other settings or optimizations that could help reduce the packet drop rate and enhance Suricata’s performance in DPDK mode.

Attached Files

For your reference and detailed analysis, I have attached the following files:

  1. suricata.yaml: Comprehensive configuration file outlining all current settings.
  2. suricata.log: Log output capturing the initialization, configuration, and performance metrics during a run in DPDK mode.

Best regards,
Álvaro

suricata.log (76.7 KB)
suricata.yaml (88.2 KB)

Some comments:

  1. You actually have 8 GB of hugepage memory allocated: 4096 hugepages × 2048 kB == 8 GB (see the snippet after this list for a quick way to check this).
  2. For your CPU settings it would be good to know what CPU you have and whether Hyperthreading is enabled. If HT is enabled, it is worth working out which logical CPUs are hyperthread siblings; you can do that by pairing the core id entries in /proc/cpuinfo (also shown in the snippet after this list). If you only use 8 cores, use one logical CPU per physical core. Hyperthreaded siblings may boost performance a little, but two independent physical cores will always be better than two hyperthreads of the same core.
  3. This CPU alignment applies to management CPUs as well (although to a lesser degree since the operation there is not so demanding)
  4. Receive/verdict CPU sets are not used in workers runmode
  5. I would set the mempool size to 262143 and the mempool cache to 511; there is some DPDK-internal math that suggests these numbers, although this very likely won’t cause the issue (see the config sketch after this list).
  6. In my experience with DPDK it is better to set the RX/TX descriptors to 32768. However, Intel cards (at least the X710) only support up to 4096 descriptors, so they do not seem to be a great fit for Suricata. I am not sure how the X540 behaves, but I assume it will be the same: try setting 32768 RX/TX descriptors, run Suricata in very verbose mode (-vvvv) and check whether it reports lowering the descriptor counts.
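
Both of the first two points are easy to check from the shell. A quick sketch using only /proc and /sys (sysfs lists the sibling pairs directly, which is a bit simpler than pairing core ids in /proc/cpuinfo; exact output formatting depends on the kernel):

# Total hugepage memory currently reserved (HugePages_Total x Hugepagesize)
$ awk '/HugePages_Total/ {t=$2} /Hugepagesize/ {s=$2} END {print t*s/1048576 " GB"}' /proc/meminfo
8 GB

# Hyperthread siblings: each unique line is one physical core and the logical CPUs on it
$ cat /sys/devices/system/cpu/cpu*/topology/thread_siblings_list | sort -un
0-1
2-3
...
16
17
18
19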
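
For points 5 and 6, the relevant part of the dpdk interface config would look roughly like this; the values are only the suggestion above, and whether the NIC actually accepts 32768 descriptors is exactly what the -vvvv run will tell you:

dpdk:
  interfaces:
    - interface: 0000:05:00.0
      threads: 8
      # 2^18 - 1: DPDK's mempool documentation recommends a power of two minus one,
      # and 511 divides 262143 evenly (511 * 513), which suits the per-lcore cache
      mempool-size: 262143
      mempool-cache-size: 511
      # start high; the driver will clamp these (and Suricata will warn about it)
      # if the hardware supports fewer descriptors per queue
      rx-descriptors: 32768
      tx-descriptors: 32768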

Hi Lukas. Thanks for answering

  1. I set 8 GB of HugePages because I saw this line in suricata.log:
[181614 - Suricata-Main] 2024-10-14 13:40:37 Perf: hugepages: Hugepages on NUMA node 0 can be set to 348 (only using 302/4095 2048kB hugepages)

So I reduced the size. I had also tried a 20 GB hugepage allocation before, but the performance of the runs did not improve.
Following your answer, I have set the hugepage allocation back to 20 GB, and the message in suricata.log is now:

Perf: hugepages: Hugepages on NUMA node 0 can be set to 347 (only using 301/20479 2048kB hugepages) [SystemHugepageEvaluateHugepages:util-hugepages.c:406]

2, 3. Yes, I have Hyperthreading enabled. The CPU characteristics are:

  • Model: 12th Gen Intel(R) Core(TM) i7-12700K
  • Physical Cores: 12
  • Total threads: 20
  • Hyperthreading: enabled
  • Hyperthreaded cores: (Core ID: Processor)
    • 0: Processors 0 & 1
    • 4: Processors 2 & 3
    • 8: Processors 4 & 5
    • 12: Processors 6 & 7
    • 16: Processors 8 & 9
    • 20: Processors 10 & 11
    • 24: Processors 12 & 13
    • 28: Processors 14 & 15
  • Independent cores (Non-Hyperthreaded): (Core ID: Processor)
    • 36: Processor 16
    • 37: Processor 17
    • 38: Processor 18
    • 39: Processor 19

Summary:

  • Total Physical Cores: 12
    • 8 Performance (P) Cores with Hyperthreading: Core IDs 0, 4, 8, 12, 16, 20, 24, 28
    • 4 Efficient (E) Cores without Hyperthreading: Core IDs 36, 37, 38, 39

To assign the cores in cpu-affinity, I thought that using one logical processor per physical core rather than both hyperthreads would be better. But now I don’t know what to put in cpu-affinity. Can you help me?

As a first approach, I thought it would be better to assign the P-cores to the worker CPU set and the E-cores to management, which gives this config:

threading:
  cpu-affinity:
  - management-cpu-set:
      cpu: [ 16-19 ]
  - worker-cpu-set:
      cpu: [ 0,2,4,6,8,10,12,14 ]
      mode: exclusive
      prio:
        default: high

I would be very grateful if you could tell me what is the best option.

  4. Okay. I have deleted both the receive and verdict CPU sets from the cpu-affinity section.
  5. Values set as you suggested, but this does not seem to be the root cause of the low performance.
  6. When I set the RX/TX descriptors to 32768, exactly what you described happens. In suricata.log I can see this:
[7722 - Suricata-Main] 2024-10-16 11:39:17 Warning: dpdk: 0000:05:00.0: device queue descriptors adjusted (RX: from 32768 to 4096, TX: from 32768 to 4096)

So the NIC is also clamping the RX/TX descriptors to 4096. Could a solution for better performance in DPDK mode be to switch to another NIC that allows larger RX/TX descriptor rings?
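
If it is useful, the hardware limit can also be confirmed outside Suricata: dpdk-testpmd prints the driver's descriptor limits in show port info. A rough sketch, assuming dpdk-testpmd is installed and Suricata is stopped (only one primary process can own the port); output abridged, and for the X540 (ixgbe driver) the values should match the 4096 from the warning above:

$ sudo dpdk-testpmd -a 0000:05:00.0 -- -i
testpmd> show port info 0
...
Max possible number of RXDs per queue: 4096
Max possible number of TXDs per queue: 4096
...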

Hope you can help me.

Thank you very much!
