DPDK performance

Hi, I have a question about the packet drop rate with DPDK.
My test environment is:

  1. dperf is used to send packets
  2. latest master branch of Suricata
  3. Hyperscan is enabled

The result is:

[128315] 1/12/2022 -- 14:41:07 - (util-device.c:356) <Notice> (LiveDeviceListClean) -- Stats for '0000:3b:00.1':  pkts: 90749789, drop: 11726685 (12.92%), invalid chksum: 0

The dperf settings are:

mode                        client
protocol                    udp

cpu                         0 1
payload_size                500
duration                    60s
cc                          4000
keepalive                   2ms

#port                       pci             addr         gateway
port                        0000:81:00.1    6.6.241.27   6.6.241.1 00:22:11:22:22:11

#                           addr_start      num
client                      6.6.241.100     100

#                           addr_start      num
server                      6.6.241.29      2

#                           port_start      num
listen                      80              1

My suricata.yaml is attached below.

My question is: how can I reduce the drop rate? Currently I have only tested UDP packets.

CPU information:

ids$ lscpu
Architecture:            x86_64
  CPU op-mode(s):        32-bit, 64-bit
  Address sizes:         46 bits physical, 48 bits virtual
  Byte Order:            Little Endian
CPU(s):                  32
  On-line CPU(s) list:   0-31
Vendor ID:               GenuineIntel
  Model name:            Intel(R) Xeon(R) Silver 4110 CPU @ 2.10GHz
    CPU family:          6
    Model:               85
    Thread(s) per core:  2
    Core(s) per socket:  8
    Socket(s):           2
    Stepping:            4
    CPU max MHz:         3000.0000
    CPU min MHz:         800.0000
    BogoMIPS:            4200.00
    Flags:               fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc art arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid dca sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm 3dnowprefetch cpuid_fault epb cat_l3 cdp_l3 invpcid_single pti intel_ppin ssbd mba ibrs ibpb stibp tpr_shadow vnmi flexpriority ept vpid ept_ad fsgsbase tsc_adjust bmi1 hle avx2 smep bmi2 erms invpcid rtm cqm mpx rdt_a avx512f avx512dq rdseed adx smap clflushopt clwb intel_pt avx512cd avx512bw avx512vl xsaveopt xsavec xgetbv1 xsaves cqm_llc cqm_occup_llc cqm_mbm_total cqm_mbm_local dtherm ida arat pln pts pku ospke md_clear flush_l1d arch_capabilities
Virtualization features: 
  Virtualization:        VT-x
Caches (sum of all):     
  L1d:                   512 KiB (16 instances)
  L1i:                   512 KiB (16 instances)
  L2:                    16 MiB (16 instances)
  L3:                    22 MiB (2 instances)
NUMA:                    
  NUMA node(s):          2
  NUMA node0 CPU(s):     0-7,16-23
  NUMA node1 CPU(s):     8-15,24-31
Vulnerabilities:         
  Itlb multihit:         KVM: Mitigation: VMX disabled
  L1tf:                  Mitigation; PTE Inversion; VMX conditional cache flushes, SMT vulnerable
  Mds:                   Mitigation; Clear CPU buffers; SMT vulnerable
  Meltdown:              Mitigation; PTI
  Mmio stale data:       Mitigation; Clear CPU buffers; SMT vulnerable
  Spec store bypass:     Mitigation; Speculative Store Bypass disabled via prctl and seccomp
  Spectre v1:            Mitigation; usercopy/swapgs barriers and __user pointer sanitization
  Spectre v2:            Mitigation; Retpolines, IBPB conditional, IBRS_FW, STIBP conditional, RSB filling
  Srbds:                 Not affected
  Tsx async abort:       Mitigation; Clear CPU buffers; SMT vulnerable

sudo perf top -p $(pidof suricata)

Thanks in advance for any suggestions on how to find the bottleneck.

suricata_dpdk.yaml (80.6 KB)

I tried to increase the mempool size, but it did not seem to help:

mempool-size: 262143
mempool-cache-size: 511
rx-descriptors: 10240
tx-descriptors: 10240

Hi @Eason_Pan,

Were you able to test Suricata with e.g. AF_PACKET and verify that you don't lose any packets with that capture method? If not, would you please be able to test that?
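
For reference, a minimal sketch of such a baseline run, assuming the interface would show up as something like ens1f1 once it is bound back to the kernel ixgbe driver (the interface name and thread count are placeholders, not taken from your setup):

af-packet:
  - interface: ens1f1        # placeholder interface name
    threads: 8               # placeholder, roughly match your DPDK worker count
    cluster-id: 99
    cluster-type: cluster_flow
    defrag: yes

You could then start it with e.g. suricata -c suricata.yaml --af-packet=ens1f1 and watch the capture.kernel_drops counter in stats.log for the same dperf load.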

Your suricata.yaml looks relatively good, although I would try a more standard number of RX/TX descriptors (a power of 2, e.g. 4096/8192). The mempool size and mempool cache size should be fine. Also, I would place all Suricata cores on one NUMA node, i.e. put the management core on the same NUMA node as the workers.
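
As a rough sketch of what that could look like in suricata.yaml, assuming the NIC sits on NUMA node 0 (you can check with cat /sys/bus/pci/devices/0000:3b:00.1/numa_node; your lscpu shows node0 CPUs as 0-7,16-23). The core and thread numbers below are illustrative, not taken from your config:

dpdk:
  eal-params:
    proc-type: primary
  interfaces:
    - interface: 0000:3b:00.1
      threads: 7                 # match the number of worker cores below
      mempool-size: 262143
      mempool-cache-size: 511
      rx-descriptors: 4096       # power of 2
      tx-descriptors: 8192       # power of 2

threading:
  set-cpu-affinity: yes
  cpu-affinity:
    - management-cpu-set:
        cpu: [ 0 ]               # same NUMA node as the workers
    - worker-cpu-set:
        cpu: [ 1,2,3,4,5,6,7 ]   # NUMA node0 cores only
        mode: "exclusive"
        prio:
          default: "high"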

Can you please tell me how much bandwidth dperf generates? I haven't used dperf and cannot easily derive that from the config. From the perf top output it does not seem like much, but it would certainly help me make follow-up decisions.

What NIC do you use?

Thanks,
Lukas

Hi @lukashino,
the card is:


Network devices using DPDK-compatible driver
============================================
0000:81:00.1 '82599ES 10-Gigabit SFI/SFP+ Network Connection 10fb' drv=igb_uio unused=ixgbe,vfio-pci,uio_pci_generic

I found that the drop rate depends on the CPS (connections per second): if the CPS is large, the drop rate is large; if the CPS is small, the drop rate can be reduced to 0.

So what is the suggested TRex or dperf script to test whether the system can reach 10G capacity?

Hi,

Sorry for the late reply.
That's a good point; connections per second do have a great impact on Suricata's performance.

About the suggested TRex script: I am sorry, but I am not aware of any script that would rate a Suricata setup to e.g. 10G. Traffic on different networks can vary vastly, which is why manual tuning for the individual network is usually recommended.
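
Not a ready-made script, but as a sketch of that manual approach: keep the dperf server side fixed and step the client's connection rate up run by run until Suricata starts reporting drops. This assumes dperf's cps keyword sets the connection setup rate and accepts k/m suffixes as in its example configs; the numbers below are purely illustrative.

# in the dperf client config, start low and roughly double per run
# until the Suricata drop counter becomes non-zero:
cps                         50k
cc                          1000
keepalive                   2ms

# after each 60s run, check the Suricata stats line, e.g.:
# Stats for '0000:3b:00.1':  pkts: ..., drop: 0 (0.00%)

The highest rate that still gives 0% drop is a reasonable baseline for your setup before further tuning.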