All packets dropped in kernel on some threads

$ sudo strace -p 36588
strace: Process 36588 attached
getsockopt(34, SOL_PACKET, PACKET_STATISTICS, {packets=6423, drops=6423}, [8]) = 0
poll([{fd=34, events=POLLIN}], 1, 100)  = 0 (Timeout)
poll([{fd=34, events=POLLIN}], 1, 100)  = 0 (Timeout)
poll([{fd=34, events=POLLIN}], 1, 100)  = 0 (Timeout)
poll([{fd=34, events=POLLIN}], 1, 100)  = 0 (Timeout)
poll([{fd=34, events=POLLIN}], 1, 100)  = 0 (Timeout)
poll([{fd=34, events=POLLIN}], 1, 100)  = 0 (Timeout)
poll([{fd=34, events=POLLIN}], 1, 100)  = 0 (Timeout)
poll([{fd=34, events=POLLIN}], 1, 100)  = 0 (Timeout)
poll([{fd=34, events=POLLIN}], 1, 100)  = 0 (Timeout)
poll([{fd=34, events=POLLIN}], 1, 100)  = 0 (Timeout)
getsockopt(34, SOL_PACKET, PACKET_STATISTICS, {packets=6882, drops=6882}, [8]) = 0
poll([{fd=34, events=POLLIN}], 1, 100)  = 0 (Timeout)
poll([{fd=34, events=POLLIN}], 1, 100)  = 0 (Timeout)
poll([{fd=34, events=POLLIN}], 1, 100)  = 0 (Timeout)
poll([{fd=34, events=POLLIN}], 1, 100)  = 0 (Timeout)
poll([{fd=34, events=POLLIN}], 1, 100)  = 0 (Timeout)
poll([{fd=34, events=POLLIN}], 1, 100)  = 0 (Timeout)
poll([{fd=34, events=POLLIN}], 1, 100)  = 0 (Timeout)
poll([{fd=34, events=POLLIN}], 1, 100)  = 0 (Timeout)
poll([{fd=34, events=POLLIN}], 1, 100)  = 0 (Timeout)
poll([{fd=34, events=POLLIN}], 1, 100)  = 0 (Timeout)
getsockopt(34, SOL_PACKET, PACKET_STATISTICS, {packets=7164, drops=7164}, [8]) = 0
poll([{fd=34, events=POLLIN}], 1, 100)  = 0 (Timeout)
poll([{fd=34, events=POLLIN}], 1, 100)  = 0 (Timeout)
poll([{fd=34, events=POLLIN}], 1, 100)  = 0 (Timeout)
poll([{fd=34, events=POLLIN}], 1, 100)  = 0 (Timeout)
poll([{fd=34, events=POLLIN}], 1, 100)  = 0 (Timeout)
poll([{fd=34, events=POLLIN}], 1, 100)  = 0 (Timeout)
poll([{fd=34, events=POLLIN}], 1, 100)  = 0 (Timeout)
poll([{fd=34, events=POLLIN}], 1, 100)  = 0 (Timeout)
poll([{fd=34, events=POLLIN}], 1, 100)  = 0 (Timeout)
poll([{fd=34, events=POLLIN}], 1, 100)  = 0 (Timeout)
getsockopt(34, SOL_PACKET, PACKET_STATISTICS, {packets=7041, drops=7041}, [8]) = 0
^Cstrace: Process 36588 detached
$ top -H -b -p `pgrep Suricata`
top - 13:45:21 up 15 days, 20:40,  2 users,  load average: 7.00, 6.86, 7.54
Threads:  70 total,   5 running,  65 sleeping,   0 stopped,   0 zombie
%Cpu(s):  9.6 us,  1.4 sy,  0.0 ni, 88.4 id,  0.0 wa,  0.0 hi,  0.6 si,  0.0 st
KiB Mem : 26335340+total, 25208035+free,  5493104 used,  5779944 buff/cache
KiB Swap:  8388604 total,  8388604 free,        0 used. 25676942+avail Mem

  PID USER      PR  NI    VIRT    RES    SHR S %CPU %MEM     TIME+ COMMAND
36609 root      20   0 6678872   2.4g 424532 S 26.7  1.0   2:21.49 W#49-eth0
36567 root      20   0 6678872   2.4g 424532 S 20.0  1.0   2:17.97 W#07-eth0
36575 root      20   0 6678872   2.4g 424532 S 20.0  1.0   2:20.84 W#15-eth0
36577 root      20   0 6678872   2.4g 424532 S 20.0  1.0   2:19.99 W#17-eth0
36578 root      20   0 6678872   2.4g 424532 R 20.0  1.0   2:18.53 W#18-eth0
36582 root      20   0 6678872   2.4g 424532 S 20.0  1.0   2:21.74 W#22-eth0
36591 root      20   0 6678872   2.4g 424532 S 20.0  1.0   2:19.72 W#31-eth0
36594 root      20   0 6678872   2.4g 424532 S 20.0  1.0   2:16.99 W#34-eth0
36596 root      20   0 6678872   2.4g 424532 S 20.0  1.0   2:14.65 W#36-eth0
36599 root      20   0 6678872   2.4g 424532 S 20.0  1.0   2:20.11 W#39-eth0
36605 root      20   0 6678872   2.4g 424532 S 20.0  1.0   2:18.35 W#45-eth0
36607 root      20   0 6678872   2.4g 424532 S 20.0  1.0   2:14.11 W#47-eth0
36613 root      20   0 6678872   2.4g 424532 S 20.0  1.0   2:17.34 W#52-eth0
36614 root      20   0 6678872   2.4g 424532 S 20.0  1.0   2:15.37 W#53-eth0
36618 root      20   0 6678872   2.4g 424532 S 20.0  1.0   2:15.77 W#57-eth0
36623 root      20   0 6678872   2.4g 424532 S 20.0  1.0   2:17.22 W#62-eth0
36569 root      20   0 6678872   2.4g 424532 S 13.3  1.0   2:18.53 W#09-eth0
36576 root      20   0 6678872   2.4g 424532 S 13.3  1.0   2:20.52 W#16-eth0
36579 root      20   0 6678872   2.4g 424532 S 13.3  1.0   2:17.31 W#19-eth0
36580 root      20   0 6678872   2.4g 424532 S 13.3  1.0   2:15.98 W#20-eth0
36583 root      20   0 6678872   2.4g 424532 S 13.3  1.0   2:19.47 W#23-eth0
36584 root      20   0 6678872   2.4g 424532 R 13.3  1.0   2:15.21 W#24-eth0
36586 root      20   0 6678872   2.4g 424532 R 13.3  1.0   2:31.50 W#26-eth0
36587 root      20   0 6678872   2.4g 424532 S 13.3  1.0   2:19.38 W#27-eth0
36590 root      20   0 6678872   2.4g 424532 S 13.3  1.0   2:17.85 W#30-eth0
36593 root      20   0 6678872   2.4g 424532 S 13.3  1.0   2:17.93 W#33-eth0
36595 root      20   0 6678872   2.4g 424532 S 13.3  1.0   2:17.95 W#35-eth0
36597 root      20   0 6678872   2.4g 424532 S 13.3  1.0   2:20.89 W#37-eth0
36598 root      20   0 6678872   2.4g 424532 S 13.3  1.0   2:15.95 W#38-eth0
36601 root      20   0 6678872   2.4g 424532 S 13.3  1.0   2:17.85 W#41-eth0
36602 root      20   0 6678872   2.4g 424532 S 13.3  1.0   2:18.17 W#42-eth0
36603 root      20   0 6678872   2.4g 424532 S 13.3  1.0   2:17.79 W#43-eth0
36604 root      20   0 6678872   2.4g 424532 S 13.3  1.0   2:15.20 W#44-eth0
36608 root      20   0 6678872   2.4g 424532 S 13.3  1.0   2:22.09 W#48-eth0
36610 root      20   0 6678872   2.4g 424532 S 13.3  1.0   2:18.07 W#50-eth0
36612 root      20   0 6678872   2.4g 424532 S 13.3  1.0   2:14.91 W#51-eth0
36615 root      20   0 6678872   2.4g 424532 S 13.3  1.0   2:18.20 W#54-eth0
36620 root      20   0 6678872   2.4g 424532 S 13.3  1.0   2:17.90 W#59-eth0
36621 root      20   0 6678872   2.4g 424532 S 13.3  1.0   2:21.19 W#60-eth0
36622 root      20   0 6678872   2.4g 424532 S 13.3  1.0   2:16.85 W#61-eth0
36624 root      20   0 6678872   2.4g 424532 S 13.3  1.0   2:25.81 W#63-eth0
36625 root      20   0 6678872   2.4g 424532 R 13.3  1.0   2:20.32 W#64-eth0
36562 root      20   0 6678872   2.4g 424532 S  6.7  1.0   0:19.12 W#02-eth0
36570 root      20   0 6678872   2.4g 424532 S  6.7  1.0   0:19.39 W#10-eth0
36571 root      20   0 6678872   2.4g 424532 S  6.7  1.0   0:19.16 W#11-eth0
36574 root      20   0 6678872   2.4g 424532 S  6.7  1.0   2:19.38 W#14-eth0
36585 root      20   0 6678872   2.4g 424532 S  6.7  1.0   0:18.78 W#25-eth0
36588 root      20   0 6678872   2.4g 424532 R  6.7  1.0   0:18.80 W#28-eth0
36592 root      20   0 6678872   2.4g 424532 S  6.7  1.0   2:15.00 W#32-eth0
36600 root      20   0 6678872   2.4g 424532 S  6.7  1.0   2:15.17 W#40-eth0
36606 root      20   0 6678872   2.4g 424532 S  6.7  1.0   2:16.28 W#46-eth0
36616 root      20   0 6678872   2.4g 424532 S  6.7  1.0   2:16.47 W#55-eth0
36617 root      20   0 6678872   2.4g 424532 S  6.7  1.0   2:16.64 W#56-eth0
36619 root      20   0 6678872   2.4g 424532 S  6.7  1.0   2:19.40 W#58-eth0
36491 root      20   0 6678872   2.4g 424532 S  0.0  1.0   0:16.35 Suricata-Main
36561 root      20   0 6678872   2.4g 424532 S  0.0  1.0   0:19.46 W#01-eth0
36563 root      20   0 6678872   2.4g 424532 S  0.0  1.0   0:19.14 W#03-eth0
36564 root      20   0 6678872   2.4g 424532 S  0.0  1.0   0:19.16 W#04-eth0
36565 root      20   0 6678872   2.4g 424532 S  0.0  1.0   0:19.05 W#05-eth0
36566 root      20   0 6678872   2.4g 424532 S  0.0  1.0   0:19.13 W#06-eth0
36568 root      20   0 6678872   2.4g 424532 S  0.0  1.0   0:19.07 W#08-eth0
36572 root      20   0 6678872   2.4g 424532 S  0.0  1.0   0:18.93 W#12-eth0
36573 root      20   0 6678872   2.4g 424532 S  0.0  1.0   0:18.88 W#13-eth0
36581 root      20   0 6678872   2.4g 424532 S  0.0  1.0   0:19.09 W#21-eth0
36589 root      20   0 6678872   2.4g 424532 S  0.0  1.0   0:19.25 W#29-eth0
36626 root      20   0 6678872   2.4g 424532 S  0.0  1.0   0:35.27 FM#01
36627 root      20   0 6678872   2.4g 424532 S  0.0  1.0   0:05.83 FR#01
36628 root      20   0 6678872   2.4g 424532 S  0.0  1.0   0:00.00 CW
36629 root      20   0 6678872   2.4g 424532 S  0.0  1.0   0:00.07 CS
36630 root      20   0 6678872   2.4g 424532 S  0.0  1.0   0:00.03 US^C

I am also confused that every poll call between two PACKET_STATISTICS reads returns a timeout.
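For reference when reading the strace output above: according to packet(7), reading PACKET_STATISTICS resets the kernel counters, so each getsockopt line covers only the interval since the previous read, and the packet count includes the dropped packets. A minimal Python sketch (the helper name `drop_ratios` is made up for illustration) that extracts the per-interval drop ratio from saved strace lines:

```python
import re

# Matches the PACKET_STATISTICS lines that strace prints for getsockopt().
STATS_RE = re.compile(r"PACKET_STATISTICS, \{packets=(\d+), drops=(\d+)\}")


def drop_ratios(strace_lines):
    """Return one (packets, drops, drop_ratio) tuple per statistics read."""
    results = []
    for line in strace_lines:
        match = STATS_RE.search(line)
        if match is None:
            continue  # poll() lines and anything else are ignored
        packets, drops = int(match.group(1)), int(match.group(2))
        ratio = drops / packets if packets else 0.0
        results.append((packets, drops, ratio))
    return results


# Sample lines copied from the strace output above.
sample = [
    "getsockopt(34, SOL_PACKET, PACKET_STATISTICS, {packets=6423, drops=6423}, [8]) = 0",
    "poll([{fd=34, events=POLLIN}], 1, 100)  = 0 (Timeout)",
    "getsockopt(34, SOL_PACKET, PACKET_STATISTICS, {packets=6882, drops=6882}, [8]) = 0",
]
for packets, drops, ratio in drop_ratios(sample):
    print(f"packets={packets} drops={drops} dropped={ratio:.0%}")
```

For this thread every interval comes out at 100% dropped, which matches the title: nothing the kernel saw for this socket reached the worker.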

Hi,

Thanks for your post.

Could you provide some details about

  • Host operating system and version (e.g., Ubuntu 21.04, output of uname -a)
  • Suricata version?
  • Command line used to invoke Suricata
  • Packet source method (e.g., AF-PACKET, …)
  • Network card capturing packets for Suricata
  • System details:
    • Physical/virtual?
    • CPU/cores available
    • Memory
  • Suricata configuration (please remove any values that are proprietary/sensitive and/or confidential)

It’s difficult to assess a situation when little is known about how you’re using Suricata. top shows uneven CPU utilization across the W# (worker) threads, which usually (though not necessarily) means that packets are also distributed unevenly between them.
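One way to make the unevenness concrete is to compare the accumulated CPU time (the TIME+ column) of the workers rather than the instantaneous %CPU. The sketch below is only an illustration: the function names and the 0.5 threshold are assumptions, and it expects the 12-column `top -H -b` layout shown above with TIME+ in minutes:seconds.hundredths form.

```python
import statistics


def cpu_seconds(time_field):
    """Convert top's TIME+ field (minutes:seconds.hundredths) to seconds."""
    minutes, seconds = time_field.split(":")
    return int(minutes) * 60 + float(seconds)


def idle_workers(top_lines, fraction=0.5):
    """Return names of W# threads whose CPU time is below fraction * median."""
    times = {}
    for line in top_lines:
        fields = line.split()
        # Thread rows have 12 columns; the last one is the thread name.
        if len(fields) == 12 and fields[-1].startswith("W#"):
            times[fields[-1]] = cpu_seconds(fields[-2])
    if not times:
        return []
    median = statistics.median(times.values())
    return sorted(name for name, secs in times.items() if secs < fraction * median)


# A few rows copied from the top output above.
sample = [
    "36609 root      20   0 6678872   2.4g 424532 S 26.7  1.0   2:21.49 W#49-eth0",
    "36567 root      20   0 6678872   2.4g 424532 S 20.0  1.0   2:17.97 W#07-eth0",
    "36574 root      20   0 6678872   2.4g 424532 S  6.7  1.0   2:19.38 W#14-eth0",
    "36562 root      20   0 6678872   2.4g 424532 S  6.7  1.0   0:19.12 W#02-eth0",
    "36588 root      20   0 6678872   2.4g 424532 R  6.7  1.0   0:18.80 W#28-eth0",
]
print(idle_workers(sample))
```

In the listing above, the straced thread (PID 36588, W#28-eth0) is one of the workers with only about 19 seconds of CPU time, while most workers have accumulated more than two minutes.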

Thank you, some details:

  • CentOS 7, kernel 3.10.0-957.21.2.el7.x86_64
  • Suricata 5.0.2 RELEASE
  • Packet source method: af-packet
  • Physical machine with 64 cores and 256G memory
  • Traffic rate: about 1.5Gbps

After increasing max-pending-packets to 10000 and raising some memcaps, such as flow.memcap and stream.memcap, the situation is much better.

The traffic rate is light, but you have a large machine, so increasing the memcap values is OK.

You might try reducing the number of worker threads: an individual Suricata worker thread should be able to handle close to 1Gbps without dropping packets (this is traffic dependent, of course, as the packet rate is just as important as the bit rate).

You should be able to get by with 4-6 worker threads (instead of 60) for the traffic rate you’re seeing.
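As a rough back-of-the-envelope check of that range, here is a small Python sketch. The 1Gbps-per-worker figure comes from the paragraph above; the `headroom` multiplier and the function name are assumptions added for illustration, not a Suricata sizing rule.

```python
import math


def estimate_workers(traffic_gbps, per_worker_gbps=1.0, headroom=3.0):
    """Rough worker count: traffic / per-worker capacity, scaled by headroom."""
    return max(1, math.ceil(traffic_gbps / per_worker_gbps * headroom))


# 1.5 Gbps as reported for eth0, with an assumed 3x safety margin.
print(estimate_workers(1.5))
```

With these assumptions the estimate lands inside the suggested 4-6 range; real sizing still depends on packet rate, ruleset, and traffic mix.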

Is the system dedicated to Suricata?

What type of network card?

Could you post the Suricata command invocation line?

Yes, it is dedicated to Suricata.

We use eth0, eth1, and eth2 for three different mirrored traffic streams; the output above is for eth0 (ixgbe).

Command line:
suricata -D -c /opt/app/suricata/suricata.yaml --af-packet

Related configuration:

af-packet:
  - interface: eth0
    cluster-id: 99
    cluster-type: cluster_flow
    defrag: yes
    use-mmap: yes
    tpacket-v3: yes

2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP mode DEFAULT group default qlen 1000
    link/ether 74:5a:aa:c5:92:05 brd ff:ff:ff:ff:ff:ff
3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP mode DEFAULT group default qlen 1000
    link/ether 74:5a:aa:c5:92:06 brd ff:ff:ff:ff:ff:ff
4: eth2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP mode DEFAULT group default qlen 1000
    link/ether 04:3f:72:b9:a5:32 brd ff:ff:ff:ff:ff:ff
5: eth3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP mode DEFAULT group default qlen 1000
    link/ether 04:3f:72:b9:a5:33 brd ff:ff:ff:ff:ff:ff


3b:00.0 Ethernet controller: Mellanox Technologies MT27710 Family [ConnectX-4 Lx]
        Subsystem: Mellanox Technologies Stand-up ConnectX-4 Lx EN, 25GbE dual-port SFP28, PCIe3.0 x8, MCX4121A-ACAT
        Physical Slot: 3
        Flags: bus master, fast devsel, latency 0, IRQ 103, NUMA node 0
        Memory at ae000000 (64-bit, prefetchable) [size=32M]
        Expansion ROM at b1100000 [disabled] [size=1M]
        Capabilities: [60] Express Endpoint, MSI 00
        Capabilities: [48] Vital Product Data
        Capabilities: [9c] MSI-X: Enable+ Count=64 Masked-
        Capabilities: [c0] Vendor Specific Information: Len=18 <?>
        Capabilities: [40] Power Management version 3
        Capabilities: [100] Advanced Error Reporting
        Capabilities: [150] Alternative Routing-ID Interpretation (ARI)
        Capabilities: [180] Single Root I/O Virtualization (SR-IOV)
        Capabilities: [1c0] #19
        Capabilities: [230] Access Control Services
        Kernel driver in use: mlx5_core
        Kernel modules: mlx5_core

3b:00.1 Ethernet controller: Mellanox Technologies MT27710 Family [ConnectX-4 Lx]
        Subsystem: Mellanox Technologies Stand-up ConnectX-4 Lx EN, 25GbE dual-port SFP28, PCIe3.0 x8, MCX4121A-ACAT
        Physical Slot: 3
        Flags: bus master, fast devsel, latency 0, IRQ 243, NUMA node 0
        Memory at ac000000 (64-bit, prefetchable) [size=32M]
        Expansion ROM at b1000000 [disabled] [size=1M]
        Capabilities: [60] Express Endpoint, MSI 00
        Capabilities: [48] Vital Product Data
        Capabilities: [9c] MSI-X: Enable+ Count=64 Masked-
        Capabilities: [c0] Vendor Specific Information: Len=18 <?>
        Capabilities: [40] Power Management version 3
        Capabilities: [100] Advanced Error Reporting
        Capabilities: [150] Alternative Routing-ID Interpretation (ARI)
        Capabilities: [180] Single Root I/O Virtualization (SR-IOV)
        Capabilities: [230] Access Control Services
        Kernel driver in use: mlx5_core
        Kernel modules: mlx5_core


d8:00.0 Ethernet controller: Intel Corporation 82599 10 Gigabit Dual Port Network Connection (rev 01)
        Physical Slot: 7
        Flags: bus master, fast devsel, latency 0, IRQ 113, NUMA node 1
        Memory at eec00000 (64-bit, prefetchable) [size=4M]
        I/O ports at d020 [size=32]
        Memory at ef204000 (64-bit, prefetchable) [size=16K]
        Expansion ROM at efc00000 [disabled] [size=4M]
        Capabilities: [40] Power Management version 3
        Capabilities: [50] MSI: Enable- Count=1/1 Maskable+ 64bit+
        Capabilities: [70] MSI-X: Enable+ Count=64 Masked-
        Capabilities: [a0] Express Endpoint, MSI 00
        Capabilities: [e0] Vital Product Data
        Capabilities: [100] Advanced Error Reporting
        Capabilities: [140] Device Serial Number 74-5a-aa-ff-ff-c5-92-05
        Capabilities: [150] Alternative Routing-ID Interpretation (ARI)
        Capabilities: [160] Single Root I/O Virtualization (SR-IOV)
        Kernel driver in use: ixgbe
        Kernel modules: ixgbe

d8:00.1 Ethernet controller: Intel Corporation 82599 10 Gigabit Dual Port Network Connection (rev 01)
        Physical Slot: 7
        Flags: bus master, fast devsel, latency 0, IRQ 178, NUMA node 1
        Memory at ee800000 (64-bit, prefetchable) [size=4M]
        I/O ports at d000 [size=32]
        Memory at ef000000 (64-bit, prefetchable) [size=16K]
        Expansion ROM at ef800000 [disabled] [size=4M]
        Capabilities: [40] Power Management version 3
        Capabilities: [50] MSI: Enable- Count=1/1 Maskable+ 64bit+
        Capabilities: [70] MSI-X: Enable+ Count=64 Masked-
        Capabilities: [a0] Express Endpoint, MSI 00
        Capabilities: [e0] Vital Product Data
        Capabilities: [100] Advanced Error Reporting
        Capabilities: [140] Device Serial Number 74-5a-aa-ff-ff-c5-92-05
        Capabilities: [150] Alternative Routing-ID Interpretation (ARI)
        Capabilities: [160] Single Root I/O Virtualization (SR-IOV)
        Kernel driver in use: ixgbe
        Kernel modules: ixgbe



In another test, we used eth1, which carries heavier traffic of about 5Gbps.

tcp.reassembly.memuse reached the 150G memcap and led to an OOM. Is this memory usage normal?

Kernel drops were about 1-2%.

The memuse issue is a known bug (Bug #4502: TCP reassembly memuse approaching memcap value results in TCP detection being stopped), but I’m kinda confused that you see it with 5.0.2.
You can also see that TCP detection drops off when the memcap is hit.

A 1-2% drop rate is not that unusual, depending on the setup. Does the eth1 traffic result in specific threads doing most of the work?
Can you check whether this traffic contains some large elephant flows? (Try iftop on that interface, for example.)
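If flow logging is enabled in eve.json, the finished flows can also be ranked by size offline. The following is a minimal Python sketch under that assumption; the function name `top_flows` and the sample events are invented for illustration, and because flow records are written when a flow ends, elephant flows that are still active will not show up here (iftop covers those).

```python
import json
from collections import Counter


def top_flows(eve_lines, limit=5):
    """Return the `limit` largest flows by total bytes from EVE flow events."""
    totals = Counter()
    for line in eve_lines:
        try:
            event = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip truncated or non-JSON lines
        if not isinstance(event, dict) or event.get("event_type") != "flow":
            continue
        flow = event.get("flow", {})
        key = (event.get("proto"), event.get("src_ip"), event.get("src_port"),
               event.get("dest_ip"), event.get("dest_port"))
        totals[key] += flow.get("bytes_toserver", 0) + flow.get("bytes_toclient", 0)
    return totals.most_common(limit)


# Invented sample events; real input would be the lines of eve.json.
sample = [
    json.dumps({"event_type": "flow", "proto": "TCP",
                "src_ip": "10.0.0.1", "src_port": 1234,
                "dest_ip": "10.0.0.2", "dest_port": 443,
                "flow": {"bytes_toserver": 1000, "bytes_toclient": 2000}}),
    json.dumps({"event_type": "flow", "proto": "UDP",
                "src_ip": "10.0.0.3", "src_port": 5353,
                "dest_ip": "10.0.0.4", "dest_port": 53,
                "flow": {"bytes_toserver": 100, "bytes_toclient": 50}}),
    json.dumps({"event_type": "alert", "src_ip": "10.0.0.5"}),
]
for key, total_bytes in top_flows(sample, limit=2):
    print(key, total_bytes)
```

A handful of flows accounting for most of the bytes would suggest that a few worker threads are carrying most of the load under cluster_flow.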

Thank you.

As a general question, is this machine capable of handling 18Gbps across all three interfaces during peak traffic? :sweat_smile:

This depends on several factors and on tuning, but yes, such a system could handle that traffic rate if the traffic is clean and you either have no elephant flows or shunt/bypass them.

Seems the bug exists in 5.0.2. I have encountered this several times.