Capture packet missed (yet another capture.kernel_drops problem)

Tons of capture.kernel_drops

Hi, I am new to Suricata. I've been reading and trying a lot of different configurations to solve the kernel_drops problem.

Suricata 7.0.9
installed with apt

With tpacket-v3 commented out (or with tpacket-v2 enabled) Suricata only sees a few packets:

capture.kernel_packets | Total | 2840

and no capture.kernel_drops, but I have far more than 2840 packets every 30 seconds.

with tpacket-v3: yes

capture.kernel_packets                        | Total                     | 2501158064
capture.kernel_drops                          | Total                     | 2422793577

I have 2 interfaces receiving SPAN traffic from 2 Cisco Nexus 9000 switches.
I have around 5 Gbit/s on each interface (100 Gbit/s interfaces).

/0/12e/3.1/0          sink          network        MT27800 Family [ConnectX-5]
/0/12e/3.1/0.1        sink2         network        MT27800 Family [ConnectX-5]

For debugging I'll try to capture just one interface.

Running on Ubuntu 22.04.5 LTS. I tried to install the latest MLNX OFED drivers: they compiled and installed fine on the server (including a firmware update), but afterwards the interfaces were no longer recognized, just gone. I had to uninstall the MLNX drivers via the BMC (a separate hardware management connection) to get them back. So I'm not sure what else can be done in this area.
Are those drivers needed to solve this issue?

Drivers loaded now:

:~# lsmod | grep mlx
mlx5_ib               393216  0
ib_uverbs             163840  1 mlx5_ib
ib_core               393216  2 ib_uverbs,mlx5_ib
mlx5_core            1597440  1 mlx5_ib
mlxfw                  32768  1 mlx5_core
psample                20480  1 mlx5_core
tls                   114688  2 bonding,mlx5_core
pci_hyperv_intf        16384  1 mlx5_core

The NIC is on NUMA node2.

NUMA:
  NUMA node(s):           4
  NUMA node0 CPU(s):      0-3,16-19
  NUMA node1 CPU(s):      4-7,20-23
  NUMA node2 CPU(s):      8-11,24-27
  NUMA node3 CPU(s):      12-15,28-31

so that is cores 8-11; 24-27 are their SMT siblings (2 threads per core).
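
A quick way to double-check that split (minimal sketch; it assumes each node's list is exactly two ranges with the physical cores first, as in this lscpu output):

```shell
# Split a NUMA node's CPU list (as printed by lscpu, physical cores first,
# SMT siblings second) into its two halves.
split_smt() {
    echo "physical=${1%%,*} smt=${1#*,}"
}
split_smt "8-11,24-27"   # node2 from the lscpu output above
# prints: physical=8-11 smt=24-27
```

So the capture/worker threads for this NIC belong on 8-11 and 24-27; 20-23 are on node1.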

  Model name:             AMD EPYC 7351P 16-Core Processor
    CPU family:           23
    Model:                1
    Thread(s) per core:   2

In total I have 32 logical CPUs (16 cores with SMT) and 125 GiB of RAM:

              total        used        free
Mem:           125Gi       8.8Gi       116Gi

sysctl params I've touched:

net.ipv4.conf.all.rp_filter = 1
net.ipv4.conf.all.arp_filter = 1
net.ipv4.conf.all.arp_ignore = 1
net.ipv4.conf.all.arp_announce = 2
net.ipv4.conf.default.arp_ignore = 1
net.ipv4.tcp_sack = 1
net.ipv4.tcp_mtu_probing = 1
net.ipv4.tcp_timestamps = 0
net.ipv4.tcp_rmem = 4096	87380	536870912
net.ipv4.tcp_low_latency = 1
net.ipv4.tcp_wmem = 4096	65536	536870912
net.ipv4.tcp_window_scaling = 1
net.ipv4.neigh.default.gc_thresh1 = 4096
net.ipv4.neigh.default.gc_thresh2 = 8192
net.ipv4.neigh.default.gc_thresh3 = 8192
net.ipv4.neigh.default.base_reachable_time = 30
net.ipv4.neigh.default.gc_stale_time = 86400
net.core.netdev_budget=600
net.core.netdev_budget_usecs=10000
net.core.optmem_max = 536870912
net.core.wmem_default = 1048576
net.core.wmem_max = 536870912
net.core.netdev_max_backlog = 250000
net.core.rmem_default = 1048576
net.core.rmem_max = 536870912
vm.max_map_count = 262144

ethtool commands I've run:

#!/bin/bash

IFACES=("sink" "sink2")
QUEUE_COUNT=4
RX_USECS=200
RX_FRAMES=64

for IFACE in "${IFACES[@]}"; do
    echo "Configuring $IFACE..."
    ip link set $IFACE down
    ethtool -L $IFACE combined $QUEUE_COUNT
    ethtool -K $IFACE rxhash on ntuple on rx-checksum off tx-checksum-ip-generic off
    ethtool -X $IFACE equal $QUEUE_COUNT
    ethtool -A $IFACE rx off tx off
    ethtool -C $IFACE adaptive-rx off adaptive-tx off rx-usecs $RX_USECS rx-frames $RX_FRAMES
    ip link set $IFACE up
    IRQS=$(ls /sys/class/net/$IFACE/device/msi_irqs/ 2>/dev/null || echo "")
    if [ -n "$IRQS" ]; then
        for IRQ in $IRQS; do
            echo 8-11 > /proc/irq/$IRQ/smp_affinity_list
        done
    fi
    echo "Settings for $IFACE:"
    ethtool -k $IFACE | grep -E 'rx-checksum|tx-checksum|rxhash|ntuple'
    ethtool -l $IFACE
    ethtool -c $IFACE
done
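
One refinement worth trying on top of the script above: writing `8-11` to every IRQ leaves all queue interrupts free to land on the same CPU, while pinning one queue IRQ per core spreads them deterministically. A sketch (the IRQ numbers are placeholders; the real ones come from `/sys/class/net/$IFACE/device/msi_irqs/`):

```shell
# Pair each RX-queue IRQ with its own core on the NIC's NUMA node instead
# of giving every IRQ the whole 8-11 mask.
map_irqs() {
    set -- 8 9 10 11                  # physical cores on NUMA node2
    for irq in 101 102 103 104; do    # hypothetical IRQ numbers
        echo "IRQ $irq -> CPU $1"
        # echo "$1" > /proc/irq/$irq/smp_affinity_list   # needs root
        shift
    done
}
map_irqs
```

With QUEUE_COUNT=4 this gives each RSS queue a dedicated core on the NIC's node.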

Dropped at kernel level? Only 0.000X% dropped and 0.00X% errors, so 'missed' is the problem.
If I understand it right, the missed packets are the ones Suricata cannot pick up.

:~# ip -s link show sink
7: sink: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9000 qdisc mq state UP mode DEFAULT group default qlen 1000
    link/ether 98:03:9b:a4:0c:dc brd ff:ff:ff:ff:ff:ff
    RX:      bytes     packets errors dropped    missed    mcast
    44258232518493 20795059636 322591    5898 454795599 18859935
    TX:      bytes     packets errors dropped   carrier  collsns
              1456          20      0       0         0        0
    altname enp67s0f0np0
:~# ip -s link show sink2
8: sink2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9000 qdisc mq state UP mode DEFAULT group default qlen 1000
    link/ether 98:03:9b:a4:0c:dd brd ff:ff:ff:ff:ff:ff
    RX:      bytes     packets errors dropped    missed    mcast
    29158407291056 17653020946 111337   31018 148343216 19404696
    TX:      bytes     packets errors dropped   carrier  collsns
               880          12      0       0         0        0
    altname enp67s0f1np1
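
To put those counters in perspective, a quick awk on the sink numbers above (assuming 'missed' frames are not already included in the packets counter):

```shell
# Share of traffic the NIC could not deliver to a ring buffer ('missed'),
# using the 'ip -s link show sink' counters quoted above.
awk -v pkts=20795059636 -v missed=454795599 \
    'BEGIN { printf "NIC missed: %.2f%% of traffic\n", 100 * missed / (pkts + missed) }'
# prints: NIC missed: 2.14% of traffic
```

So about 2% is already lost at the NIC itself; errors and dropped are orders of magnitude smaller.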

lscpu:

:~# lscpu
Architecture:             x86_64
  CPU op-mode(s):         32-bit, 64-bit
  Address sizes:          43 bits physical, 48 bits virtual
  Byte Order:             Little Endian
CPU(s):                   32
  On-line CPU(s) list:    0-31
Vendor ID:                AuthenticAMD
  Model name:             AMD EPYC 7351P 16-Core Processor
    CPU family:           23
    Model:                1
    Thread(s) per core:   2
    Core(s) per socket:   16
    Socket(s):            1
    Stepping:             2
    Frequency boost:      enabled
    CPU max MHz:          2400.0000
    CPU min MHz:          1200.0000
    BogoMIPS:             4799.72
    Flags:                fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflus
                          h mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_t
                          sc rep_good nopl nonstop_tsc cpuid extd_apicid amd_dcm aperfmperf rapl pni pcl
                          mulqdq monitor ssse3 fma cx16 sse4_1 sse4_2 movbe popcnt aes xsave avx f16c rd
                          rand lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowpref
                          etch osvw skinit wdt tce topoext perfctr_core perfctr_nb bpext perfctr_llc mwa
                          itx cpb hw_pstate ssbd ibpb vmmcall fsgsbase bmi1 avx2 smep bmi2 rdseed adx sm
                          ap clflushopt sha_ni xsaveopt xsavec xgetbv1 xsaves clzero irperf xsaveerptr a
                          rat npt lbrv svm_lock nrip_save tsc_scale vmcb_clean flushbyasid decodeassists
                           pausefilter pfthreshold avic v_vmsave_vmload vgif overflow_recov succor smca
                          sme sev sev_es
Virtualization features:
  Virtualization:         AMD-V
Caches (sum of all):
  L1d:                    512 KiB (16 instances)
  L1i:                    1 MiB (16 instances)
  L2:                     8 MiB (16 instances)
  L3:                     64 MiB (8 instances)
NUMA:
  NUMA node(s):           4
  NUMA node0 CPU(s):      0-3,16-19
  NUMA node1 CPU(s):      4-7,20-23
  NUMA node2 CPU(s):      8-11,24-27
  NUMA node3 CPU(s):      12-15,28-31
Vulnerabilities:
  Gather data sampling:   Not affected
  Itlb multihit:          Not affected
  L1tf:                   Not affected
  Mds:                    Not affected
  Meltdown:               Not affected
  Mmio stale data:        Not affected
  Reg file data sampling: Not affected
  Retbleed:               Mitigation; untrained return thunk; SMT vulnerable
  Spec rstack overflow:   Mitigation; safe RET
  Spec store bypass:      Mitigation; Speculative Store Bypass disabled via prctl and seccomp
  Spectre v1:             Mitigation; usercopy/swapgs barriers and __user pointer sanitization
  Spectre v2:             Mitigation; Retpolines; IBPB conditional; STIBP disabled; RSB filling; PBRSB-e
                          IBRS Not affected; BHI Not affected
  Srbds:                  Not affected
  Tsx async abort:        Not affected

Suricata Stats:

:~# tail -n 110 /var/log/suricata/stats.log
------------------------------------------------------------------------------------
Counter                                       | TM Name                   | Value
------------------------------------------------------------------------------------
capture.kernel_packets                        | Total                     | 98128284
capture.kernel_drops                          | Total                     | 72877747
capture.afpacket.polls                        | Total                     | 26
capture.afpacket.poll_data                    | Total                     | 26
decoder.pkts                                  | Total                     | 16386891
decoder.bytes                                 | Total                     | 42078356944
decoder.ipv4                                  | Total                     | 16385451
decoder.ipv6                                  | Total                     | 2599
decoder.ethernet                              | Total                     | 16389482
decoder.arp                                   | Total                     | 1416
decoder.unknown_ethertype                     | Total                     | 16
decoder.tcp                                   | Total                     | 16359619
tcp.syn                                       | Total                     | 1408
tcp.synack                                    | Total                     | 1028
tcp.rst                                       | Total                     | 657
decoder.udp                                   | Total                     | 27686
decoder.icmpv4                                | Total                     | 163
decoder.icmpv6                                | Total                     | 2
decoder.geneve                                | Total                     | 2591
decoder.vlan                                  | Total                     | 16386883
decoder.avg_pkt_size                          | Total                     | 2567
decoder.max_pkt_size                          | Total                     | 9014
tcp.active_sessions                           | Total                     | 3284
flow.total                                    | Total                     | 20264
flow.active                                   | Total                     | 2403
flow.tcp                                      | Total                     | 4568
flow.udp                                      | Total                     | 15575
flow.icmpv4                                   | Total                     | 119
flow.icmpv6                                   | Total                     | 2
flow.tcp_reuse                                | Total                     | 3
flow.wrk.spare_sync_avg                       | Total                     | 99
flow.wrk.spare_sync                           | Total                     | 196
flow.wrk.spare_sync_incomplete                | Total                     | 14
decoder.event.ipv4.opt_pad_required           | Total                     | 140
flow.wrk.flows_evicted_needs_work             | Total                     | 1155
flow.wrk.flows_evicted_pkt_inject             | Total                     | 1348
flow.wrk.flows_evicted                        | Total                     | 1155
flow.wrk.flows_injected                       | Total                     | 1154
flow.wrk.flows_injected_max                   | Total                     | 4
tcp.sessions                                  | Total                     | 4497
tcp.ssn_from_cache                            | Total                     | 1105
tcp.ssn_from_pool                             | Total                     | 3392
tcp.pseudo                                    | Total                     | 2
tcp.midstream_pickups                         | Total                     | 3280
tcp.ack_unseen_data                           | Total                     | 5335742
tcp.segment_from_cache                        | Total                     | 31508
tcp.segment_from_pool                         | Total                     | 561215
tcp.stream_depth_reached                      | Total                     | 172
tcp.reassembly_gap                            | Total                     | 163414
tcp.overlap                                   | Total                     | 17131
detect.alert                                  | Total                     | 2862354
detect.alerts_suppressed                      | Total                     | 10228
app_layer.flow.http                           | Total                     | 171
app_layer.tx.http                             | Total                     | 238
app_layer.error.http.parser                   | Total                     | 2
app_layer.flow.tls                            | Total                     | 185
app_layer.error.tls.gap                       | Total                     | 21
app_layer.error.tls.parser                    | Total                     | 6
app_layer.flow.ssh                            | Total                     | 2
app_layer.flow.nfs_tcp                        | Total                     | 141
app_layer.tx.nfs_tcp                          | Total                     | 49675
app_layer.error.nfs_tcp.parser                | Total                     | 3
app_layer.flow.ntp                            | Total                     | 35
app_layer.tx.ntp                              | Total                     | 35
app_layer.tx.pgsql                            | Total                     | 1
app_layer.flow.failed_tcp                     | Total                     | 675
app_layer.flow.dns_udp                        | Total                     | 3638
app_layer.tx.dns_udp                          | Total                     | 5699
app_layer.flow.failed_udp                     | Total                     | 11902
flow.end.state.new                            | Total                     | 17485
flow.end.state.established                    | Total                     | 1
flow.end.state.closed                         | Total                     | 375
flow.end.tcp_state.syn_sent                   | Total                     | 693
flow.end.tcp_state.syn_recv                   | Total                     | 144
flow.end.tcp_state.time_wait                  | Total                     | 1
flow.end.tcp_state.last_ack                   | Total                     | 11
flow.end.tcp_state.close_wait                 | Total                     | 1
flow.end.tcp_state.closed                     | Total                     | 363
flow.end.tcp_liberal                          | Total                     | 11
flow.mgr.full_hash_pass                       | Total                     | 62
flow.mgr.rows_per_sec                         | Total                     | 104856
flow.spare                                    | Total                     | 57225
flow.mgr.rows_maxlen                          | Total                     | 2
flow.mgr.flows_checked                        | Total                     | 32688
flow.mgr.flows_notimeout                      | Total                     | 14315
flow.mgr.flows_timeout                        | Total                     | 18373
flow.mgr.flows_evicted                        | Total                     | 18375
flow.mgr.flows_evicted_needs_work             | Total                     | 1669
flow.recycler.recycled                        | Total                     | 16706
flow.recycler.queue_avg                       | Total                     | 1
flow.recycler.queue_max                       | Total                     | 17
tcp.memuse                                    | Total                     | 4980840
tcp.reassembly_memuse                         | Total                     | 156397025
http.memuse                                   | Total                     | 248219
flow.memuse                                   | Total                     | 77853664
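
The headline ratio from that stats block (numbers copied from the table above):

```shell
# Fraction of packets AF_PACKET saw that Suricata could not keep up with.
awk -v pkts=98128284 -v drops=72877747 \
    'BEGIN { printf "kernel_drops: %.2f%% of kernel_packets\n", 100 * drops / pkts }'
# prints: kernel_drops: 74.27% of kernel_packets
```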

Suricata details: startup logs:

[2424195 - Suricata-Main] 2025-04-17 14:03:14 Notice: suricata: This is Suricata version 7.0.9 RELEASE running in SYSTEM mode
[2424195 - Suricata-Main] 2025-04-17 14:03:14 Info: cpu: CPUs/cores online: 32
[2424195 - Suricata-Main] 2025-04-17 14:03:14 Config: device: Adding interface sink from config file
[2424195 - Suricata-Main] 2025-04-17 14:03:14 Config: luajit: luajit states preallocated: 128
[2424195 - Suricata-Main] 2025-04-17 14:03:14 Info: suricata: Setting engine mode to IDS mode by default
[2424195 - Suricata-Main] 2025-04-17 14:03:14 Config: exception-policy: exception-policy: ignore (defined via 'built-in default' for IDS-mode)
[2424195 - Suricata-Main] 2025-04-17 14:03:14 Config: exception-policy: app-layer.error-policy: ignore (defined via 'built-in default' for IDS-mode)
[2424195 - Suricata-Main] 2025-04-17 14:03:14 Config: app-layer-htp: 'default' server has 'request-body-minimal-inspect-size' set to 33297 and 'request-body-inspect-window' set to 3935 after randomization.
[2424195 - Suricata-Main] 2025-04-17 14:03:14 Config: app-layer-htp: 'default' server has 'response-body-minimal-inspect-size' set to 31812 and 'response-body-inspect-window' set to 3941 after randomization.
[2424195 - Suricata-Main] 2025-04-17 14:03:14 Config: app-layer-ssl: no TLS config found, enabling TLS detection on port 443.
[2424195 - Suricata-Main] 2025-04-17 14:03:14 Config: smb: read: max record size: 16777216, max queued chunks 64, max queued size 67108864
[2424195 - Suricata-Main] 2025-04-17 14:03:14 Config: smb: write: max record size: 16777216, max queued chunks 64, max queued size 67108864
[2424195 - Suricata-Main] 2025-04-17 14:03:14 Config: app-layer-enip: Protocol detection and parser disabled for enip protocol.
[2424195 - Suricata-Main] 2025-04-17 14:03:14 Config: app-layer-dnp3: Protocol detection and parser disabled for DNP3.
[2424197 - Suricata-Main] 2025-04-17 14:03:14 Config: host: allocated 262144 bytes of memory for the host hash... 4096 buckets of size 64
[2424197 - Suricata-Main] 2025-04-17 14:03:14 Config: host: preallocated 1000 hosts of size 136
[2424197 - Suricata-Main] 2025-04-17 14:03:14 Config: host: host memory usage: 398144 bytes, maximum: 16777216
[2424197 - Suricata-Main] 2025-04-17 14:03:14 Info: suricata: No 'host-mode': suricata is in IDS mode, using default setting 'sniffer-only'
[2424197 - Suricata-Main] 2025-04-17 14:03:14 Config: exception-policy: defrag.memcap-policy: ignore (defined via 'built-in default' for IDS-mode)
[2424197 - Suricata-Main] 2025-04-17 14:03:14 Config: defrag-hash: allocated 229376 bytes of memory for the defrag hash... 4096 buckets of size 56
[2424197 - Suricata-Main] 2025-04-17 14:03:14 Config: defrag-hash: defrag memory usage: 229376 bytes, maximum: 16777216
[2424197 - Suricata-Main] 2025-04-17 14:03:14 Config: exception-policy: flow.memcap-policy: ignore (defined via 'built-in default' for IDS-mode)
[2424197 - Suricata-Main] 2025-04-17 14:03:14 Config: flow: flow size 296, memcap allows for 174120295 flows. Per hash row in perfect conditions 166
[2424197 - Suricata-Main] 2025-04-17 14:03:14 Config: stream-tcp: stream "prealloc-sessions": 2048 (per thread)
[2424197 - Suricata-Main] 2025-04-17 14:03:14 Config: stream-tcp: stream "memcap": 51539607552
[2424197 - Suricata-Main] 2025-04-17 14:03:14 Config: stream-tcp: stream "midstream" session pickups: enabled
[2424197 - Suricata-Main] 2025-04-17 14:03:14 Config: stream-tcp: stream "async-oneside": disabled
[2424197 - Suricata-Main] 2025-04-17 14:03:14 Config: stream-tcp: stream "checksum-validation": disabled
[2424197 - Suricata-Main] 2025-04-17 14:03:14 Config: exception-policy: stream.memcap-policy: ignore (defined via 'built-in default' for IDS-mode)
[2424197 - Suricata-Main] 2025-04-17 14:03:14 Config: exception-policy: stream.reassembly.memcap-policy: ignore (defined via 'built-in default' for IDS-mode)
[2424197 - Suricata-Main] 2025-04-17 14:03:14 Config: exception-policy: stream.midstream-policy: ignore (defined via 'built-in default' for IDS-mode)
[2424197 - Suricata-Main] 2025-04-17 14:03:14 Config: stream-tcp: stream."inline": disabled
[2424197 - Suricata-Main] 2025-04-17 14:03:14 Config: stream-tcp: stream "bypass": disabled
[2424197 - Suricata-Main] 2025-04-17 14:03:14 Config: stream-tcp: stream.reassembly.urgent.policy": drop
[2424197 - Suricata-Main] 2025-04-17 14:03:14 Config: stream-tcp: stream "max-syn-queued": 10
[2424197 - Suricata-Main] 2025-04-17 14:03:14 Config: stream-tcp: stream "max-synack-queued": 5
[2424197 - Suricata-Main] 2025-04-17 14:03:14 Config: stream-tcp: stream.reassembly "memcap": 17179869184
[2424197 - Suricata-Main] 2025-04-17 14:03:14 Config: stream-tcp: stream.reassembly "depth": 2097152
[2424197 - Suricata-Main] 2025-04-17 14:03:14 Config: stream-tcp: stream.reassembly "toserver-chunk-size": 5233
[2424197 - Suricata-Main] 2025-04-17 14:03:14 Config: stream-tcp: stream.reassembly "toclient-chunk-size": 4920
[2424197 - Suricata-Main] 2025-04-17 14:03:14 Config: stream-tcp: stream.reassembly.raw: enabled
[2424197 - Suricata-Main] 2025-04-17 14:03:14 Config: stream-tcp: stream.liberal-timestamps: disabled
[2424197 - Suricata-Main] 2025-04-17 14:03:14 Config: stream-tcp-reassemble: stream.reassembly "segment-prealloc": 2048
[2424197 - Suricata-Main] 2025-04-17 14:03:14 Config: stream-tcp-reassemble: stream.reassembly "max-regions": 8
[2424197 - Suricata-Main] 2025-04-17 14:03:14 Warning: counters: global stats config is missing. Stats enabled through legacy stats.log. See https://docs.suricata.io/en/suricata-7.0.9/configuration/suricata-yaml.html#stats
[2424197 - Suricata-Main] 2025-04-17 14:03:14 Info: logopenfile: eve-log output device (regular) initialized: eve.json
[2424197 - Suricata-Main] 2025-04-17 14:03:14 Config: runmodes: enabling 'eve-log' module 'alert'
[2424197 - Suricata-Main] 2025-04-17 14:03:14 Info: logopenfile: stats output device (regular) initialized: stats.log
[2424197 - Suricata-Main] 2025-04-17 14:03:14 Config: landlock: Landlock is not enabled in configuration
[2424197 - Suricata-Main] 2025-04-17 14:03:14 Config: suricata: Delayed detect disabled
[2424197 - Suricata-Main] 2025-04-17 14:03:14 Config: detect: pattern matchers: MPM: hs, SPM: hs
[2424197 - Suricata-Main] 2025-04-17 14:03:14 Config: detect: grouping: tcp-whitelist (default) 53, 80, 139, 443, 445, 1433, 3306, 3389, 6666, 6667, 8080
[2424197 - Suricata-Main] 2025-04-17 14:03:14 Config: detect: grouping: udp-whitelist (default) 53, 135, 5060
[2424197 - Suricata-Main] 2025-04-17 14:03:14 Config: detect: prefilter engines: MPM
[2424197 - Suricata-Main] 2025-04-17 14:03:14 Config: reputation: IP reputation disabled
[2424197 - Suricata-Main] 2025-04-17 14:03:14 Config: detect: Loading rule file: /var/lib/suricata/rules/suricata.rules
[2424197 - Suricata-Main] 2025-04-17 14:03:26 Info: detect: 1 rule files processed. 43029 rules successfully loaded, 0 rules failed, 0
[2424197 - Suricata-Main] 2025-04-17 14:03:26 Info: threshold-config: Threshold config parsed: 0 rule(s) found
[2424197 - Suricata-Main] 2025-04-17 14:03:26 Info: detect: 43032 signatures processed. 1257 are IP-only rules, 4333 are inspecting packet payload, 37225 inspect application layer, 109 are decoder event only
[2424197 - Suricata-Main] 2025-04-17 14:03:26 Config: detect: building signature grouping structure, stage 1: preprocessing rules... complete
[2424197 - Suricata-Main] 2025-04-17 14:03:26 Perf: detect: TCP toserver: 41 port groups, 40 unique SGH's, 1 copies
[2424197 - Suricata-Main] 2025-04-17 14:03:26 Perf: detect: TCP toclient: 21 port groups, 21 unique SGH's, 0 copies
[2424197 - Suricata-Main] 2025-04-17 14:03:26 Perf: detect: UDP toserver: 41 port groups, 40 unique SGH's, 1 copies
[2424197 - Suricata-Main] 2025-04-17 14:03:26 Perf: detect: UDP toclient: 21 port groups, 17 unique SGH's, 4 copies
[2424197 - Suricata-Main] 2025-04-17 14:03:26 Perf: detect: OTHER toserver: 254 proto groups, 4 unique SGH's, 250 copies
...
[2424197 - Suricata-Main] 2025-04-17 14:03:26 Perf: detect: Builtin MPM "toserver UDP packet": 40
[2424197 - Suricata-Main] 2025-04-17 14:03:26 Perf: detect: Builtin MPM "toclient UDP packet": 17
[2424197 - Suricata-Main] 2025-04-17 14:03:26 Perf: detect: Builtin MPM "other IP packet": 3
[2424197 - Suricata-Main] 2025-04-17 14:03:26 Perf: detect: AppLayer MPM "toserver http_uri (http)": 16
...
[2424197 - Suricata-Main] 2025-04-17 14:03:26 Perf: detect: AppLayer MPM "toserver file_data (smtp)": 2
[2424197 - Suricata-Main] 2025-04-17 14:03:26 Perf: detect: Pkt MPM "icmpv6.hdr": 1
[2424197 - Suricata-Main] 2025-04-17 14:03:26 Perf: detect: Pkt MPM "ipv6.hdr": 1
[2424197 - Suricata-Main] 2025-04-17 14:03:43 Config: af-packet: sink: enabling locked memory for mmap
[2424197 - Suricata-Main] 2025-04-17 14:03:43 Config: af-packet: sink: enabling tpacket v3
[2424197 - Suricata-Main] 2025-04-17 14:03:43 Config: af-packet: sink: using flow cluster mode for AF_PACKET
[2424197 - Suricata-Main] 2025-04-17 14:03:43 Config: af-packet: sink: using defrag kernel functionality for AF_PACKET
[2424197 - Suricata-Main] 2025-04-17 14:03:44 Info: runmodes: sink: creating 8 threads
[2424197 - Suricata-Main] 2025-04-17 14:03:44 Config: flow-manager: using 2 flow manager threads
[2424197 - Suricata-Main] 2025-04-17 14:03:44 Config: flow-manager: using 2 flow recycler threads
[2424336 - W#01-sink] 2025-04-17 14:03:44 Info: ioctl: sink: MTU 9000
...
[2424341 - W#06-sink] 2025-04-17 14:03:48 Info: checksum: No packets with invalid checksum, assuming checksum offloading is NOT used
[2424343 - W#07-sink] 2025-04-17 14:03:48 Info: checksum: No packets with invalid checksum, assuming checksum offloading is NOT used
[2424197 - Suricata-Main] 2025-04-17 14:03:48 Notice: threads: Threads created -> W: 8 FM: 2 FR: 2   Engine started.
...
[2424340 - W#05-sink] 2025-04-17 14:03:48 Info: checksum: No packets with invalid checksum, assuming checksum offloading is NOT used

And finally the Suricata config:

# Suricata configuration file
%YAML 1.1
---
vars:
  address-groups:
    # HOME_NET defines the internal/trusted network
    HOME_NET: "[129.132.93.64/26,10.204.0.0/16, 10.205.0.0/16, 10.206.0.0/16]"
    # EXTERNAL_NET is everything that's not in HOME_NET
    EXTERNAL_NET: "!$HOME_NET"
    HTTP_SERVERS: "$HOME_NET"
    # DNS servers - Our Dns servers
    DNS_SERVERS: "[129.132.98.12,129.132.250.2]"
    # SSH servers - assuming they're in the internal network
    SSH_SERVERS: "$HOME_NET"
    # Additional server types needed for rules
    SMTP_SERVERS: "$HOME_NET"
    SQL_SERVERS: "$HOME_NET"
    TELNET_SERVERS: "$HOME_NET"


  port-groups:
    HTTP_PORTS: "80"
    SHELLCODE_PORTS: "!0"
    SSH_PORTS: "22"
    ORACLE_PORTS: "1521"

default-log-dir: /var/log/suricata/


rule-files:
  - /var/lib/suricata/rules/suricata.rules

app-layer:
  protocols:
    dnp3:
      enabled: no
      detection-enabled: no
    modbus:
      enabled: no
      detection-enabled: no

stream:
  memcap: 48Gb
  checksum-validation: no
  reassembly:
    memcap: 16Gb
    depth: 2mb
    toserver-chunk-size: 5120
    toclient-chunk-size: 5120
    urgent:
      policy: drop              # drop, inline, oob (1 byte, see RFC 6093, 3.1), gap
  midstream: true
  async-oneside: false

flow:
  memcap: 48Gb
  hash-size: 1048576
  prealloc: 30000
  emergency-recovery: 30
  managers: 2 # default to one flow manager
  recyclers: 2 # default to one flow recycler thread
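
As a sanity check, this memcap matches the startup log's "memcap allows for 174120295 flows" line (Suricata parses Gb as GiB; 296 bytes is the per-flow size the log reports):

```shell
# 48 GiB flow memcap divided by the 296-byte flow size from the startup log.
awk 'BEGIN { printf "%d flows\n", int(48 * 1024 ^ 3 / 296) }'
# prints: 174120295 flows
```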

flow-timeouts:
  default:
    new: 30                     #Time-out in seconds after the last activity in this flow in a New state.
    established: 300            #Time-out in seconds after the last activity in this flow in a Established state.
    emergency-new: 10           #Time-out in seconds after the last activity in this flow in a New state during the emergency mode.
    emergency-established: 100  #Time-out in seconds after the last activity in this flow in a Established state in the emergency mode.
  tcp:
    new: 60
    established: 400
    closed: 120
    emergency-new: 10
    emergency-established: 300
    emergency-closed: 20

vlan:
  use-for-tracking: true

af-packet:
  - interface: "sink"
    threads: 8
    cluster-type: cluster_flow
    cluster-id: 99
    defrag: yes
      #tpacket-v2: yes
    tpacket-v3: yes
    checksum-checks: auto
    #buffer-size: 8388608
    ring-size: 200000
    use-mmap: yes
    mmap-locked: yes
    #use-emergency-flush: yes
    #  - interface: "sink2"
    #threads: 8
    #cluster-type: cluster_flow
    #cluster-id: 100
    #defrag: yes
    #tpacket-v3: yes
    #checksum-checks: auto
    ##buffer-size: 8388608
    #ring-size: 200000
    #use-mmap: yes
    #mmap-locked: yes
    ##use-emergency-flush: yes
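
A rough estimate of what these ring settings ask the kernel to lock in memory for one interface (the ~9216-byte frame size is an assumption based on the 9000-byte MTU; tpacket-v3's actual block layout differs somewhat):

```shell
# 8 threads x 200000-frame rings x ~9216 bytes per frame.
awk 'BEGIN { printf "%.1f GiB\n", 8 * 200000 * 9216 / 1024 ^ 3 }'
# prints: 13.7 GiB
```

With mmap-locked: yes all of that must fit in locked memory (worth checking `ulimit -l` for the Suricata user), and enabling the second interface would double it.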

cpu-affinity:
  - management-cpu-set:
      cpu: [ 1-3 ]
      mode: balanced
      prio:
        default: low
  - receive-cpu-set:
      cpu: [ 8-11 ]
      prio:
        default: high
  - worker-cpu-set:
      cpu: [ 8-11, 24-27 ]

logging:
  default-log-level: info
  outputs:
    - console:
        enabled: yes
    - file:
        enabled: yes
        filename: /var/log/suricata/suricata.log
outputs:
  - fast:
      enabled: no
      filename: fast.log
  - eve-log:
      enabled: yes
      filetype: regular
      filename: eve.json
      types:
        - alert:
            tagged-packets: no
        # - flow:
        #     interval: 60
  - stats:
      enabled: yes
      filename: stats.log
      interval: 60

Can I use my resources better? Is the CPU affinity OK?

Of course, the alerts are then all about streams being incomplete or broken:

1:/var/log/suricata# tail -n 50000 /var/log/suricata/eve.json | jq -r '.alert | "\(.signature_id) - \(.signature)"' | sort | uniq -c | sort -nr
  23637 2210045 - SURICATA STREAM Packet with invalid ack
  23632 2210029 - SURICATA STREAM ESTABLISHED invalid ack
   2713 2210056 - SURICATA STREAM bad window update
      7 2210010 - SURICATA STREAM 3way handshake wrong seq wrong ack
      6 2221010 - SURICATA HTTP unable to match response to request
      3 2210030 - SURICATA STREAM FIN invalid ack
      2 2210017 - SURICATA STREAM CLOSEWAIT invalid ACK

I've tried cluster_qm; I've tested really a lot of configs. It might be that the traffic is simply too much and I need to pre-filter?
I can also make the switch forward only a stripped-down packet (just the headers, 100kb?).
I've played with the values here, but at this point I'm not sure what works and what doesn't.

Can someone give me a hint about what could be wrong here, or is there just not enough HW capacity?

Update:

I've built Suricata from source with eBPF enabled.

I added a BPF filter for the 2 most important VLANs and filtered out NFS, Lustre, and the other big elephants in the room. I have no more packet drops!
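
For reference, a per-interface BPF filter can be set in the af-packet section; this is only a hypothetical sketch (the VLAN IDs and ports are placeholders, not the ones actually used):

```yaml
af-packet:
  - interface: "sink"
    # placeholders: keep two VLANs, drop NFS (2049) and Lustre (988) bulk traffic
    bpf-filter: "(vlan 100 or vlan 200) and not port 2049 and not port 988"
```

One caveat with pcap filter syntax: once a `vlan` keyword appears, the offsets for everything after it shift to the VLAN-tagged layout, so a filter like this silently stops matching untagged traffic. That is worth ruling out whenever a filter suddenly produces far fewer packets than expected.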

I've also modified the affinity to:

cpu-affinity:
  - management-cpu-set:
      cpu: ["1-3" , "16-19", "12-13" ]
  - receive-cpu-set:
      cpu: [ "4-7", "20-23", "14-15" , "28-31" ]
  - worker-cpu-set:
      cpu: [ "8-11", "24-27" ]
      mode: "exclusive"
      prio:
        low: [ 0 ]
        medium: [ "1" ]
        high: [ "8-11", "24-27" ]
        default: "high"

The thing is, now I don't get any alerts in eve.
The packet count is also way too small. To confirm that, I ran an nmap -T5 scan against basically the whole VLAN and I don't get any alert. :confused:

stats: