Lots of kernel drops using the XDP driver under RHEL 8.3

Hi all,

I am testing Suricata 6.0.2 with the XDP driver, but I am seeing a high number of kernel drops. The platform is a RHEL 8.3 KVM guest with the following build info:

This is Suricata version 6.0.2 RELEASE
Features: PCAP_SET_BUFF AF_PACKET HAVE_PACKET_FANOUT LIBCAP_NG LIBNET1.1 HAVE_HTP_URI_NORMALIZE_HOOK PCRE_JIT HAVE_NSS HAVE_LUA HAVE_LIBJANSSON PROFILING PROFILE_LOCKING TLS TLS_C11 MAGIC RUST
SIMD support: none
Atomic intrinsics: 1 2 4 8 byte(s)
64-bits, Little-endian architecture
GCC version Clang 10.0.1 (Red Hat 10.0.1-1.module+el8.3.0+7459+90c24896), C version 201112
compiled with -fstack-protector
compiled with _FORTIFY_SOURCE=2
L1 cache line size (CLS)=64
thread local storage method: _Thread_local
compiled with LibHTP v0.5.37, linked against LibHTP v0.5.37

Suricata Configuration:
AF_PACKET support: yes
eBPF support: yes
XDP support: yes
PF_RING support: no
NFQueue support: no
NFLOG support: no
IPFW support: no
Netmap support: no
DAG enabled: no
Napatech enabled: no
WinDivert enabled: no

Unix socket enabled: yes
Detection enabled: yes

Libmagic support: yes
libnss support: yes
libnspr support: yes
libjansson support: yes
hiredis support: yes
hiredis async with libevent: yes
Prelude support: no
PCRE jit: yes
LUA support: yes
libluajit: no
GeoIP2 support: yes
Non-bundled htp: no
Hyperscan support: yes
Libnet support: yes
liblz4 support: yes

Rust support: yes
Rust strict mode: no
Rust compiler path: /usr/bin/rustc
Rust compiler version: rustc 1.47.0
Cargo path: /usr/bin/cargo
Cargo version: cargo 1.47.0
Cargo vendor: yes

Python support: yes
Python path: /usr/bin/python3
Python distutils yes
Python yaml yes
Install suricatactl: yes
Install suricatasc: yes
Install suricata-update: not bundled

Profiling enabled: yes
Profiling locks enabled: yes

Plugin support (experimental): yes

My capture config is:

af-packet:
  - interface: eth2
    #threads: auto
    cluster-id: 99
    cluster-type: cluster_qm
    xdp-mode: driver
    bypass: yes
    defrag: no
    use-mmap: yes
    mmap-locked: yes
    tpacket-v3: yes
    ring-size: 200000
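
As far as I understand, cluster_qm maps each worker thread to a NIC RSS queue, so on a single-queue NIC only one worker receives traffic. For comparison, a flow-hashed variant of the same section (just a sketch, not what I am currently running) would be:

  af-packet:
    - interface: eth2
      # cluster_flow hashes per flow in the kernel rather than per RSS queue
      cluster-id: 99
      cluster-type: cluster_flow
      defrag: no
      use-mmap: yes
      tpacket-v3: yes
      ring-size: 200000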

But the kernel drop numbers are really high (519816 of 2189536 packets, roughly 24%) after only 4 minutes:

Date: 4/4/2021 – 08:03:42 (uptime: 0d, 00h 04m 40s)

Counter | TM Name | Value

capture.kernel_packets | Total | 2189536
capture.kernel_drops | Total | 519816
decoder.pkts | Total | 1669717
decoder.bytes | Total | 2305053483
decoder.ipv4 | Total | 1669717
decoder.ethernet | Total | 1669717
decoder.tcp | Total | 1669681
decoder.udp | Total | 35
decoder.icmpv4 | Total | 1
decoder.avg_pkt_size | Total | 1380
decoder.max_pkt_size | Total | 1494
flow.tcp | Total | 37
flow.udp | Total | 18
flow.wrk.spare_sync_avg | Total | 100
flow.wrk.spare_sync | Total | 1
flow_bypassed.local_pkts | Total | 1665711
flow_bypassed.local_bytes | Total | 2302099509
flow.wrk.flows_evicted_needs_work | Total | 15
flow.wrk.flows_evicted_pkt_inject | Total | 21
flow.wrk.flows_evicted | Total | 2
flow.wrk.flows_injected | Total | 16
tcp.sessions | Total | 19
tcp.syn | Total | 19
tcp.synack | Total | 19
tcp.rst | Total | 8
tcp.stream_depth_reached | Total | 2
detect.mpm_list | Total | 1
detect.nonmpm_list | Total | 2328
detect.fnonmpm_list | Total | 1974
detect.match_list | Total | 1975
app_layer.flow.tls | Total | 19
app_layer.flow.ntp | Total | 6
app_layer.tx.ntp | Total | 6
app_layer.flow.dns_udp | Total | 11
app_layer.tx.dns_udp | Total | 22
app_layer.flow.failed_udp | Total | 1
flow.mgr.full_hash_pass | Total | 2
flow.spare | Total | 9904
flow.mgr.rows_maxlen | Total | 1
flow.mgr.flows_checked | Total | 38
flow.mgr.flows_notimeout | Total | 29
flow.mgr.flows_timeout | Total | 9
flow.mgr.flows_evicted | Total | 9
flow.mgr.flows_evicted_needs_work | Total | 4
tcp.memuse | Total | 573440
tcp.reassembly_memuse | Total | 98304
flow.memuse | Total | 7474304

Is the configuration correct? Separately, the xdp_filter.bpf file was not created during the compilation process. libbpf was installed from the official Red Hat repos:

Installed Packages
Name : libbpf-devel
Version : 0.0.8
Release : 4.el8
Architecture : x86_64
Size : 191 k
Source : libbpf-0.0.8-4.el8.src.rpm
Repository : @System
From repo : codeready-builder-for-rhel-8-x86_64-rpms
Summary : Development files for libbpf
License : LGPLv2 or BSD
Description : The libbpf-devel package contains libraries header files for
: developing applications that use libbpf

The kernel version is 4.18.0-240.15.1.el8_3.x86_64 (the latest available).
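
If I read the Suricata eBPF/XDP guide correctly, the .bpf objects are only compiled when --enable-ebpf-build is passed at configure time (which needs clang available), so a build that should produce them would look something like this (the --prefix is just mine):

  ./configure --prefix=/opt/suricata --enable-ebpf --enable-ebpf-build
  make && make install
  ls ebpf/*.bpf   # xdp_filter.bpf should appear here after make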

The Suricata startup command is:

/opt/suricata/bin/suricata -c /etc/suricata/suricata.yaml -F /etc/suricata/bpf.conf -vv --af-packet -k none --pidfile /var/run/suricata/suricata.pid

Any ideas?

Sorry, more info: I am using the virtio driver, and NIC hardware offloading is off:

Features for eth2:
rx-checksumming: off [fixed]
tx-checksumming: off
tx-checksum-ipv4: off [fixed]
tx-checksum-ip-generic: off [fixed]
tx-checksum-ipv6: off [fixed]
tx-checksum-fcoe-crc: off [fixed]
tx-checksum-sctp: off [fixed]
scatter-gather: off
tx-scatter-gather: off [fixed]
tx-scatter-gather-fraglist: off [fixed]
tcp-segmentation-offload: off
tx-tcp-segmentation: off [fixed]
tx-tcp-ecn-segmentation: off [fixed]
tx-tcp-mangleid-segmentation: off [fixed]
tx-tcp6-segmentation: off [fixed]
generic-segmentation-offload: off
generic-receive-offload: off
large-receive-offload: off [fixed]
rx-vlan-offload: off [fixed]
tx-vlan-offload: off [fixed]
ntuple-filters: off [fixed]
receive-hashing: off [fixed]
highdma: on [fixed]
rx-vlan-filter: on [fixed]
vlan-challenged: off [fixed]
tx-lockless: off [fixed]
netns-local: off [fixed]
tx-gso-robust: off [fixed]
tx-fcoe-segmentation: off [fixed]
tx-gre-segmentation: off [fixed]
tx-gre-csum-segmentation: off [fixed]
tx-ipxip4-segmentation: off [fixed]
tx-ipxip6-segmentation: off [fixed]
tx-udp_tnl-segmentation: off [fixed]
tx-udp_tnl-csum-segmentation: off [fixed]
tx-gso-partial: off [fixed]
tx-sctp-segmentation: off [fixed]
tx-esp-segmentation: off [fixed]
tx-udp-segmentation: off [fixed]
tls-hw-rx-offload: off [fixed]
fcoe-mtu: off [fixed]
tx-nocache-copy: off
loopback: off [fixed]
rx-fcs: off [fixed]
rx-all: off [fixed]
tx-vlan-stag-hw-insert: off [fixed]
rx-vlan-stag-hw-parse: off [fixed]
rx-vlan-stag-filter: off [fixed]
l2-fwd-offload: off [fixed]
hw-tc-offload: off [fixed]
esp-hw-offload: off [fixed]
esp-tx-csum-hw-offload: off [fixed]
rx-udp_tunnel-port-offload: off [fixed]
tls-hw-tx-offload: off [fixed]
rx-gro-hw: off [fixed]
tls-hw-record: off [fixed]

Good morning,

Any news regarding this issue?

Two things come to mind to try, to see if they have any effect:

1 - remove the BPF filter
2 - enable hashing in the NIC (ntuple/rx-hash), as sketched below
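
For (2), something along these lines, assuming the driver allows it (the [fixed] flags in your ethtool output suggest virtio may not):

  ethtool -K eth2 ntuple on    # enable n-tuple filters, if supported
  ethtool -K eth2 rxhash on    # enable receive hashing, if supported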

What is the output of ethtool -x interface_name?

Hi,

After disabling my BPF filter:

9/4/2021 – 15:59:33 - - Registered 25904 rule profiling counters.
9/4/2021 – 15:59:34 - - Enabling locked memory for mmap on iface eth2
9/4/2021 – 15:59:34 - - Enabling tpacket v3 capture on iface eth2
9/4/2021 – 15:59:34 - - Using queue based cluster mode for AF_PACKET (iface eth2)
9/4/2021 – 15:59:34 - - Successfully loaded eBPF file ‘/etc/suricata/ebpf/xdp_filter.bpf’ on ‘eth2’
9/4/2021 – 15:59:34 - - eth2: enabling zero copy mode by using data release call
9/4/2021 – 15:59:34 - - Going to use 1 thread(s)
9/4/2021 – 15:59:34 - - using magic-file /usr/share/misc/magic
9/4/2021 – 15:59:34 - - using 1 flow manager threads
9/4/2021 – 15:59:34 - - using 1 flow recycler threads
9/4/2021 – 15:59:34 - - Running in live mode, activating unix socket
9/4/2021 – 15:59:34 - - Using unix socket file ‘/var/run/suricata/suricata-command.socket’
9/4/2021 – 15:59:34 - - all 1 packet processing threads, 4 management threads initialized, engine started.
9/4/2021 – 15:59:34 - - AF_PACKET V3 RX Ring params: block_size=32768 block_nr=10527 frame_size=1680 frame_nr=200013 (mem: 344948736)
9/4/2021 – 15:59:34 - - All AFP capture threads are running.
9/4/2021 – 16:02:40 - - Signal Received. Stopping engine.
9/4/2021 – 16:02:40 - - 0 new flows, 0 established flows were timed out, 0 flows in closed state
9/4/2021 – 16:02:41 - - time elapsed 187.783s
9/4/2021 – 16:02:42 - - 44 flows processed
9/4/2021 – 16:02:42 - - (W#01-eth2) Kernel: Packets 2209859, dropped 1216763
9/4/2021 – 16:02:42 - - Alerts: 1
9/4/2021 – 16:02:42 - - ippair memory usage: 414144 bytes, maximum: 16777216
9/4/2021 – 16:02:42 - - Done dumping profiling data.
9/4/2021 – 16:02:42 - - host memory usage: 398144 bytes, maximum: 33554432
9/4/2021 – 16:02:42 - - Dumping profiling data for 25904 rules.
9/4/2021 – 16:02:42 - - Done dumping profiling data.
9/4/2021 – 16:02:42 - - Done dumping keyword profiling data.
9/4/2021 – 16:02:42 - - Done dumping rulegroup profiling data.
9/4/2021 – 16:02:42 - - Done dumping prefilter profiling data.
9/4/2021 – 16:02:42 - - cleaning up signature grouping structure… complete
9/4/2021 – 16:02:42 - - Stats for ‘eth2’: pkts: 2209859, drop: 1216763 (55.06%), invalid chksum: 0
9/4/2021 – 16:02:42 - - Cleaning up Hyperscan global scratch
9/4/2021 – 16:02:42 - - Clearing Hyperscan database cache
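
As a sanity check, the ring parameters above line up with my ring-size setting: a 32768-byte block holds 19 frames of 1680 bytes, and 10527 blocks × 19 frames = 200013 frames (my requested 200000, rounded up), using 10527 × 32768 = 344948736 bytes of ring memory, as logged.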

stats.log:

Date: 4/9/2021 – 16:02:42 (uptime: 0d, 00h 03m 38s)

Counter | TM Name | Value

capture.kernel_packets | Total | 2209857
capture.kernel_drops | Total | 1216763
decoder.pkts | Total | 993091
decoder.bytes | Total | 1365062760
decoder.ipv4 | Total | 993072
decoder.ipv6 | Total | 1
decoder.ethernet | Total | 993091
decoder.tcp | Total | 993050
decoder.udp | Total | 23
decoder.avg_pkt_size | Total | 1374
decoder.max_pkt_size | Total | 1494
flow.tcp | Total | 73
flow.udp | Total | 11
flow.wrk.spare_sync_avg | Total | 100
flow.wrk.spare_sync | Total | 1
flow_bypassed.local_pkts | Total | 989398
flow_bypassed.local_bytes | Total | 1363121375
flow.wrk.flows_evicted_needs_work | Total | 23
flow.wrk.flows_evicted_pkt_inject | Total | 37
flow.wrk.flows_evicted | Total | 16
flow.wrk.flows_injected | Total | 24
tcp.sessions | Total | 28
tcp.syn | Total | 32
tcp.synack | Total | 28
tcp.rst | Total | 4
tcp.stream_depth_reached | Total | 1
tcp.reassembly_gap | Total | 10
tcp.overlap | Total | 42
detect.alert | Total | 1
detect.mpm_list | Total | 1
detect.nonmpm_list | Total | 2125
detect.fnonmpm_list | Total | 1812
detect.match_list | Total | 1813
app_layer.flow.tls | Total | 26
app_layer.flow.dns_udp | Total | 8
app_layer.tx.dns_udp | Total | 18
app_layer.flow.failed_udp | Total | 3
flow.mgr.full_hash_pass | Total | 1
flow.spare | Total | 9911
flow.mgr.rows_maxlen | Total | 1
flow.mgr.flows_checked | Total | 38
flow.mgr.flows_notimeout | Total | 24
flow.mgr.flows_timeout | Total | 14
flow.mgr.flows_evicted | Total | 14
flow.mgr.flows_evicted_needs_work | Total | 3
tcp.memuse | Total | 573440
tcp.reassembly_memuse | Total | 98304
flow.memuse | Total | 7474304

ethtool output:

root@rhelsuri01:/nsm/suricata# ethtool -x eth2
Cannot get RX ring count: Operation not supported

It is a virtio NIC …

Can you try turning VLAN tracking off? In suricata.yaml that would be something like the sketch below.
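
A sketch of the stock vlan option (it defaults to true):

  vlan:
    use-for-tracking: false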

Sorry for the late answer. The problem has been solved; it was a misconfigured option on our virtualization host.

Many thanks for the help

Glad to hear it is solved!
Do you mind sharing (if possible, of course) what the misconfiguration was? It might help someone else reading this thread.
Thank you

Yes, of course. I forgot to configure the following options for the guest virtio NIC on the KVM host:

  <driver name='vhost' txmode='iothread' ioeventfd='on' event_idx='off' queues='8' rx_queue_size='1024' tx_queue_size='1024'>
    <host csum='off' gso='off' tso4='off' tso6='off' ecn='off' ufo='off' mrg_rxbuf='off'/>
    <guest csum='off' tso4='off' tso6='off' ecn='off' ufo='off'/>
  </driver>
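
One extra note in case it helps someone: as far as I know, with queues='8' on the host side, the virtio queues still have to be enabled inside the guest, something like:

  ethtool -L eth2 combined 8   # enable all 8 virtio queues in the guest
  ethtool -l eth2              # verify the active channel counts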