High number of kernel_drops again

version: "5.0.2-dev (b9515671b 2019-12-13)"
Running as a system service.
My Suricata drops a lot of packets when I increase stream.reassembly.memcap.


stream:
  memcap: 8gb
  checksum-validation: yes      # reject incorrect csums
  inline: auto                  # auto will use inline mode in IPS mode, yes or no set it statically
  reassembly:
    memcap: 10gb
    depth: 1mb                  # reassemble 1mb into a stream
    toserver-chunk-size: 2560
    toclient-chunk-size: 2560
    randomize-chunk-size: yes
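To confirm which memcap values the running process actually registered (rather than what the file says), they can be queried over the unix socket; a minimal sketch, assuming the socket is enabled at its default path and suricatasc is installed:

# List all registered memcaps from the running Suricata (5.0 unix-socket command).
suricatasc -c "memcap-list"

# Show a single memcap, e.g. the stream reassembly one.
suricatasc -c "memcap-show stream-reassembly"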

Date: 12/29/2020 -- 12:01:38 (uptime: 0d, 00h 55m 59s)

Counter | TM Name | Value

capture.kernel_packets | Total | 112002445
capture.kernel_drops | Total | 37214768
decoder.pkts | Total | 74498141
decoder.bytes | Total | 48325625554
decoder.invalid | Total | 26
decoder.ipv4 | Total | 74177718
decoder.ipv6 | Total | 17697
decoder.ethernet | Total | 74498141
decoder.tcp | Total | 70136275
decoder.udp | Total | 3226298
decoder.icmpv4 | Total | 499878
decoder.icmpv6 | Total | 778
decoder.vlan | Total | 51202537
decoder.avg_pkt_size | Total | 648
decoder.max_pkt_size | Total | 65040
flow.tcp | Total | 1343887
flow.udp | Total | 683712
flow.icmpv4 | Total | 21412
flow.icmpv6 | Total | 324
decoder.event.ipv4.iplen_smaller_than_hlen | Total | 25
decoder.event.ipv4.opt_pad_required | Total | 542
decoder.event.icmpv4.unknown_type | Total | 7
decoder.event.icmpv4.unknown_code | Total | 44
decoder.event.ipv6.zero_len_padn | Total | 299
decoder.event.tcp.invalid_optlen | Total | 1
tcp.sessions | Total | 1083081
tcp.pseudo | Total | 8
tcp.invalid_checksum | Total | 8
tcp.syn | Total | 1590415
tcp.synack | Total | 1276354
tcp.rst | Total | 862490
tcp.pkt_on_wrong_thread | Total | 1399404
tcp.stream_depth_reached | Total | 3120
tcp.reassembly_gap | Total | 70854
tcp.overlap | Total | 12163622
detect.alert | Total | 3461
app_layer.flow.http | Total | 354063
app_layer.tx.http | Total | 541925
app_layer.flow.ftp | Total | 460
app_layer.tx.ftp | Total | 3940
app_layer.flow.smtp | Total | 6
app_layer.tx.smtp | Total | 8
app_layer.flow.tls | Total | 204234
app_layer.flow.ssh | Total | 1112
app_layer.flow.smb | Total | 1
app_layer.tx.smb | Total | 3
app_layer.flow.dcerpc_tcp | Total | 6
app_layer.flow.dns_tcp | Total | 7
app_layer.tx.dns_tcp | Total | 14
app_layer.flow.ntp | Total | 12065
app_layer.tx.ntp | Total | 13751
app_layer.flow.ftp-data | Total | 220
app_layer.flow.dhcp | Total | 183
app_layer.tx.dhcp | Total | 1757
app_layer.flow.snmp | Total | 14087
app_layer.tx.snmp | Total | 111594
app_layer.flow.failed_tcp | Total | 51482
app_layer.flow.dcerpc_udp | Total | 1453
app_layer.flow.dns_udp | Total | 374326
app_layer.tx.dns_udp | Total | 1299524
app_layer.flow.failed_udp | Total | 281598
flow_mgr.closed_pruned | Total | 757349
flow_mgr.new_pruned | Total | 1061741
flow_mgr.est_pruned | Total | 184094
flow.spare | Total | 10539
flow.tcp_reuse | Total | 920
flow_mgr.flows_checked | Total | 5980
flow_mgr.flows_notimeout | Total | 4748
flow_mgr.flows_timeout | Total | 1232
flow_mgr.flows_timeout_inuse | Total | 216
flow_mgr.flows_removed | Total | 1016
flow_mgr.rows_checked | Total | 65536
flow_mgr.rows_skipped | Total | 61659
flow_mgr.rows_empty | Total | 366
flow_mgr.rows_maxlen | Total | 6
tcp.memuse | Total | 78000560
tcp.reassembly_memuse | Total | 223701512
http.memuse | Total | 128032703
ftp.memuse | Total | 429369
app_layer.expectations | Total | 11
flow.memuse | Total | 24748096

If I scale stream.reassembly.memcap down to 256mb, capture.kernel_drops goes down, even to zero, but tcp.reassembly_gap then increases linearly.
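For comparing the two settings it may not even be necessary to restart: if the unix socket is enabled, the reassembly memcap can, I believe, be adjusted at runtime with the memcap-set socket command; a sketch, assuming suricatasc and the default socket path:

# Change the reassembly memcap on the fly and re-read it (Suricata 5.0 socket commands).
suricatasc -c "memcap-set stream-reassembly 256mb"
suricatasc -c "memcap-show stream-reassembly"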

top - 12:04:11 up 20 days, 1:48, 3 users, load average: 5.49, 5.30, 5.11
Tasks: 525 total, 1 running, 524 sleeping, 0 stopped, 0 zombie
%Cpu(s): 8.1 us, 0.6 sy, 1.9 ni, 88.8 id, 0.4 wa, 0.0 hi, 0.2 si, 0.0 st
KiB Mem : 13166155+total, 17006872 free, 96304864 used, 18349816 buff/cache
KiB Swap: 0 total, 0 free, 0 used. 33722224 avail Mem

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
59972 elastic+ 20 0 0.810t 0.028t 1.580g S 259.6 22.5 27468:12 java
240739 logstash 20 0 66.586g 0.057t 0.055t S 143.4 46.9 84:01.15 Suricata-Main
105217 logstash 39 19 12.291g 4.592g 29680 S 101.0 3.7 5866:38 java
262217 telegraf 20 0 2574540 29828 10004 S 10.3 0.0 235:20.14 telegraf
246344 root 20 0 47372 4328 3152 R 1.3 0.0 0:00.12 top
2813 mongodb 20 0 1121464 46796 0 S 0.7 0.0 193:12.16 mongod
1568 root 20 0 0 0 0 S 0.3 0.0 54:08.02 jbd2/dm-1-8
1 root 20 0 204808 3044 1244 S 0.0 0.0 0:18.00 systemd

NIC settings:

  • interface: eno3
    threads: auto
    cluster-id: 99
    cluster-type: cluster_flow
    defrag: yes
    use-mmap: yes
    mmap-locked: yes
    tpacket-v3: yes
    ring-size: 200000
    block-size: 1048576
  • interface: eno4
    threads: 48
    cluster-id: 100
    cluster-type: cluster_flow
    defrag: yes
    use-mmap: yes
    mmap-locked: yes
    tpacket-v3: yes
    ring-size: 200000
    block-size: 1048576

ifconfig eno4

eno4: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
inet6 fe80::3a68:ddff:fe1c:422b prefixlen 64 scopeid 0x20
ether 38:68:dd:1c:42:2b txqueuelen 4000 (Ethernet)
RX packets 19760215198 bytes 11373818908711 (10.3 TiB)
RX errors 0 dropped 17724754 overruns 0 frame 216
TX packets 286 bytes 20256 (19.7 KiB)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0

ethtool -l eno4

Channel parameters for eno4:
Pre-set maximums:
RX: 0
TX: 0
Other: 0
Combined: 128
Current hardware settings:
RX: 0
TX: 0
Other: 0
Combined: 1

memcap-list
Success:
[
  {
    "name": "stream",
    "value": "8gb"
  },
  {
    "name": "stream-reassembly",
    "value": "10gb"
  },
  {
    "name": "flow",
    "value": "1gb"
  },
  {
    "name": "applayer-proto-http",
    "value": "3gb"
  },
  {
    "name": "defrag",
    "value": "256mb"
  },
  {
    "name": "ippair",
    "value": "16mb"
  },
  {
    "name": "host",
    "value": "2gb"
  }
]

By the way, tcp.memuse always stays below 80 MB; how can I increase the memcap?
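One way to watch how the memuse counters evolve over time is to pull them from the periodic stats events; a rough sketch, assuming eve.json stats are enabled and jq is installed (not part of Suricata):

# Follow tcp.memuse and tcp.reassembly_memuse from the periodic stats records in eve.json.
tail -f /var/log/suricata/eve.json \
  | jq -c 'select(.event_type == "stats")
           | {ts: .timestamp, tcp_memuse: .stats.tcp.memuse, reassembly_memuse: .stats.tcp.reassembly_memuse}'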

What is the output of
ethtool --show-rxfh eno4 ?

root@SELKS:~# ethtool --show-rxfh eno4
RX flow hash indirection table for eno4 with 1 RX ring(s):
0: 0 0 0 0 0 0 0 0
8: 0 0 0 0 0 0 0 0
16: 0 0 0 0 0 0 0 0
24: 0 0 0 0 0 0 0 0
32: 0 0 0 0 0 0 0 0
40: 0 0 0 0 0 0 0 0
48: 0 0 0 0 0 0 0 0
56: 0 0 0 0 0 0 0 0
64: 0 0 0 0 0 0 0 0
72: 0 0 0 0 0 0 0 0
80: 0 0 0 0 0 0 0 0
88: 0 0 0 0 0 0 0 0
96: 0 0 0 0 0 0 0 0
104: 0 0 0 0 0 0 0 0
112: 0 0 0 0 0 0 0 0
120: 0 0 0 0 0 0 0 0
128: 0 0 0 0 0 0 0 0
136: 0 0 0 0 0 0 0 0
144: 0 0 0 0 0 0 0 0
152: 0 0 0 0 0 0 0 0
160: 0 0 0 0 0 0 0 0
168: 0 0 0 0 0 0 0 0
176: 0 0 0 0 0 0 0 0
184: 0 0 0 0 0 0 0 0
192: 0 0 0 0 0 0 0 0
200: 0 0 0 0 0 0 0 0
208: 0 0 0 0 0 0 0 0
216: 0 0 0 0 0 0 0 0
224: 0 0 0 0 0 0 0 0
232: 0 0 0 0 0 0 0 0
240: 0 0 0 0 0 0 0 0
248: 0 0 0 0 0 0 0 0
256: 0 0 0 0 0 0 0 0
264: 0 0 0 0 0 0 0 0
272: 0 0 0 0 0 0 0 0
280: 0 0 0 0 0 0 0 0
288: 0 0 0 0 0 0 0 0
296: 0 0 0 0 0 0 0 0
304: 0 0 0 0 0 0 0 0
312: 0 0 0 0 0 0 0 0
320: 0 0 0 0 0 0 0 0
328: 0 0 0 0 0 0 0 0
336: 0 0 0 0 0 0 0 0
344: 0 0 0 0 0 0 0 0
352: 0 0 0 0 0 0 0 0
360: 0 0 0 0 0 0 0 0
368: 0 0 0 0 0 0 0 0
376: 0 0 0 0 0 0 0 0
384: 0 0 0 0 0 0 0 0
392: 0 0 0 0 0 0 0 0
400: 0 0 0 0 0 0 0 0
408: 0 0 0 0 0 0 0 0
416: 0 0 0 0 0 0 0 0
424: 0 0 0 0 0 0 0 0
432: 0 0 0 0 0 0 0 0
440: 0 0 0 0 0 0 0 0
448: 0 0 0 0 0 0 0 0
456: 0 0 0 0 0 0 0 0
464: 0 0 0 0 0 0 0 0
472: 0 0 0 0 0 0 0 0
480: 0 0 0 0 0 0 0 0
488: 0 0 0 0 0 0 0 0
496: 0 0 0 0 0 0 0 0
504: 0 0 0 0 0 0 0 0
RSS hash key:
ab:71:b3:e3:a6:40:c0:3f:97:a0:ba:25:7c:57:ac:67:34:78:eb:4f:e3:43:12:e7:31:e2:d3:74:db:7f:38:9d:50:6e:86:be:54:01:e2:d8:e7:00:a8:d2:66:19:b5:19:1a:53:03:69
RSS hash function:
toeplitz: on
xor: off

This is strange: the output above shows eno4 with only 1 RX ring, and I think there should be more. What NIC/kernel is that?

root@SELKS:~# ethtool -i eno4
driver: i40e
version: 1.6.16-k
firmware-version: 4.10 0x80001b6f 1.2203.0
expansion-rom-version:
bus-info: 0000:09:00.3
supports-statistics: yes
supports-test: yes
supports-eeprom-access: yes
supports-register-dump: yes
supports-priv-flags: yes

root@SELKS:~# uname -a
Linux SELKS 4.9.0-12-amd64 #1 SMP Debian 4.9.210-1 (2020-01-20) x86_64 GNU/Linux

I think it is set by StamusNetworks/SELKS, which ships the NIC initialization script "selks-disable-interface-offloading_stamus":


interface=$1
ARGS=1 # The script requires 1 argument.

echo -e "\n The supplied network interface is : ${interface} \n";

if [ $# -ne "$ARGS" ];
then
  echo -e "\n USAGE: $(basename $0) -> the script requires 1 argument - a network interface!"
  echo -e "\n Please supply a network interface. Ex - ./disable-interface-offloading eth0 \n"
  exit 1;
fi

# Enlarge the RX ring and switch off every offload that interferes with capture.
/sbin/ethtool -G ${interface} rx 4096 >/dev/null 2>&1
for i in rx tx sg tso ufo gso gro lro rxvlan txvlan ntuple rxhash; do
  /sbin/ethtool -K ${interface} $i off >/dev/null 2>&1
done

# Disable pause frames and interrupt coalescing, and force a single combined queue.
/sbin/ethtool -A ${interface} rx off tx off >/dev/null 2>&1
#/sbin/ip link set ${interface} promisc on up >/dev/null 2>&1
/sbin/ethtool -C ${interface} rx-usecs 1 rx-frames 0 >/dev/null 2>&1
/sbin/ethtool -L ${interface} combined 1 >/dev/null 2>&1
/sbin/ethtool -C ${interface} adaptive-rx off >/dev/null 2>&1

echo -e "###################################"
echo -e "# CURRENT STATUS - NIC OFFLOADING #"
echo -e "###################################"
/sbin/ethtool -k ${interface}

echo -e "######################################"
echo -e "# CURRENT STATUS - NIC RINGS BUFFERS #"
echo -e "######################################"
/sbin/ethtool -g ${interface}

Considering your doubt, I changed the settings as follows:

ifconfig eno4 down
/usr/local/sbin/ethtool -L eno4 combined 24
/usr/local/sbin/ethtool -K eno4 rxhash on
/usr/local/sbin/ethtool -K eno4 ntuple on
ifconfig eno4 up
/usr/local/sbin/ethtool -X eno4 hkey 6D:5A:6D:5A:6D:5A:6D:5A:6D:5A:6D:5A:6D:5A:6D:5A:6D:5A:6D:5A:6D:5A:6D:5A:6D:5A:6D:5A:6D:5A:6D:5A:6D:5A:6D:5A:6D:5A:6D:5A:6D:5A:6D:5A:6D:5A:6D:5A:6D:5A:6D:5A equal 24
/usr/local/sbin/ethtool -A eno4 rx off
/usr/local/sbin/ethtool -C eno4 adaptive-rx off adaptive-tx off rx-usecs 125
/usr/local/sbin/ethtool -G eno4 rx 2048
/usr/local/sbin/ethtool -X eno4 hfunc toeplitz
for proto in tcp4 udp4 tcp6 udp6; do
/usr/local/sbin/ethtool -N eno4 rx-flow-hash $proto sdfn
done
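After applying these, it may be worth double-checking that the 24 combined channels and the symmetric key actually took effect, and then watching whether tcp.pkt_on_wrong_thread keeps growing; a quick sketch (the stats.log path is the SELKS default, adjust if needed):

# Confirm the queue count and the indirection table now spread across rings 0-23.
/usr/local/sbin/ethtool -l eno4
/usr/local/sbin/ethtool -x eno4 | head

# With cluster_qm plus symmetric RSS this counter should stay close to zero over time.
grep 'tcp.pkt_on_wrong_thread' /var/log/suricata/stats.log | tail -n 5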

af-packet:

  • interface: eno4
    threads: 48
    cluster-id: 100
    cluster-type: cluster_qm
    defrag: yes
    use-mmap: yes
    mmap-locked: yes
    tpacket-v3: yes
    ring-size: 200000
    block-size: 1048576

But I think the change is insignificant; neither the CPU nor the memory appears to be continuously overloaded.

Date: 1/4/2021 -- 12:34:02 (uptime: 0d, 00h 30m 08s)

Counter | TM Name | Value

capture.kernel_packets | Total | 51975448
capture.kernel_drops | Total | 13695123
decoder.pkts | Total | 38070251
decoder.bytes | Total | 23120514863
decoder.invalid | Total | 20
decoder.ipv4 | Total | 37921374
decoder.ipv6 | Total | 9893
decoder.ethernet | Total | 38070251
decoder.tcp | Total | 35281656
decoder.udp | Total | 2130568
decoder.icmpv4 | Total | 334929
decoder.icmpv6 | Total | 439
decoder.vlan | Total | 27872749
decoder.avg_pkt_size | Total | 607
decoder.max_pkt_size | Total | 65040
flow.tcp | Total | 686550
flow.udp | Total | 361292
flow.icmpv4 | Total | 11885
flow.icmpv6 | Total | 199
defrag.ipv4.fragments | Total | 1157
defrag.ipv4.reassembled | Total | 552
decoder.event.ipv4.iplen_smaller_than_hlen | Total | 19
decoder.event.ipv4.opt_pad_required | Total | 325
decoder.event.ipv6.zero_len_padn | Total | 136
decoder.event.tcp.hlen_too_small | Total | 1
decoder.event.ipv4.frag_overlap | Total | 6
tcp.sessions | Total | 564222
tcp.pseudo | Total | 4
tcp.invalid_checksum | Total | 2
tcp.syn | Total | 1051044
tcp.synack | Total | 717926
tcp.rst | Total | 495607
tcp.pkt_on_wrong_thread | Total | 464995
tcp.stream_depth_reached | Total | 1426
tcp.reassembly_gap | Total | 23079
tcp.overlap | Total | 5908727
detect.alert | Total | 6779
app_layer.flow.http | Total | 181938
app_layer.tx.http | Total | 253669
app_layer.flow.ftp | Total | 104
app_layer.tx.ftp | Total | 932
app_layer.flow.smtp | Total | 1
app_layer.tx.smtp | Total | 2
app_layer.flow.tls | Total | 118541
app_layer.flow.ssh | Total | 504
app_layer.flow.dcerpc_tcp | Total | 72
app_layer.flow.dns_tcp | Total | 18
app_layer.tx.dns_tcp | Total | 42
app_layer.flow.enip_tcp | Total | 1
app_layer.tx.enip_tcp | Total | 114
app_layer.flow.ntp | Total | 6758
app_layer.tx.ntp | Total | 7664
app_layer.flow.ftp-data | Total | 39
app_layer.flow.dhcp | Total | 85
app_layer.tx.dhcp | Total | 1093
app_layer.flow.snmp | Total | 6326
app_layer.tx.snmp | Total | 61397
app_layer.flow.failed_tcp | Total | 66123
app_layer.flow.dcerpc_udp | Total | 541
app_layer.flow.dns_udp | Total | 197116
app_layer.tx.dns_udp | Total | 694403
app_layer.flow.failed_udp | Total | 150466
flow_mgr.closed_pruned | Total | 430964
flow_mgr.new_pruned | Total | 497324
flow_mgr.est_pruned | Total | 76505
flow.spare | Total | 10124
flow.tcp_reuse | Total | 342
flow_mgr.flows_checked | Total | 5341
flow_mgr.flows_notimeout | Total | 4396
flow_mgr.flows_timeout | Total | 945
flow_mgr.flows_timeout_inuse | Total | 127
flow_mgr.flows_removed | Total | 818
flow_mgr.rows_checked | Total | 65536
flow_mgr.rows_skipped | Total | 62251
flow_mgr.rows_empty | Total | 321
flow_mgr.rows_maxlen | Total | 7
tcp.memuse | Total | 78000320
tcp.reassembly_memuse | Total | 219583312
http.memuse | Total | 98181414
ftp.memuse | Total | 98949
app_layer.expectations | Total | 8
flow.memuse | Total | 26432376

PS:

root@SELKS:/var/log/suricata# /usr/local/sbin/ethtool --show-rxfh eno4
RX flow hash indirection table for eno4 with 24 RX ring(s):
0: 0 1 2 3 4 5 6 7
8: 8 9 10 11 12 13 14 15
16: 16 17 18 19 20 21 22 23
24: 0 1 2 3 4 5 6 7
32: 8 9 10 11 12 13 14 15
40: 16 17 18 19 20 21 22 23
48: 0 1 2 3 4 5 6 7
56: 8 9 10 11 12 13 14 15
64: 16 17 18 19 20 21 22 23
72: 0 1 2 3 4 5 6 7
80: 8 9 10 11 12 13 14 15
88: 16 17 18 19 20 21 22 23
96: 0 1 2 3 4 5 6 7
104: 8 9 10 11 12 13 14 15
112: 16 17 18 19 20 21 22 23
120: 0 1 2 3 4 5 6 7
128: 8 9 10 11 12 13 14 15
136: 16 17 18 19 20 21 22 23
144: 0 1 2 3 4 5 6 7
152: 8 9 10 11 12 13 14 15
160: 16 17 18 19 20 21 22 23
168: 0 1 2 3 4 5 6 7
176: 8 9 10 11 12 13 14 15
184: 16 17 18 19 20 21 22 23
192: 0 1 2 3 4 5 6 7
200: 8 9 10 11 12 13 14 15
208: 16 17 18 19 20 21 22 23
216: 0 1 2 3 4 5 6 7
224: 8 9 10 11 12 13 14 15
232: 16 17 18 19 20 21 22 23
240: 0 1 2 3 4 5 6 7
248: 8 9 10 11 12 13 14 15
256: 16 17 18 19 20 21 22 23
264: 0 1 2 3 4 5 6 7
272: 8 9 10 11 12 13 14 15
280: 16 17 18 19 20 21 22 23
288: 0 1 2 3 4 5 6 7
296: 8 9 10 11 12 13 14 15
304: 16 17 18 19 20 21 22 23
312: 0 1 2 3 4 5 6 7
320: 8 9 10 11 12 13 14 15
328: 16 17 18 19 20 21 22 23
336: 0 1 2 3 4 5 6 7
344: 8 9 10 11 12 13 14 15
352: 16 17 18 19 20 21 22 23
360: 0 1 2 3 4 5 6 7
368: 8 9 10 11 12 13 14 15
376: 16 17 18 19 20 21 22 23
384: 0 1 2 3 4 5 6 7
392: 8 9 10 11 12 13 14 15
400: 16 17 18 19 20 21 22 23
408: 0 1 2 3 4 5 6 7
416: 8 9 10 11 12 13 14 15
424: 16 17 18 19 20 21 22 23
432: 0 1 2 3 4 5 6 7
440: 8 9 10 11 12 13 14 15
448: 16 17 18 19 20 21 22 23
456: 0 1 2 3 4 5 6 7
464: 8 9 10 11 12 13 14 15
472: 16 17 18 19 20 21 22 23
480: 0 1 2 3 4 5 6 7
488: 8 9 10 11 12 13 14 15
496: 16 17 18 19 20 21 22 23
504: 0 1 2 3 4 5 6 7
RSS hash key:
6d:5a:6d:5a:6d:5a:6d:5a:6d:5a:6d:5a:6d:5a:6d:5a:6d:5a:6d:5a:6d:5a:6d:5a:6d:5a:6d:5a:6d:5a:6d:5a:6d:5a:6d:5a:6d:5a:6d:5a:6d:5a:6d:5a:6d:5a:6d:5a:6d:5a:6d:5a
RSS hash function:
toeplitz: on
xor: off

root@SELKS:/var/log/suricata# lscpu
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
CPU(s): 48
On-line CPU(s) list: 0-47
Thread(s) per core: 2
Core(s) per socket: 12
Socket(s): 2
NUMA node(s): 2
Vendor ID: GenuineIntel
CPU family: 6
Model: 85
Model name: Intel® Xeon® Gold 5118 CPU @ 2.30GHz
Stepping: 4
CPU MHz: 2300.000
BogoMIPS: 4600.00
Virtualization: VT-x
L1d cache: 32K
L1i cache: 32K
L2 cache: 1024K
L3 cache: 16896K
NUMA node0 CPU(s): 0-11,24-35
NUMA node1 CPU(s): 12-23,36-47
Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc art arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid dca sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm 3dnowprefetch epb invpcid_single ssbd ibrs ibpb stibp kaiser tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust bmi1 hle avx2 smep bmi2 erms invpcid rtm cqm mpx avx512f avx512dq rdseed adx smap clflushopt clwb intel_pt avx512cd avx512bw avx512vl xsaveopt xsavec xgetbv1 xsaves cqm_llc cqm_occup_llc cqm_mbm_total cqm_mbm_local dtherm ida arat pln pts hwp_epp pku ospke md_clear flush_l1d

You might want to look deeper into the traffic and run tools like iftop or perf top to see if there is specific traffic (for example elephant flows) that causes those drops.
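For example, something along these lines (a rough sketch; iftop and perf are separate packages, and the CPU list should match wherever the eno4 worker threads actually run):

# Top talkers on the capture interface, sorted by the 40-second average (look for elephant flows).
iftop -i eno4 -nNP -o 40s

# Hottest kernel/user functions on the CPUs handling the capture threads.
perf top -C 0-11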