Suricata high capture.kernel_drops count

Souji_T · August 4, 2020, 7:08am

Hello,

I am new to Suricata and want to set it up as an IDS.
However, I am getting quite a lot of capture.kernel_drops and I don’t know what to do about it.

capture.kernel_packets                       | Total                     | 689554867
capture.kernel_drops                         | Total                     | 5910262

As for my system, I am running Suricata 5.0.3 on CentOS 8.2 with 70 GiB of RAM and 16 cores.

I changed the following settings in the suricata.yml file:

threads: 10
cluster-type: cluster_qm
use-mmap: yes
ring-size: 300000
encryption-handling: bypass
host-mode: sniffer-only
max-pending-packets: 2048
runmode: autofp
default-packet-size: 3028

I also increased the memcaps for defrag, flow, stream (reassembly memcap: 10 GB), and host to 2 GB.

I also tried to adjust the cpu-affinity settings, which currently look like this:

threading:
  set-cpu-affinity: yes
  cpu-affinity:
    - management-cpu-set:
        cpu: [ "1","2","3" ]  # include only these CPUs in affinity settings
    - receive-cpu-set:
        cpu: [ "0" ]  # include only these CPUs in affinity settings
    - worker-cpu-set:
        cpu: [ "4","5","6","7","8","9","10","11","12","13" ]
        mode: "exclusive"
        prio:
          low: [ 0 ]
          medium: [ "1" ]
          high: [ "4-13" ]
          default: "high"

Can someone help me? I have no clue what else I could try to solve my problem.

Andreas_Herz · August 4, 2020, 7:19am

Hi,

what hardware is that and especially what NIC?
Depending on the NIC there are some additional steps to do.

Also, the option is spelled encryption-handling:, while you have a typo: encryption-handeling:.

Another suggestion is to use runmode workers instead of autofp, especially if you use the cluster_qm cluster type, and to set the CPU affinity to match.

Even more important would be to know the type of traffic and the rate you see.
Also, how many rules are active?

But the drop rate is under 1%, so it could be a spike or an elephant flow.

Souji_T · August 4, 2020, 7:47am

The NIC is an Intel Corporation 82599ES 10-Gigabit SFI/SFP+ Network Card

I corrected the typo in my question^^

I had the workers runmode at first and changed it to autofp yesterday to see if it might help… I will change it back.

At the moment, I am only using the ET ruleset, with about 19132 rules loaded.
There are all different kinds of traffic. We deployed the IDS on the “main” VLAN, which contains all the user PCs. It is mostly DNS, HTTP, and SMB traffic.

vjulien · August 4, 2020, 6:48pm

Can you share a full stats.log record from when it drops traffic?

Souji_T · August 5, 2020, 6:56am

Yesterday I changed my setting back from autofp to workers and let Suricata run for about 23 hours. What I noticed is a much higher drop rate of about 5%; at one point, after roughly 4 hours, it was even at 15%.

It seems to me that autofp gives much better performance for my setup. Therefore, I might go back to autofp and change my cluster-type and related settings to work better with it.

Here is the latest entry of my stats.log:

------------------------------------------------------------------------------------
Date: 8/5/2020 -- 08:40:38 (uptime: 0d, 23h 07m 23s)
------------------------------------------------------------------------------------
Counter                                       | TM Name                   | Value
------------------------------------------------------------------------------------
capture.kernel_packets                        | Total                     | 1212461422
capture.kernel_drops                          | Total                     | 60955119
decoder.pkts                                  | Total                     | 1151505376
decoder.bytes                                 | Total                     | 1168493087321
decoder.ipv4                                  | Total                     | 1148304120
decoder.ipv6                                  | Total                     | 376699
decoder.ethernet                              | Total                     | 1151505376
decoder.tcp                                   | Total                     | 1104398468
decoder.udp                                   | Total                     | 42814320
decoder.icmpv4                                | Total                     | 1169220
decoder.icmpv6                                | Total                     | 4834
decoder.vlan                                  | Total                     | 697926590
decoder.vxlan                                 | Total                     | 2
decoder.avg_pkt_size                          | Total                     | 1014
decoder.max_pkt_size                          | Total                     | 1514
flow.tcp                                      | Total                     | 2517348
flow.udp                                      | Total                     | 3989883
flow.icmpv4                                   | Total                     | 2732
flow.icmpv6                                   | Total                     | 1306
defrag.ipv4.fragments                         | Total                     | 293834
defrag.ipv4.reassembled                       | Total                     | 145886
decoder.event.ipv4.opt_pad_required           | Total                     | 141
decoder.event.ipv6.zero_len_padn              | Total                     | 790
tcp.sessions                                  | Total                     | 1605811
tcp.pseudo                                    | Total                     | 1153004
tcp.syn                                       | Total                     | 3070944
tcp.synack                                    | Total                     | 1237993
tcp.rst                                       | Total                     | 2163387
tcp.pkt_on_wrong_thread                       | Total                     | 8025758
tcp.stream_depth_reached                      | Total                     | 904
tcp.reassembly_gap                            | Total                     | 8176
tcp.overlap                                   | Total                     | 13812
tcp.insert_list_fail                          | Total                     | 814
detect.alert                                  | Total                     | 2306475
app_layer.flow.http                           | Total                     | 346641
app_layer.tx.http                             | Total                     | 398476
app_layer.flow.smtp                           | Total                     | 422
app_layer.tx.smtp                             | Total                     | 851
app_layer.flow.tls                            | Total                     | 185482
app_layer.flow.ssh                            | Total                     | 47
app_layer.flow.smb                            | Total                     | 8072
app_layer.tx.smb                              | Total                     | 292243
app_layer.flow.dcerpc_tcp                     | Total                     | 11207
app_layer.flow.dns_tcp                        | Total                     | 10806
app_layer.tx.dns_tcp                          | Total                     | 41675
app_layer.flow.nfs_tcp                        | Total                     | 1
app_layer.tx.nfs_tcp                          | Total                     | 422
app_layer.flow.ntp                            | Total                     | 20748
app_layer.tx.ntp                              | Total                     | 35747
app_layer.flow.tftp                           | Total                     | 2
app_layer.tx.tftp                             | Total                     | 4
app_layer.flow.krb5_tcp                       | Total                     | 21123
app_layer.tx.krb5_tcp                         | Total                     | 21094
app_layer.flow.dhcp                           | Total                     | 529
app_layer.tx.dhcp                             | Total                     | 4901
app_layer.flow.snmp                           | Total                     | 231646
app_layer.tx.snmp                             | Total                     | 4952265
app_layer.flow.failed_tcp                     | Total                     | 56782
app_layer.flow.dcerpc_udp                     | Total                     | 23
app_layer.flow.dns_udp                        | Total                     | 2695795
app_layer.tx.dns_udp                          | Total                     | 6106081
app_layer.flow.krb5_udp                       | Total                     | 593
app_layer.tx.krb5_udp                         | Total                     | 475
app_layer.flow.failed_udp                     | Total                     | 1040547
flow_mgr.closed_pruned                        | Total                     | 651371
flow_mgr.new_pruned                           | Total                     | 3255558
flow_mgr.est_pruned                           | Total                     | 2554899
flow.spare                                    | Total                     | 1048658
flow.tcp_reuse                                | Total                     | 31381
flow_mgr.flows_checked                        | Total                     | 381
flow_mgr.flows_notimeout                      | Total                     | 301
flow_mgr.flows_timeout                        | Total                     | 80
flow_mgr.flows_timeout_inuse                  | Total                     | 1
flow_mgr.flows_removed                        | Total                     | 79
flow_mgr.rows_checked                         | Total                     | 1048576
flow_mgr.rows_skipped                         | Total                     | 1048154
flow_mgr.rows_empty                           | Total                     | 55
flow_mgr.rows_maxlen                          | Total                     | 3
tcp.memuse                                    | Total                     | 5734400
tcp.reassembly_memuse                         | Total                     | 983040
flow.memuse                                   | Total                     | 427304032
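
For anyone who wants to track the drop ratio over time instead of reading the counters by eye, here is a minimal sketch that pulls the two capture counters out of a stats.log and prints the percentage. It assumes the pipe-separated layout shown above with only "Total" rows; the function name stats_drop_pct and the log path in the usage comment are made up for illustration.

```shell
#!/bin/sh
# Print capture.kernel_drops as a percentage of capture.kernel_packets,
# using the last record found in a Suricata stats.log.
# Assumes the "Counter | TM Name | Value" layout shown above.
stats_drop_pct() {
    awk -F'|' '
        # Strip the padding around the counter name and the value.
        { gsub(/[ \t]/, "", $1); gsub(/[ \t]/, "", $3) }
        $1 == "capture.kernel_packets" { pkts = $3 }
        $1 == "capture.kernel_drops"   { drops = $3 }
        END {
            if (pkts + 0 > 0) printf "%.2f\n", 100 * drops / pkts
            else print "n/a"
        }
    ' "$1"
}

# Example (the path is an assumption; adjust to your install):
# stats_drop_pct /var/log/suricata/stats.log
```

For the record quoted above (60955119 drops out of 1212461422 packets) this works out to roughly 5%, which matches the figure mentioned in the post.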

Andreas_Herz · August 11, 2020, 7:36pm

What traffic rate do you see, roughly, in Gbit/s?
Could you share the two config files you use: one for the autofp case and one for the workers runmode case?

Souji_T · August 12, 2020, 6:42am

I only see an average of about 380 Mbit/s; this goes up to multiple Gbit/s when our backup is running.
The one I used with the workers runmode: suricata_worker.yaml (68.9 KB)
And the one I am using currently with the autofp runmode: suricata_autofp.yaml (68.9 KB)

I also noticed that I am dropping some packets at my network interface; this could be the problem, if I am not mistaken. As I understand it, if a flow has started and one of its packets is dropped at the interface, the kernel would then drop all packets of that flow, both those before and those after the dropped one. Correct me if I am wrong, though.

enp5s0f1: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        ether 00:1b:21:a0:75:d5  txqueuelen 1000  (Ethernet)
        RX packets 20093747692  bytes 20704977495284 (18.8 TiB)
        RX errors 0  dropped 12329481  overruns 0  frame 0
        TX packets 0  bytes 0 (0.0 B)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

Andreas_Herz · August 12, 2020, 9:43am

Can you check whether the drops increase during the timeframes when the backups are running? Elephant flows are a typical case where drops occur.

You could try cluster_flow in the workers mode.

Those NIC drops can also be caused by elephant flows. Your NIC uses the ixgbe driver, right?
If you want to use cluster_qm, you should also make sure to enable symmetric hashing and some other optimizations. If you have reasonably recent ethtool and driver versions, try this:

ethtool -L enp5s0f1 combined 10
 ethtool -X enp5s0f1 hkey 6D:5A:6D:5A:6D:5A:6D:5A:6D:5A:6D:5A:6D:5A:6D:5A:6D:5A:6D:5A:6D:5A:6D:5A:6D:5A:6D:5A:6D:5A:6D:5A:6D:5A:6D:5A:6D:5A:6D:5A equal 10
ethtool -K enp5s0f1 rxhash on
ethtool -K enp5s0f1 ntuple on
for i in rx tx tso gso gro lro tx-nocache-copy sg txvlan rxvlan; do ethtool -K enp5s0f1 $i off; done
for proto in tcp4 udp4 tcp6 udp6; do ethtool -N enp5s0f1 rx-flow-hash $proto sdfn; done
ethtool -C enp5s0f1 adaptive-rx off rx-usecs 62
ethtool -G enp5s0f1 rx 1024
/usr/local/bin/set_irq_affinity 4-13 enp5s0f1

Some suggest using sd instead of sdfn, and the other values are also worth experimenting with.
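
A side note on the hkey value: it appears to be the byte pair 6D:5A repeated to fill a 40-byte RSS key. A key with a repeating 16-bit pattern like this is what makes the hash symmetric, so both directions of a flow should land on the same queue. Since typing it by hand is error-prone, here is a small sketch that generates it. The function name symmetric_rss_key is made up for illustration, and the 40-byte default is an assumption based on the key quoted above.

```shell
#!/bin/sh
# Build an RSS hash key by repeating the byte pair 6D:5A.
# Argument: key length in bytes (even number); defaults to 40,
# which is what the key quoted above appears to contain.
symmetric_rss_key() {
    pairs=$(( ${1:-40} / 2 ))
    key=""
    i=0
    while [ "$i" -lt "$pairs" ]; do
        # Add a ":" separator before every pair except the first.
        key="${key}${key:+:}6D:5A"
        i=$((i + 1))
    done
    printf '%s\n' "$key"
}

# Example (interface name and queue count taken from the post above):
# ethtool -X enp5s0f1 hkey "$(symmetric_rss_key 40)" equal 10
```

Running ethtool -x enp5s0f1 afterwards should show the key and indirection table that are actually in effect.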

Souji_T · August 13, 2020, 7:44am

OK, I changed back to the workers runmode and set cluster-type to cluster_flow.

In addition, today I will monitor the drops and try to figure out whether they mainly happen while backups are running.
Moreover, I will look into the network card configuration and see if I can tweak some options there.
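
One possible way to do that monitoring is to sample the interface drop counter at a fixed interval and log the delta with a timestamp, so spikes can be lined up with the backup schedule. This is only a sketch: it reads the receive "drop" column from /proc/net/dev, which is assumed to be the same counter ifconfig reports as "dropped", and the function names are made up for illustration.

```shell
#!/bin/sh
# Print the cumulative RX drop counter for an interface.
# $1 = interface name, $2 = stats file (defaults to /proc/net/dev;
# the override exists only to make the function easy to test).
rx_drop_count() {
    awk -v ifc="$1" '
        { sub(/:/, " ") }                  # separate "iface:" from the numbers
        $1 == ifc { print $5; found = 1 }  # fields: name bytes packets errs drop
        END { exit !found }
    ' "${2:-/proc/net/dev}"
}

# Print one timestamped line per interval with the number of newly
# dropped packets. $1 = interface, $2 = interval in seconds (default 60).
watch_rx_drops() {
    iface=$1
    interval=${2:-60}
    prev=$(rx_drop_count "$iface") || return 1
    while sleep "$interval"; do
        cur=$(rx_drop_count "$iface") || return 1
        printf '%s %s\n' "$(date '+%Y-%m-%d %H:%M:%S')" "$((cur - prev))"
        prev=$cur
    done
}

# Example: watch_rx_drops enp5s0f1 60 >> rx_drops.log
```

The same loop could be pointed at capture.kernel_drops in stats.log instead, if the Suricata-side counter is more interesting than the NIC-side one.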

Souji_T · August 14, 2020, 6:50am

OK. Yesterday I did not have that many dropped packets, so I am still below one percent.
Although I had some drops during backup times, most of the drops occurred around noon.
If my drops do not go through the roof and stay somewhere below 1%, it might be OK, I think.

Can I change the ring-size parameter in the suricata.yml file to reduce drops? As I understand it, decreasing the value might reduce them. Right now it is set to 300000, and I would try changing it to 150000 or 200000 to see the effect.

pevma · August 14, 2020, 7:01am

I would not recommend increasing the ring-size; I think at this point the problem might be elsewhere.
I looked at the provided workers yaml. It does not seem that you are using AFPv3 (I might have missed that in the conversation, sorry if I did).

I suggest you try uncommenting the following in the af-packet section of the yaml and give it a run:

    mmap-locked: yes
    # Use tpacket_v3 capture mode, only active if use-mmap is true
    # Don't use it in IPS or TAP mode as it causes severe latency
    tpacket-v3: yes
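
To double-check which of these af-packet options are really active (that is, not commented out) in a given config file, a plain text filter along these lines may help. It is a sketch, not a YAML parser: it will list matches from every interface section, and the function name and example path are made up for illustration.

```shell
#!/bin/sh
# List af-packet capture options that are active (not commented out)
# in a Suricata YAML config. Plain text match, not a YAML parser.
active_afp_options() {
    grep -E '^[[:space:]]*(use-mmap|mmap-locked|tpacket-v3|ring-size|block-size):' "$1" \
        | sed -e 's/^[[:space:]]*//' -e 's/[[:space:]]*#.*$//'
}

# Example (the path is an assumption; adjust to your install):
# active_afp_options /etc/suricata/suricata.yaml
```

If tpacket-v3 or use-mmap does not show up in the output, the line is still commented out or missing.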

Souji_T · August 14, 2020, 7:22am

I would rather decrease than increase the ring-size. If I am not mistaken, the original value is around 2000 or so, and I changed it for whatever reason to 300000.

Nevertheless, I will try your suggestion.

Regarding AFPv3, do you mean AF_PACKET version 3? I am new to this kind of stuff, so I do not really know…

pevma · August 14, 2020, 10:00am

OK, I thought it was already at 300k. If it is just 2000, it is a good idea to increase it, yes.

Souji_T · August 18, 2020, 7:10am

So it seems like that did the trick. The drops are much lower now. Thanks for the tip.

Now I have the problem that I can’t start Suricata with systemd…
To work with the changed config, I had to change the amount of memory a process can lock, because by default it appears to be limited to 64 KB, as described in this post.

I changed the limits for the root user and the suricata user to unlimited. However, when I try to start Suricata with systemd, it fails with the following error:
[ERRCODE: SC_ERR_MEM_ALLOC(1)] - Unable to mmap, error Resource temporarily unavailable

I can start Suricata as a daemon from the CLI with the -D switch, so it is not that urgent, but if you have had a similar issue at some point or have an idea how to fix it, I would be glad to hear it^^

pevma · August 19, 2020, 6:07am

Yes, forgot to mention, sorry: for AFPv3, it is this switch that needs to be uncommented:

github.com/OISF/suricata/blob/master/suricata.yaml.in#L607

# In some fragmentation cases, the hash can not be computed. If "defrag" is set
# to yes, the kernel will do the needed defragmentation before sending the packets.
defrag: yes
# To use the ring feature of AF_PACKET, set 'use-mmap' to yes
#use-mmap: yes
# Lock memory map to avoid it being swapped. Be careful that over
# subscribing could lock your system
#mmap-locked: yes
# Use tpacket_v3 capture mode, only active if use-mmap is true
# Don't use it in IPS or TAP mode as it causes severe latency
#tpacket-v3: yes
# Ring size will be computed with respect to "max-pending-packets" and number
# of threads. You can set manually the ring size in number of packets by setting
# the following value. If you are using flow "cluster-type" and have really network
# intensive single-flow you may want to set the "ring-size" independently of the number
# of threads:
#ring-size: 2048
# Block size is used by tpacket_v3 only. It should set to a value high enough to contain
# a decent number of packets. Size is in bytes so please consider your MTU. It should be
# a power of 2 and it must be multiple of page size (usually 4096).
#block-size: 32768

to make it explicitly use AFPv3.

Try commenting out the mmap lock:

github.com/OISF/suricata/blob/master/suricata.yaml.in#L604

# Recommended modes are cluster_flow on most boxes and cluster_cpu or cluster_qm on system
# with capture card using RSS (requires cpu affinity tuning and system IRQ tuning)
cluster-type: cluster_flow
# In some fragmentation cases, the hash can not be computed. If "defrag" is set
# to yes, the kernel will do the needed defragmentation before sending the packets.
defrag: yes
# To use the ring feature of AF_PACKET, set 'use-mmap' to yes
#use-mmap: yes
# Lock memory map to avoid it being swapped. Be careful that over
# subscribing could lock your system
#mmap-locked: yes
# Use tpacket_v3 capture mode, only active if use-mmap is true
# Don't use it in IPS or TAP mode as it causes severe latency
#tpacket-v3: yes
# Ring size will be computed with respect to "max-pending-packets" and number
# of threads. You can set manually the ring size in number of packets by setting
# the following value. If you are using flow "cluster-type" and have really network
# intensive single-flow you may want to set the "ring-size" independently of the number
# of threads:
#ring-size: 2048
# Block size is used by tpacket_v3 only. It should set to a value high enough to contain

and see if it makes any difference? (And make sure you are not running out of memory.)

Souji_T · August 19, 2020, 6:54am

As mentioned in my previous post, I already did that.
After two days I am still at about 0.0% dropped packets, meaning what you proposed had a huge positive effect.

Now the only problem is that after uncommenting the two options (tpacket-v3: yes and mmap-locked: yes), I cannot start Suricata with this command: systemctl start suricata.service.

This is the full error in the log file:

Aug 19 08:24:36 ebjen-ids suricata[379107]: 19/8/2020 -- 08:24:36 - <Notice> - all 10 packet processing threads, 4 management threads initialized, engine started.
Aug 19 08:24:36 ebjen-ids suricata[379107]: 19/8/2020 -- 08:24:36 - <Error> - [ERRCODE: SC_ERR_MEM_ALLOC(1)] - Unable to mmap, error Resource temporarily unavailable
Aug 19 08:24:36 ebjen-ids suricata[379107]: 19/8/2020 -- 08:24:36 - <Error> - [ERRCODE: SC_ERR_AFP_CREATE(190)] - Couldn't init AF_PACKET socket, fatal error
Aug 19 08:24:36 ebjen-ids suricata[379107]: 19/8/2020 -- 08:24:36 - <Error> - [ERRCODE: SC_ERR_FATAL(171)] - thread W#01-enp5s0f1 failed
Aug 19 08:24:36 ebjen-ids systemd[1]: suricata.service: Main process exited, code=exited, status=1/FAILURE
Aug 19 08:24:36 ebjen-ids systemd[1]: suricata.service: Failed with result 'exit-code'.

To me it looks like Suricata is unable to lock the memory map when started as a system service.

Nevertheless, it is not that big of a problem right now because, as pointed out in my previous post, I am able to run Suricata in daemon mode if I enter the command manually; it is just not as nice as running it as a service.
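
A possible explanation, offered as a guess rather than a confirmed diagnosis: per-user limits (for example in /etc/security/limits.conf) are applied at login via PAM, while a service started by systemd takes its limits from the unit configuration, where the locked-memory limit is set with LimitMEMLOCK=. A drop-in along the following lines might let mmap-locked work under systemd. The unit name suricata.service comes from the command above; the drop-in file name memlock.conf and the helper function are made up for illustration.

```shell
#!/bin/sh
# Write a systemd drop-in that lifts the locked-memory limit for a unit.
# $1 = drop-in directory (defaults to the one for suricata.service).
write_memlock_dropin() {
    dir=${1:-/etc/systemd/system/suricata.service.d}
    mkdir -p "$dir" || return 1
    cat > "$dir/memlock.conf" <<'EOF'
[Service]
LimitMEMLOCK=infinity
EOF
}

# As root, write the drop-in, then reload systemd and restart the service:
#   write_memlock_dropin
#   systemctl daemon-reload
#   systemctl restart suricata.service
# Check the limit systemd will apply to the unit:
#   systemctl show suricata.service -p LimitMEMLOCK
```

If the limit shown is still 64 KB after the reload, the drop-in is probably not being picked up, and the mmap error would be expected to persist.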

pevma · August 19, 2020, 7:20am

Yes, it seems so. Can you try commenting it out (“#mmap-locked: yes”)?

Souji_T · August 19, 2020, 7:38am

If I do this: #mmap-locked: yes
I can start it as a service without any errors.

Andreas_Herz · August 20, 2020, 9:45pm

What kernel is this? It’s a bit strange that it’s not working.

Souji_T · August 21, 2020, 6:13am

@pevma After almost 2 days of running Suricata with tpacket-v3 enabled and mmap-locked disabled, I can say that the packet drop rate is not as high as it was at the beginning of this thread, but also not as low as when mmap-locked was enabled.

@Andreas_Herz Currently I use this kernel: 4.18.0-193.6.3.el8_2.x86_64
There is a minor update to the 4.18.0-193.14.2.el8_2 kernel available.

Edit:
I saw I never mentioned my Suricata version… I have the 5.0.3 release installed.