Suricata 6 --> High CPU usage while mem is good

Hi, I have some initial questions.
How much traffic are you processing?
Does the traffic peak around the time Suricata gets killed?
Out of curiosity, what kills Suricata due to CPU usage?

I would recommend having a look at section 9.5, "High Performance Configuration", of the Suricata 7.0.0-dev documentation, and searching the forum for other drop-related threads.

Thanks @syoc

  1. max 10GBps
  2. The kill happens at midnight US time, and we are a US-focused e-commerce site. Peaks are normally around 0900 and 1400 hours.
  3. This is what our HIDS (Wazuh) says. Didn’t find anything further:

2021-07-09T00:00:38.044763-04:00 suricata kernel: [212617.556063] Out of memory: Kill process 106269 (Suricata-Main) score 755 or sacrifice child

This sounds more like a memory issue than a CPU issue.
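To make that reading of the log concrete, here is a minimal sketch that pulls the PID, process name and OOM score out of the kernel line quoted above. The regular expression only covers this particular message wording, which differs between kernel versions, so treat it as an assumption rather than a general parser.

```python
import re

# Matches the OOM-killer wording seen in the log line quoted in this thread.
OOM_RE = re.compile(
    r"Out of memory: Kill process (?P<pid>\d+) \((?P<name>[^)]+)\) score (?P<score>\d+)"
)

def parse_oom_kill(line):
    """Return pid, process name and score from an OOM-killer line, or None."""
    match = OOM_RE.search(line)
    if match is None:
        return None
    return {
        "pid": int(match.group("pid")),
        "name": match.group("name"),
        "score": int(match.group("score")),
    }

line = (
    "2021-07-09T00:00:38.044763-04:00 suricata kernel: [212617.556063] "
    "Out of memory: Kill process 106269 (Suricata-Main) score 755 or sacrifice child"
)
print(parse_oom_kill(line))
```

A match on this pattern means the kernel killed the process for memory pressure, which is why the follow-up questions focus on memuse counters rather than CPU.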

  1. What Suricata version?
  2. Which Linux distribution?
  3. Which NIC?
  4. Did you try hyperscan instead of ac-ks?
  5. Can you provide stats.log?
  6. Ideally also metrics of the different memuse parts of the stats over time.

  1. Suricata 6.0.2.
  2. LSB Version: :core-4.1-amd64:core-4.1-noarch
    Distributor ID: CentOS
    Description: CentOS Linux release 7.9.2009 (Core)
    Release: 7.9.2009
    Codename: Core

  3. (screenshot, not preserved)
  4. Hyperscan is not supported on this system.

5 & 6:

https://drive.google.com/file/d/19GhAZTzn0u00vumzGmYHeFvMQF8G9OB3/view?usp=sharing

Hyperscan will give a huge lift in performance. This may not solve your immediate issue, but it will help.

Rule profiling increases the CPU load – not significantly, but there is an impact.

AF-PACKET is configured to use 12 threads

You’re using workers mode, so you don’t need to configure receive-cpu-set.
worker-cpu-set has more cores allocated than there are threads in the af-packet configuration.
There are only 2 management threads (one flow manager and one flow recycler), yet 9 cores are reserved for the management-cpu-set.
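The mismatches listed above can be checked mechanically. The sketch below is only an illustration: the function names are made up, and the worker CPU range "10-25" and management range "1-9" are hypothetical stand-ins, since the actual values live in the suricata.yaml screenshot. Only the 12 af-packet threads, the 2 management threads and the 9 reserved management cores come from this thread.

```python
def expand_cpu_set(spec):
    """Expand a CPU list such as "0-3,8" into a sorted list of CPU ids."""
    cpus = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            low, high = part.split("-", 1)
            cpus.update(range(int(low), int(high) + 1))
        else:
            cpus.add(int(part))
    return sorted(cpus)

def check_affinity(worker_threads, worker_cpus, management_cpus, management_threads=2):
    """Return warnings for thread/core count mismatches and overlapping sets."""
    warnings = []
    workers = expand_cpu_set(worker_cpus)
    management = expand_cpu_set(management_cpus)
    if len(workers) != worker_threads:
        warnings.append(
            f"worker-cpu-set has {len(workers)} cores for {worker_threads} af-packet threads"
        )
    if len(management) > management_threads:
        warnings.append(
            f"management-cpu-set reserves {len(management)} cores "
            f"for {management_threads} management threads"
        )
    overlap = sorted(set(workers) & set(management))
    if overlap:
        warnings.append(f"CPUs shared by worker and management sets: {overlap}")
    return warnings

# Hypothetical CPU ranges; 12 threads and 9 management cores are from the thread.
for warning in check_affinity(12, "10-25", "1-9"):
    print(warning)
```

An empty result means the thread count and the core allocation agree for the values you passed in; it says nothing about whether those cores sit on the right NUMA node.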

It would be better if you upload those files here, so everyone can help. Feel free to remove sensitive parts.

I would also encourage you to run a test with hyperscan; it looks like a fairly recent CentOS, so it should be doable.
Especially from the perf top output, it seems to be an issue with the mpm ac-ks.

Thank you all for your feedback. This was helpful to a certain extent. Based on the "Suricata 6 with Hyperscan on CentOS 7" instructions I have been able to install Hyperscan.

Memory starts low, but CPU is at 100% and climbs to 1000%+!

suricata.yaml (72.5 KB)
stats.log (226.9 KB)

Current suricata yaml and stats.log from when I turn it on till I kill the process.

screenshots from perf and htop:


What does the bpf filter look like? Can you try a run without it?
Can you also post suricata --build-info?
This mcount_internal at the top looks odd.

bpf filter =


UPDATE: Noticed an error here (duplicate IP), fixed that but issue remains.

Suricata --build-info result:

This is Suricata version 6.0.3 RELEASE
Features: PCAP_SET_BUFF AF_PACKET HAVE_PACKET_FANOUT LIBCAP_NG LIBNET1.1 HAVE_HTP_URI_NORMALIZE_HOOK PCRE_JIT HAVE_NSS HAVE_LUA HAVE_LIBJANSSON TLS TLS_GNU MAGIC RUST
SIMD support: SSE_4_2 SSE_4_1 SSE_3
Atomic intrinsics: 1 2 4 8 16 byte(s)
64-bits, Little-endian architecture
GCC version 4.8.5 20150623 (Red Hat 4.8.5-44), C version 199901
compiled with -fstack-protector
compiled with _FORTIFY_SOURCE=2
L1 cache line size (CLS)=64
thread local storage method: __thread
compiled with LibHTP v0.5.38, linked against LibHTP v0.5.38

Suricata Configuration:
  AF_PACKET support:                       yes
  eBPF support:                            no
  XDP support:                             no
  PF_RING support:                         no
  NFQueue support:                         no
  NFLOG support:                           no
  IPFW support:                            no
  Netmap support:                          no
  DAG enabled:                             no
  Napatech enabled:                        no
  WinDivert enabled:                       no

  Unix socket enabled:                     yes
  Detection enabled:                       yes

  Libmagic support:                        yes
  libnss support:                          yes
  libnspr support:                         yes
  libjansson support:                      yes
  hiredis support:                         no
  hiredis async with libevent:             no
  Prelude support:                         no
  PCRE jit:                                yes
  LUA support:                             yes
  libluajit:                               no
  GeoIP2 support:                          yes
  Non-bundled htp:                         no
  Hyperscan support:                       yes
  Libnet support:                          yes
  liblz4 support:                          yes
  HTTP2 decompression:                     no

  Rust support:                            yes
  Rust strict mode:                        no
  Rust compiler path:                      /usr/bin/rustc
  Rust compiler version:                   rustc 1.53.0 (Red Hat 1.53.0-1.el7)
  Cargo path:                              /usr/bin/cargo
  Cargo version:                           cargo 1.53.0
  Cargo vendor:                            yes

  Python support:                          yes
  Python path:                             /usr/bin/python2.7
  Python distutils                         yes
  Python yaml                              yes
  Install suricatactl:                     yes
  Install suricatasc:                      yes
  Install suricata-update:                 yes

  Profiling enabled:                       no
  Profiling locks enabled:                 no

  Plugin support (experimental):           yes

Development settings:
  Coccinelle / spatch:                     no
  Unit tests enabled:                      no
  Debug output enabled:                    no
  Debug validation enabled:                no

Generic build parameters:
  Installation prefix:                     /usr
  Configuration directory:                 /etc/suricata/
  Log directory:                           /var/log/suricata/

  --prefix                                 /usr
  --sysconfdir                             /etc
  --localstatedir                          /var
  --datarootdir                            /usr/share

  Host:                                    x86_64-pc-linux-gnu
  Compiler:                                gcc (exec name) / g++ (real)
  GCC Protect enabled:                     yes
  GCC march native enabled:                yes
  GCC Profile enabled:                     yes
  Position Independent Executable enabled: no
  CFLAGS                                   -g -O2 -std=gnu99 -pg -march=native -I${srcdir}/../rust/gen -I${srcdir}/../rust/dist
  PCAP_CFLAGS
  SECCFLAGS                                -fstack-protector -D_FORTIFY_SOURCE=2 -Wformat -Wformat-security

Adding a deep dive into the 80% from previous screenshot (perf):

Thanks again for looking @Andreas_Herz

Suricata is multi-threaded, so the CPU% can show over 100%: the value is the combined total of all threads’ CPU%.

That I understood :slight_smile:

However, in this case the 100% CPU is associated with the management threads.

Keep in mind that htop by default numbers CPUs starting at 1, while the system and Suricata number them starting at 0. So I would argue that the worker threads are the busy ones.
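The off-by-one is easy to get wrong when reading screenshots, so here is a small sketch of the mapping. The CPU layout in `cpu_sets` is hypothetical and only serves to show how an htop label shifts from one affinity set to another once it is converted.

```python
def htop_to_kernel_cpu(htop_label):
    """Convert a 1-based htop CPU label to the 0-based id used by the kernel and Suricata."""
    if htop_label < 1:
        raise ValueError("htop labels start at 1 in its default configuration")
    return htop_label - 1

def role_of(htop_label, cpu_sets):
    """Name the affinity set (if any) that contains the CPU behind an htop label."""
    cpu = htop_to_kernel_cpu(htop_label)
    for role, cpus in cpu_sets.items():
        if cpu in cpus:
            return role
    return "unassigned"

# Hypothetical layout: management on CPUs 0-1, workers on CPUs 2-13.
cpu_sets = {"management": range(0, 2), "worker": range(2, 14)}
print(role_of(3, cpu_sets))  # htop label 3 is kernel CPU 2
```

Read without the conversion, label 2 in this layout would look like a worker core, although it is kernel CPU 1 and belongs to the management set.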

Can you enable “show custom thread names” in htop and post the output?

AHA! I didn’t know that!

In my world counting starts at “0”

Still battling back and forth (too many hours spent :slight_smile: )

The current situation is that the eth interfaces are attached to NUMA node 1 and NOT NUMA node 0, while all along the CPU affinity was based on NUMA node 0.
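A mismatch like this can be spotted from sysfs. The following is a minimal sketch, assuming a Linux host where the NIC is a PCI device that exposes `device/numa_node` and where `/sys/devices/system/node/nodeN/cpulist` is present. The function names are illustrative, and `sysfs_root` is a parameter only so the logic can be tried against a copy of the tree.

```python
from pathlib import Path

def parse_cpulist(text):
    """Expand a kernel cpulist such as "0-5,12-17" into a sorted list of CPU ids."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            low, high = part.split("-", 1)
            cpus.update(range(int(low), int(high) + 1))
        else:
            cpus.add(int(part))
    return sorted(cpus)

def nic_numa_node(iface, sysfs_root="/sys"):
    """Return the NUMA node the NIC is attached to (-1 if the kernel reports none)."""
    path = Path(sysfs_root) / "class" / "net" / iface / "device" / "numa_node"
    return int(path.read_text().strip())

def node_cpus(node, sysfs_root="/sys"):
    """Return the CPU ids that belong to a NUMA node."""
    path = Path(sysfs_root) / "devices" / "system" / "node" / f"node{node}" / "cpulist"
    return parse_cpulist(path.read_text())

def cpus_off_node(iface, pinned_cpus, sysfs_root="/sys"):
    """Return the pinned CPUs that are NOT on the NIC's NUMA node."""
    node = nic_numa_node(iface, sysfs_root)
    if node < 0:
        return []  # no NUMA information available for this device
    local = set(node_cpus(node, sysfs_root))
    return sorted(cpu for cpu in pinned_cpus if cpu not in local)
```

Calling `cpus_off_node("eth0", range(0, 12))` (interface name and CPU range are placeholders) lists the worker CPUs that sit on a different node than the NIC; an empty list means the pinning and the NIC agree.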

This is a screenshot of: perf -C 3 -g -K:


Focusing on the mcount

Did you run perf top against the Suricata PID (pidof suricata)?
I’ve never seen mcount symbols that high anywhere :confused:

You could try cluster_qm as described in section 9.5, "High Performance Configuration", of the Suricata 6.0.3 documentation. Just make sure that every CPU-related affinity setting is correct and on the same NUMA node.

perf top result: (screenshot, not preserved)

Could it be that libc-2.17 has issues with pinning the CPUs?

Will work on the cluster_qm tomorrow.

Decided to start from scratch and reinstall Suricata.
When running, it stopped recognizing Hyperscan, which I thought was odd.
Ran a test of my Hyperscan installation, which came back without errors (based on the hyperscan.io documentation).

Reinstalled Suricata again but added:

--with-libhs-includes=/usr/local/include/hs/ --with-libhs-libraries=/usr/local/lib
to:
sudo ./configure --prefix=/usr --sysconfdir=/etc --localstatedir=/var --enable-nfqueue --enable-lua

then did: sudo make && make install-full
AND sudo ldconfig.

Fingers crossed, but for now it seems to be working.

What needs to be done with Hyperscan (besides running Suricata without rules to see the difference) is to reinstall Suricata so that it is properly linked against the Hyperscan library.

Thanks all for the insights and assistance. After 24 hours of running, the share of dropped packets is 0.3%.
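For anyone who wants to track the same figure, here is a rough sketch that derives a drop percentage from stats.log. It assumes the usual `Counter | TM Name | Value` layout and the af-packet counters `capture.kernel_packets` and `capture.kernel_drops`; the sample numbers are made up, chosen only so the result matches the 0.3% mentioned above.

```python
def drop_percentage(stats_text):
    """Return kernel drops as a percentage of kernel packets, using the last dump in the text."""
    latest = {}
    for line in stats_text.splitlines():
        parts = [part.strip() for part in line.split("|")]
        if len(parts) != 3 or not parts[2].isdigit():
            continue  # skip separators, date lines and the column header
        name, tm_name, value = parts
        if tm_name == "Total":
            latest[name] = int(value)  # later dumps overwrite earlier ones
    packets = latest.get("capture.kernel_packets", 0)
    drops = latest.get("capture.kernel_drops", 0)
    return 100.0 * drops / packets if packets else 0.0

# Made-up sample in the stats.log layout, not data from this thread.
sample = """\
Counter                                       | TM Name                   | Value
------------------------------------------------------------------------------------
capture.kernel_packets                        | Total                     | 1000000
capture.kernel_drops                          | Total                     | 3000
"""
print(f"{drop_percentage(sample):.1f}%")
```

The counters are cumulative since start-up, so comparing two dumps taken some time apart gives a better picture of the current drop rate than the lifetime total.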

CPU usage is stable.

Glad it worked out, was a really strange case :slight_smile:

I saw your perf screenshot, and that high mcount figure was something that was bugging my environment as well.

I resolved that by running ./configure without the --enable-gccprofile switch. Not sure when that snuck into my build routine, but removing it made a significant difference. I’ve also updated the build instructions you referenced to omit that switch.
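The build-info posted earlier in this thread shows the same thing: "GCC Profile enabled: yes" and `-pg` in CFLAGS. With `-pg`, GCC instruments every function entry with a call to mcount, which would explain the mcount symbols at the top of perf. A quick sketch to check a build for this, where the function name is illustrative and the sample lines are copied from the output above:

```python
def profiling_flags(build_info):
    """Return findings that indicate a gprof-instrumented (-pg) build."""
    findings = []
    for line in build_info.splitlines():
        stripped = line.strip()
        if stripped.startswith("GCC Profile enabled:") and stripped.endswith("yes"):
            findings.append("GCC Profile enabled: yes")
        elif stripped.startswith("CFLAGS") and "-pg" in stripped.split():
            findings.append("CFLAGS contains -pg")
    return findings

# Two lines taken from the `suricata --build-info` output posted in this thread.
sample = """\
  GCC Profile enabled:                     yes
  CFLAGS                                   -g -O2 -std=gnu99 -pg -march=native -I${srcdir}/../rust/gen -I${srcdir}/../rust/dist
"""
print(profiling_flags(sample))
```

Feeding it the full text of `suricata --build-info` should return an empty list for a build configured without --enable-gccprofile.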