How to debug apparent memory leak in Suricata 6.0.0?

Hi there,

We have deployed Suricata 6.0.0 on one of our Internet-facing servers (x86_64, 8 threads) that is seeing ~1k packets per second of traffic (so not very high throughput). Suricata runs from the jasonish/suricata Docker image, is attached to the primary network interface, and is configured with a Docker mem_limit of 2G.

For some as yet unknown reason, memory utilization increases by about 2 MB per second until the container eventually gets OOM-killed.

Do you have any advice on how to debug what is actually causing the memory leak?

Thanks!

stats.log:

------------------------------------------------------------------------------------
Counter                                       | TM Name                   | Value
------------------------------------------------------------------------------------
capture.kernel_packets                        | Total                     | 2291239
capture.kernel_drops                          | Total                     | 1126683
decoder.pkts                                  | Total                     | 1164817
decoder.bytes                                 | Total                     | 871118926
decoder.ipv4                                  | Total                     | 1148808
decoder.ipv6                                  | Total                     | 16009
decoder.ethernet                              | Total                     | 1164817
decoder.tcp                                   | Total                     | 179463
decoder.udp                                   | Total                     | 981872
decoder.icmpv4                                | Total                     | 1742
decoder.icmpv6                                | Total                     | 1740
decoder.avg_pkt_size                          | Total                     | 747
decoder.max_pkt_size                          | Total                     | 1514
flow.tcp                                      | Total                     | 7186
flow.udp                                      | Total                     | 4745
flow.icmpv4                                   | Total                     | 13
flow.icmpv6                                   | Total                     | 26
flow.wrk.spare_sync_avg                       | Total                     | 100
flow.wrk.spare_sync                           | Total                     | 84
flow.wrk.flows_evicted_needs_work             | Total                     | 3575
flow.wrk.flows_evicted_pkt_inject             | Total                     | 4085
flow.wrk.flows_evicted                        | Total                     | 179
flow.wrk.flows_injected                       | Total                     | 3532
tcp.sessions                                  | Total                     | 6893
tcp.pseudo                                    | Total                     | 8
tcp.syn                                       | Total                     | 7154
tcp.synack                                    | Total                     | 4808
tcp.rst                                       | Total                     | 6634
tcp.reassembly_gap                            | Total                     | 280
tcp.overlap                                   | Total                     | 1
detect.alert                                  | Total                     | 162
app_layer.flow.http                           | Total                     | 208
app_layer.tx.http                             | Total                     | 210
app_layer.flow.smtp                           | Total                     | 2
app_layer.tx.smtp                             | Total                     | 2
app_layer.flow.tls                            | Total                     | 4405
app_layer.flow.ssh                            | Total                     | 4
app_layer.flow.ntp                            | Total                     | 843
app_layer.tx.ntp                              | Total                     | 854
app_layer.flow.sip                            | Total                     | 1
app_layer.tx.sip                              | Total                     | 1
app_layer.flow.failed_tcp                     | Total                     | 9
app_layer.flow.dcerpc_udp                     | Total                     | 6
app_layer.flow.dns_udp                        | Total                     | 3809
app_layer.tx.dns_udp                          | Total                     | 9470
app_layer.flow.failed_udp                     | Total                     | 86
flow.mgr.full_hash_pass                       | Total                     | 5
flow.spare                                    | Total                     | 10033
flow.mgr.rows_maxlen                          | Total                     | 3
flow.mgr.flows_checked                        | Total                     | 14906
flow.mgr.flows_notimeout                      | Total                     | 6300
flow.mgr.flows_timeout                        | Total                     | 8606
flow.mgr.flows_evicted                        | Total                     | 8664
flow.mgr.flows_evicted_needs_work             | Total                     | 3531
tcp.memuse                                    | Total                     | 1146880
tcp.reassembly_memuse                         | Total                     | 11542996
http.memuse                                   | Total                     | 3414
flow.memuse                                   | Total                     | 8450304
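
As a rough cross-check on the counters above, the memuse values that Suricata tracks itself can be summed and compared with the 2G container limit. The sketch below is only an illustration: the SAMPLE text is copied from the dump above, and the helper names (parse_stats, memuse_total) are made up for this example and are not part of Suricata.

```python
import re

# Rows look like "tcp.memuse   | Total   | 1146880".
ROW = re.compile(r"^(\S+)\s*\|\s*(\S+)\s*\|\s*(\d+)$")

# A few rows copied from the stats.log dump in this thread.
SAMPLE = """\
capture.kernel_packets                        | Total                     | 2291239
capture.kernel_drops                          | Total                     | 1126683
tcp.memuse                                    | Total                     | 1146880
tcp.reassembly_memuse                         | Total                     | 11542996
http.memuse                                   | Total                     | 3414
flow.memuse                                   | Total                     | 8450304
"""

def parse_stats(text):
    """Return {counter_name: value} for the 'Total' rows of a stats.log dump."""
    counters = {}
    for line in text.splitlines():
        match = ROW.match(line.strip())
        if match and match.group(2) == "Total":
            counters[match.group(1)] = int(match.group(3))
    return counters

def memuse_total(counters):
    """Sum every counter whose name ends in 'memuse' (values are in bytes)."""
    return sum(v for name, v in counters.items() if name.endswith("memuse"))

if __name__ == "__main__":
    counters = parse_stats(SAMPLE)
    total = memuse_total(counters)
    drops = counters["capture.kernel_drops"] / counters["capture.kernel_packets"]
    print(f"tracked memuse: {total} bytes ({total / 2**20:.1f} MiB)")
    print(f"kernel drop ratio: {drops:.1%}")
```

For this dump the tracked memuse counters add up to roughly 20 MiB, far below the 2G limit, and about half of the packets seen by the kernel were dropped. Both numbers are read straight from the counters; what they imply about the source of the growth is a guess, not a diagnosis.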

Suricata’s memory use will grow the longer it runs, but it should then level off. Where it levels off depends on how many rules you are running as well as on the traffic. Your traffic doesn’t look that high, so maybe it’s the rules. I don’t know any magic numbers, but for the same amount of data on the network, lots of SMB traffic will be harder on memory than a network with only DNS traffic.

Anyways, I’m running the 6.0 image with memory confined to 2G right now and I’ll see how it goes. My test location is not that exciting though. Mainly just wanted to say that I don’t think Docker will have anything to do with the memory.
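
One way to tell a plateau from a steady leak is to sample the resident set size of the Suricata process for a while and look at the slope. A minimal sketch, assuming Linux /proc is readable from wherever the script runs; the function names are hypothetical and the PID has to be supplied by hand.

```python
import re
import sys
import time

def parse_vmrss_kib(status_text):
    """Extract VmRSS (reported in kB) from /proc/<pid>/status text, or None."""
    match = re.search(r"^VmRSS:\s+(\d+)\s+kB", status_text, re.MULTILINE)
    return int(match.group(1)) if match else None

def read_rss_kib(pid):
    """Read the current resident set size of a process from /proc."""
    with open(f"/proc/{pid}/status") as fh:
        return parse_vmrss_kib(fh.read())

def growth_kib_per_s(samples):
    """Average slope between the first and last (timestamp, rss_kib) sample."""
    (t0, rss0), (t1, rss1) = samples[0], samples[-1]
    return (rss1 - rss0) / (t1 - t0)

def collect(pid, count=6, interval=10.0):
    """Take `count` RSS samples, `interval` seconds apart."""
    samples = []
    for i in range(count):
        samples.append((time.monotonic(), read_rss_kib(pid)))
        if i < count - 1:
            time.sleep(interval)
    return samples

if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python rss_slope.py <pid of the suricata process>
    samples = collect(int(sys.argv[1]))
    print(f"average growth: {growth_kib_per_s(samples):.0f} KiB/s")
```

At the reported rate of about 2 MB per second, a 2G limit is reached in roughly 17 minutes (2048 MB / 2 MB/s ≈ 1024 s), so a one-minute sample should already show whether the curve is flattening.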

Have you done any config changes to the suricata.yaml?
What is the output of:

suricata --dump-config |grep mem

?

I think jemalloc is a very good way to debug memory runaway scenarios. jemalloc also works fine with Suricata running in a Docker container. Please note that it’s important to have jemalloc compiled with --enable-prof. Please refer to this nice guide written by Victor.
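
For orientation only, here is a sketch of the environment a profiling run typically needs when jemalloc is preloaded rather than linked in. It is not taken from the guide mentioned above. The library path and dump prefix are assumptions that will differ per system, and jemalloc_env is a hypothetical helper; the MALLOC_CONF keys (prof, prof_prefix, lg_prof_interval, prof_final) are jemalloc options that only take effect in a build made with --enable-prof.

```python
import os

def jemalloc_env(lib_path, prof_prefix, lg_interval=30, base_env=None):
    """Build an environment that preloads jemalloc with heap profiling on.

    lib_path    -- path to libjemalloc (assumption: depends on the distro/image)
    prof_prefix -- filename prefix for the heap dumps jemalloc writes
    lg_interval -- log2 of the bytes allocated between dumps (30 is about 1 GiB)
    """
    env = dict(os.environ if base_env is None else base_env)
    env["LD_PRELOAD"] = lib_path
    env["MALLOC_CONF"] = ",".join([
        "prof:true",                        # enable heap profiling
        f"prof_prefix:{prof_prefix}",       # where dumps are written
        f"lg_prof_interval:{lg_interval}",  # periodic dumps by allocation volume
        "prof_final:true",                  # one more dump at exit
    ])
    return env

if __name__ == "__main__":
    # Both paths below are placeholders, not values from this thread.
    env = jemalloc_env("/usr/lib/x86_64-linux-gnu/libjemalloc.so.2",
                       "/tmp/jeprof.suricata")
    print("LD_PRELOAD=" + env["LD_PRELOAD"])
    print("MALLOC_CONF=" + env["MALLOC_CONF"])
```

The resulting dictionary can be passed as env= to subprocess.run when starting Suricata, or the two printed variables can be set on the container instead. The heap dumps are then inspected with jemalloc's jeprof tool, and comparing an early dump with a late one should show which call paths keep growing.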