After first starting Suricata up, everything runs fine for a few hours but eventually I get (seemingly) unrecoverable packet loss.
When I monitor the process with perf top, `rs_dns_state_get_tx` and `AppLayerDefaultGetTxIterator` slowly creep up in overhead% and eventually overtake `DetectRun.part.16`. Once this happens, I start getting packet loss.
This was after 15 hours:
Samples: 2M of event 'cycles', 4000 Hz, Event count (approx.): 1171695080430 lost: 0/0 drop: 0/0
Overhead  Shared Object  Symbol
  46.41%  suricata       [.] rs_dns_state_get_tx
  28.55%  suricata       [.] AppLayerDefaultGetTxIterator
   9.87%  suricata       [.] FlowGetProtoMapping
   4.15%  suricata       [.] DetectRun.part.16
   1.01%  suricata       [.] DetectEnginePktInspectionRun
   0.91%  suricata       [.] DetectEngineInspectRulePacketMatches
   0.73%  suricata       [.] rs_sip_state_get_tx
I have attached the stats output from the same time as a text file: statsout.log (6.6 KB)
OS: CentOS Stream 8
CPU: 2x Xeon E5-2699 v4 (88 hardware threads with HT)
RAM: 128GB
NIC: Napatech NE40E3-4
Data rate: 10 Gbps sustained (replaying bigFlows.pcap from tcpreplay.appneta.com through a packet broker to the Napatech)
It’s slow to diagnose as it takes hours for the function to creep up to the top of the list in perf.
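To make the creep easier to track without watching perf top for hours, the overhead rows could be saved as text at intervals and compared afterwards. The following is only a minimal sketch, assuming snapshots are captured as plain text in the same row format as the listing above (for example from `perf report --stdio`). The function names `parse_perf_rows` and `watched_share` and the `SNAPSHOT_15H` sample are hypothetical helpers for illustration, not part of Suricata or perf.

```python
import re

# Matches perf rows such as:
#   46.41%  suricata  [.] rs_dns_state_get_tx
# Any columns between the percentage and the "[.]" marker are skipped.
ROW = re.compile(r"^\s*(\d+(?:\.\d+)?)%\s+.*\[[.\w]\]\s+(\S+)\s*$")

# The two symbols that creep up over time in the report above.
WATCHED = ("rs_dns_state_get_tx", "AppLayerDefaultGetTxIterator")


def parse_perf_rows(text):
    """Return {symbol: overhead_percent} for each line that looks like a perf row."""
    overhead = {}
    for line in text.splitlines():
        match = ROW.match(line)
        if match:
            pct, symbol = match.groups()
            overhead[symbol] = overhead.get(symbol, 0.0) + float(pct)
    return overhead


def watched_share(text, symbols=WATCHED):
    """Sum the overhead percentage of the watched symbols in one snapshot."""
    rows = parse_perf_rows(text)
    return sum(rows.get(symbol, 0.0) for symbol in symbols)


# Sample snapshot copied from the 15-hour listing in this post.
SNAPSHOT_15H = """\
Overhead Shared Object Symbol
46.41% suricata [.] rs_dns_state_get_tx
28.55% suricata [.] AppLayerDefaultGetTxIterator
9.87% suricata [.] FlowGetProtoMapping
4.15% suricata [.] DetectRun.part.16
1.01% suricata [.] DetectEnginePktInspectionRun
0.91% suricata [.] DetectEngineInspectRulePacketMatches
0.73% suricata [.] rs_sip_state_get_tx
"""

if __name__ == "__main__":
    share = watched_share(SNAPSHOT_15H)
    print(f"watched symbols: {share:.2f}% of samples")
```

Running `watched_share` on each saved snapshot and logging the result with a timestamp would give a simple time series of how quickly the two functions take over, which could then be lined up against the drop counters in the stats output.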