Functionality Sanity Check

Hello,

I’m having a little trouble determining if I’m hitting a resource constraint, a configuration issue (most likely) or unrealistic goals with my hardware.

The server runs Suricata 5.0.3 with Hyperscan on CentOS 8. It has 2x Intel E5-2690 rev0 CPUs, 256 GB RAM and one Intel 82599ES 10GbE card fed by 3 upstream fiber taps, carrying what should realistically be less than 5 Gbps of traffic in total.

In short, the issue I’m experiencing is this:
Regardless of whether I use the recommendations here (https://suricata.readthedocs.io/en/suricata-5.0.3/performance/packet-capture.html) or the high-performance recommendations (forcing a symmetric Toeplitz hash, IRQ pinning and the recommended NIC config values in Suricata Documentation section 9.5), my fast.log is flooded with invalid ACK alerts, packet out of window alerts and the like. The end result is that a large percentage of the traffic is designated as invalid and not inspected, which is a problem, as you might imagine.

I have spent many days attempting to resolve this, diving into the high-performance recommendations and tweaking them to the best of my ability. Just for grins, I backed up the config and disabled NIC offloading, except for rx and tx checksums, to set everything back to the recommendations. I got the same result, only with a slightly higher dropped packet count.

In short, where should I go from here? What obvious thing am I missing?

Thank you in advance!

Traffic is tricky, but those stream events are noisy in nearly all high-traffic settings I’ve seen. What I would do is dump that traffic to a pcap and run it through Suricata again so you can debug it more closely. Maybe the traffic really is invalid and broken.
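To make the dump-and-rerun suggestion concrete, here is a minimal sketch that only builds the two command lines involved. It assumes tcpdump is used for the capture and that Suricata reads the file in offline mode with `-r`. The interface name, paths and packet count are placeholders, not values from this thread.

```python
import shlex


def capture_cmd(iface, pcap_path, packet_count):
    """Build a tcpdump command that writes a bounded sample to a pcap file."""
    # -s 0 captures full packets, -c stops after packet_count packets.
    return ["tcpdump", "-i", iface, "-s", "0",
            "-c", str(packet_count), "-w", pcap_path]


def replay_cmd(pcap_path, config_path, log_dir):
    """Build a Suricata command that reads the pcap offline into its own log dir."""
    return ["suricata", "-c", config_path, "-r", pcap_path, "-l", log_dir]


if __name__ == "__main__":
    # All names below are hypothetical placeholders; adjust to your setup.
    print(shlex.join(capture_cmd("eth0", "/tmp/sample.pcap", 1000000)))
    print(shlex.join(replay_cmd("/tmp/sample.pcap",
                                "/etc/suricata/suricata.yaml",
                                "/tmp/replay-logs")))
```

If the same stream alerts show up when the pcap is replayed offline, the packets were most likely already broken or incomplete when they reached the sensor, which points upstream (taps, aggregator) rather than at Suricata's live capture settings.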

Maybe the amount is not so high compared to the total traffic, even though it’s noisy. Did you compare against stats.log as well?
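One rough way to make that comparison is to pull a few counters out of stats.log and turn them into percentages. The sketch below assumes the usual `counter | thread | value` layout and that counters named `capture.kernel_packets`, `capture.kernel_drops`, `decoder.pkts` and `decoder.invalid` are present. Check the names against your own file, since they can differ by version and capture method. The sample numbers are invented for illustration.

```python
import re

# Matches lines such as: "capture.kernel_drops   | Total   | 25000"
COUNTER_LINE = re.compile(r"^(\S+)\s*\|\s*(\S+)\s*\|\s*(\d+)\s*$")


def last_totals(stats_text):
    """Return the most recent 'Total' value seen for each counter."""
    totals = {}
    for line in stats_text.splitlines():
        match = COUNTER_LINE.match(line)
        if match and match.group(2) == "Total":
            # Later dumps overwrite earlier ones, so the last value wins.
            totals[match.group(1)] = int(match.group(3))
    return totals


def percent(part, whole):
    """Percentage of part in whole, or 0.0 when whole is zero."""
    return 100.0 * part / whole if whole else 0.0


# Invented sample data in the assumed stats.log layout.
SAMPLE = """\
Counter                                       | TM Name                   | Value
------------------------------------------------------------------------------------
capture.kernel_packets                        | Total                     | 1000000
capture.kernel_drops                          | Total                     | 25000
decoder.pkts                                  | Total                     | 975000
decoder.invalid                               | Total                     | 1950
"""

if __name__ == "__main__":
    totals = last_totals(SAMPLE)
    drops = percent(totals.get("capture.kernel_drops", 0),
                    totals.get("capture.kernel_packets", 0))
    invalid = percent(totals.get("decoder.invalid", 0),
                      totals.get("decoder.pkts", 0))
    print(f"kernel drops: {drops:.2f}% of captured packets")
    print(f"decoder invalid: {invalid:.2f}% of decoded packets")
```

To use it on real data, read your own stats.log into a string and pass that to `last_totals` instead of `SAMPLE`. A drop percentage in the low single digits alongside a noisy fast.log would suggest the alerts look worse than the actual share of affected traffic.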

Indeed, it is tricky. After significant additional troubleshooting, it appears that I need to work with our aggregator and tune the buffers a bit more before proceeding.
My main concern was that my hardware might not be up to the task, but this doesn’t appear to be the case.

Thank you!