Looking for some feedback on what I believe is an issue with Suricata unexpectedly dropping TLS traffic. The company I work for uses Suricata in inline mode (alert rules only) with nfqueue/iptables (4 queues, balanced, with bypass) and fail2ban watching fast.log on our application server/appliance. Our application is essentially a webserver GUI that brokers external client/server connectivity using SSH tunneling.
Under normal application usage, we haven’t historically seen traffic interruptions or drops. Recently, however, we’ve identified an issue with our API/SDK connections that seems to indicate traffic being dropped or blocked at the IDS. A persistent API/SDK connection that repeatedly hits our API endpoint works fine for a limited period and then stops getting any responses. The eve.json logs show the initial inbound SSL/TLS connection attempts, which turn into drops shortly afterwards under the same flow_id. This sequence of TLS requests followed by drops continues until Suricata is stopped, or until the API/SDK connection is hung up and we wait a short time before restarting.
Unfortunately, the eve.json logs show no alerts, and the fast and alert-debug logs are empty. I verified that the nfqueue isn’t getting overloaded, the fail2ban stats show no blocks, and other non-SSL traffic from the same source IP can reach the server even while SSL/TLS is blocked.
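For anyone wanting to check for the same pattern, here is a rough sketch of how the eve.json records could be grouped by flow_id to find flows that contain both tls and drop events, and to confirm that no alert events share the flow. It assumes the usual line-delimited JSON layout with `flow_id` and `event_type` fields; the function names `summarize_flows` and `dropped_tls_flows` are made up for illustration and are not part of Suricata.

```python
import json
from collections import defaultdict


def summarize_flows(lines):
    """Count EVE event types per flow_id from an iterable of JSON lines."""
    flows = defaultdict(lambda: defaultdict(int))
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            event = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip truncated or partially written lines
        if not isinstance(event, dict):
            continue
        flow_id = event.get("flow_id")
        if flow_id is None:
            continue  # records such as stats carry no flow_id
        flows[flow_id][event.get("event_type", "unknown")] += 1
    return {fid: dict(counts) for fid, counts in flows.items()}


def dropped_tls_flows(summary):
    """Keep only flows that logged at least one tls and one drop record."""
    return {
        fid: counts
        for fid, counts in summary.items()
        if counts.get("tls", 0) > 0 and counts.get("drop", 0) > 0
    }


def report(path):
    """Print the per-flow counts for flows with both tls and drop records."""
    with open(path, encoding="utf-8") as fh:
        summary = summarize_flows(fh)
    for fid, counts in dropped_tls_flows(summary).items():
        # An 'alert' count of 0 means no rule alert was logged for this flow.
        print(fid, counts, "alerts:", counts.get("alert", 0))
```

Calling `report()` with the path to eve.json (often /var/log/suricata/eve.json, though that depends on the install) would list each affected flow with its event counts. A flow showing tls and drop records but zero alerts would match the behavior described above.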
My planned next steps are to set the drop-invalid option (under the stream section of suricata.yaml) to false, and possibly to disable all TLS parsing just to see if that has any effect.
All in all, we’re at a loss as to what is causing the connection drop/hang, and any advice on where to look would be greatly appreciated.
Test system stats:
CentOS Linux release 7.9.2009 (Core)
suricata-update version 1.1.3 (rev: ac3ddb2)
– Gene Crumpler