Help: how to debug drops with stream drop-invalid: yes

Hi there,

I am relatively new to Suricata (I have been using it for a few weeks now, and I am pretty happy with it). I use Suricata as an IPS in KVM. My host uses Open vSwitch and I have two bridges: ovs-host, which contains the host device as well as one device of the Suricata VM, and ovs-guests, which contains the NICs of all guests as well as a NIC of the Suricata VM. This VM then connects ovs-guests and ovs-host using af-packet (with eBPF lb). At first I had to use e1000 because I had trouble with VirtIO, but I found a solution for that. Meanwhile I wrote a few tools which send all my fail2ban blocks to multiple Suricata instances, which use iprep and drop accordingly. From what I can tell, this setup works like a charm.

However, one of my VMs behind this Suricata has WordPress and DokuWiki running. Both have trouble connecting back home: WordPress reports that it is unable to connect to wordpress.org, and DokuWiki cannot reach its extensions/plugins. But everything I checked (wget, curl, DNS…) works fine. Manually retrieving and accessing wordpress.org also works. The IPs are not dropped; in fact, they are not even in fast.log. In eve.json I can see the DNS requests to resolve the name, and I can see that an answer is coming from wordpress.org, but then somehow nothing happens.

The fix for all this was setting drop-invalid to no:

stream:
  memcap: 64mb
  memcap-policy: ignore
  drop-invalid: no
  checksum-validation: yes      # reject incorrect csums
  #midstream: false
  midstream-policy: ignore
  inline: auto                  # auto will use inline mode in IPS mode, yes or no set it statically
  reassembly:

As soon as I set this, everything works again. I tried to do some debugging, but all I found was this:

ips.drop_reason.flow_drop | Total | 837
ips.drop_reason.rules | Total | 3398
ips.drop_reason.stream_error | Total | 19347

I am not sure what counts as a high value for ips.drop_reason.stream_error, but compared to the other numbers it seemed high. Later I enabled more details in stats.log and got this:

ips.accepted                                  | Total                     | 180267
ips.blocked                                   | Total                     | 19401
ips.drop_reason.flow_drop                     | Total                     | 1795
ips.drop_reason.rules                         | Total                     | 1718
ips.drop_reason.stream_error                  | Total                     | 15888
[..]
capture.afpacket.poll_timeout                 | Total                     | 103893
[..]
stream.fin_but_no_session                     | Total                     | 12508
stream.rst_but_no_session                     | Total                     | 2577
stream.pkt_invalid_ack                        | Total                     | 47
stream.pkt_broken_ack                         | Total                     | 21
stream.rst_invalid_ack                        | Total                     | 47
stream.pkt_spurious_retransmission            | Total                     | 14735

I was thinking it has to do with pkt_spurious_retransmission, but my debugging journey ended here.

So I would be happy,

  • if someone here could point me to some good guide
  • or would like to help me understand (especially for learning purposes!) how I could do further debugging - what I could check to find the cause for this.
  • or if someone already knows what’s happening, some explanation.

Suricata Version

7.0.2-1~bpo12+1

How did I install Suricata

apt-get install suricata -t bookworm-backports

OS/etc

Debian 12 aka Bookworm. Tested with three different kernels. Currently using 6.5.0-0.deb12.4-cloud-amd64. However, I see the same problems with the default (6.1.x) and the non-cloud (-cloud) kernel.

Any other information I can provide?

Thanks in advance,
Jean

Hi there,

I still have not found a solution or a better explanation for what I am observing. Is anyone willing to explain to me how to interpret the values I’m seeing? I am especially interested in:

  • what stream.pkt_spurious_retransmission does
  • what drop-invalid drops (i.e. when is a packet invalid according to this setting?)

Thanks in advance

I think it’s most likely that the ips.drop_reason.stream_error number is related to the stream.fin_but_no_session and stream.rst_but_no_session numbers. These mean that Suricata did not have a TcpSession on a flow when RST or FIN packets arrived. This can happen if these packets are unexpected, but also if the flow manager has already timed out an existing flow.

It’s pretty normal to see stream.pkt_spurious_retransmission; it’s not treated as an error. It just means we saw a data segment that was sent before the already ACK’d window, which means it will not be used in reassembly anymore.
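
As a rough illustration of "a data segment before the already ACK’d window", here is a simplified Python model. It is not Suricata's actual code: the function names are hypothetical, and the real stream engine tracks more state than a single `last_ack` value. The sketch only shows the sequence-number comparison, including 32-bit wraparound.

```python
MOD = 2 ** 32  # TCP sequence numbers are 32-bit and wrap around

def seq_leq(a, b):
    """True if sequence number a is at or before b, modulo 2**32."""
    return ((b - a) % MOD) < 2 ** 31

def is_before_acked_window(seq, payload_len, last_ack):
    """Simplified check (an assumption, not Suricata's logic): the segment
    carries data and its last byte lies at or before what the receiver
    has already ACK'd, so it adds nothing new to reassembly."""
    segment_end = (seq + payload_len) % MOD
    return payload_len > 0 and seq_leq(segment_end, last_ack)

# Receiver has ACK'd everything up to 1100.
print(is_before_acked_window(seq=1000, payload_len=100, last_ack=1100))
# Segment extends past the ACK'd point, so it still carries new data.
print(is_before_acked_window(seq=1000, payload_len=100, last_ack=1050))
```

The first call describes a segment that ends exactly where the ACK'd data ends, so in this model it would count as already covered; the second one reaches beyond the ACK and would not.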

As for debugging this: perhaps you can capture a pcap of the relevant traffic, and inspect the pcap using the eve.stream output. This can show you the stream engine’s internal state per packet. Then you can compare that to Wireshark’s view of the pcap.
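
Once the pcap has been run through Suricata, a first pass over the resulting eve.json could look like the Python sketch below. It only relies on the common `event_type`, `src_ip` and `dest_ip` fields; which event types actually show up (for example stream or drop records) depends on what is enabled in the eve-log configuration. The function name `tally_events` and the sample records with documentation-range addresses are made up for illustration.

```python
import json
from collections import Counter

def tally_events(lines, ip):
    """Count eve.json records per event_type for flows involving `ip`."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip truncated or partially written lines
        if ip in (record.get("src_ip"), record.get("dest_ip")):
            counts[record.get("event_type", "unknown")] += 1
    return counts

# Hypothetical sample records; real eve.json lines carry many more fields.
SAMPLE = [
    '{"event_type": "dns", "src_ip": "192.0.2.10", "dest_ip": "192.0.2.53"}',
    '{"event_type": "flow", "src_ip": "192.0.2.10", "dest_ip": "198.51.100.7"}',
    '{"event_type": "drop", "src_ip": "198.51.100.7", "dest_ip": "192.0.2.10"}',
    '{"event_type": "drop", "src_ip": "203.0.113.9", "dest_ip": "192.0.2.99"}',
    'not json',
]

counts = tally_events(SAMPLE, "192.0.2.10")
for event_type, count in sorted(counts.items()):
    print(event_type, count)
```

For a real log, pass an open file instead of the sample list, e.g. `with open(path) as f: tally_events(f, guest_ip)`, where `path` points at your eve.json and `guest_ip` is the address of the affected VM. Records for the failing connections can then be lined up against the same packets in Wireshark.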