Measuring kernel capture drop rate

Hello,

Looking for some input/clarification.

I recently inherited a large Suricata installation and just updated it to 6.0.1. I am curious about the best way to measure the kernel capture drop rate.

Suricata writes a stats.log, and we also output stats.json, which is fed into Splunk. The sample rate is every 30 seconds. The previous admin had set up a Splunk dashboard that calculates the drop rate between the 30-second samples to monitor the platform.

previous_sample (t - 30 sec):
capture.kernel_packets
capture.kernel_drops

current_sample (now):
capture.kernel_packets
capture.kernel_drops

delta_packets = current_capture.kernel_packets - previous_capture.kernel_packets
delta_drops = current_capture.kernel_drops - previous_capture.kernel_drops

drop_rate = delta_drops / delta_packets
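
For concreteness, here is a minimal Python sketch of that delta calculation as I understand it. The sample values are made up, and the guard against counter resets is my own addition rather than something the existing dashboard necessarily does.

```python
# Minimal sketch of the dashboard's delta calculation between two
# consecutive samples, using the stats.log counter names as flat keys.
# The sample values are made up; the counter-reset guard is my own addition.

def delta_drop_rate(previous, current):
    """Per-interval drop rate between two cumulative stats samples."""
    delta_packets = current["capture.kernel_packets"] - previous["capture.kernel_packets"]
    delta_drops = current["capture.kernel_drops"] - previous["capture.kernel_drops"]

    # A Suricata restart resets the counters to zero, which would make
    # the deltas negative and the ratio meaningless.
    if delta_packets <= 0 or delta_drops < 0:
        return None
    return delta_drops / delta_packets

previous_sample = {"capture.kernel_packets": 1_000_000, "capture.kernel_drops": 10_000}
current_sample = {"capture.kernel_packets": 1_500_000, "capture.kernel_drops": 60_000}

print(delta_drop_rate(previous_sample, current_sample))  # 0.1, i.e. 10% dropped in this interval
```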

Is calculating the delta drop rate this way a valid measure?

If I monitor the stats.log file and calculate the drop % strictly from the values in each 30-second measure (capture.kernel_drops / capture.kernel_packets), I consistently see 1-2%.

However, if I calculate the delta drop % between stats.log measures, I notice spikes in the drop rate that can be quite significant (10-40%). As I sift through stats.log, I only see the spikes when I calculate the delta between measures; if I calculate strictly from the capture.kernel_packets and capture.kernel_drops values in each individual measure, the drop % stays at 1-2%.
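
To make the comparison concrete, here is a rough sketch that walks consecutive samples and prints the cumulative ratio next to the per-interval (delta) ratio. It assumes an eve-style stats.json where the counters sit under stats.capture; the field access would need adjusting if our output is laid out differently.

```python
# Rough sketch comparing the cumulative drop ratio with the per-interval
# (delta) ratio for each stats sample. Assumes eve-style JSON records with
# the counters under stats.capture; adjust the field access if your
# stats.json is laid out differently.
import json

def walk_drop_rates(path):
    prev = None
    with open(path) as fh:
        for line in fh:
            rec = json.loads(line)
            stats = rec.get("stats")
            if not stats:
                continue
            cap = stats["capture"]
            packets, drops = cap["kernel_packets"], cap["kernel_drops"]

            # The counters are cumulative totals since Suricata started,
            # so this ratio is a running average over the whole uptime.
            cumulative = drops / packets if packets else 0.0

            # The delta ratio only looks at what happened since the
            # previous sample, i.e. the last 30-second window.
            interval = None
            if prev is not None:
                dp, dd = packets - prev[0], drops - prev[1]
                if dp > 0 and dd >= 0:  # skip counter resets / empty intervals
                    interval = dd / dp

            yield rec.get("timestamp"), cumulative, interval
            prev = (packets, drops)

for ts, cumulative, interval in walk_drop_rates("stats.json"):
    fmt = f"{interval:.2%}" if interval is not None else "n/a"
    print(ts, f"cumulative={cumulative:.2%}", f"interval={fmt}")
```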

IMHO that is the correct approach. Do you see the spikes there as well? I would correlate them with other values, like load and network throughput, that might explain the spikes.
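
For example, a toy sketch of that correlation idea, assuming the cumulative (kernel_packets, kernel_drops) pairs have already been pulled out of Splunk or stats.json into a list sampled every 30 seconds; statistics.correlation needs Python 3.10+, and the numbers below are made up.

```python
# Toy sketch: does the per-interval drop rate track the per-interval
# packet rate (throughput)? The cumulative counter samples below are
# invented purely for illustration.
from statistics import correlation

SAMPLE_INTERVAL_SECONDS = 30  # assumed sample rate

# Made-up cumulative (kernel_packets, kernel_drops) samples, oldest first.
samples = [
    (1_000_000, 10_000),
    (1_600_000, 12_000),
    (2_900_000, 250_000),   # busy interval with a drop spike
    (3_400_000, 255_000),
]

pps, drop_rate = [], []
for (p0, d0), (p1, d1) in zip(samples, samples[1:]):
    dp, dd = p1 - p0, d1 - d0
    if dp <= 0 or dd < 0:   # skip counter resets / empty intervals
        continue
    pps.append(dp / SAMPLE_INTERVAL_SECONDS)
    drop_rate.append(dd / dp)

# A value near +1 would suggest the drop spikes line up with throughput peaks.
print(correlation(pps, drop_rate))
```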