I’m building a Grafana dashboard on top of metrics collected by Telegraf and stored in InfluxDB. I’d like to share it with the community once I have something worthwhile.
The purpose of this dashboard is not necessarily to show all the metrics, just those that provide a quick view of resource usage and problem situations (at both the Suricata and NIC level).
What I have so far:
- Host total CPU usage %
- Host total RAM usage %
- Suricata process CPU usage %
- Suricata process RAM usage %
- Suricata internal stats: capture_errors, capture_kernel_drops, tcp_reassembly_gap (all threads totals)
- Ethtool stats: rx/tx dropped/errors
What else would you recommend for a generic dashboard? I know NICs report brand- and model-specific counters through ethtool, so ideally the dashboard would not include those.
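On the drop and error counters listed above: they are cumulative, so a panel would normally plot the per-interval increase of the all-threads total rather than the raw value. A minimal sketch of that idea in Python, where the function names, thread labels, and sample values are all hypothetical and only for illustration (this is not the actual Telegraf/InfluxDB query):

```python
from typing import Dict, List


def total_across_threads(per_thread: Dict[str, int]) -> int:
    """Sum one cumulative counter (e.g. capture_kernel_drops) over all threads."""
    return sum(per_thread.values())


def non_negative_deltas(samples: List[int]) -> List[int]:
    """Turn successive readings of a cumulative counter into per-interval increases.

    Assumption: a reading lower than the previous one means the counter was
    reset (e.g. the process restarted), so the new reading itself is taken
    as the increase for that interval.
    """
    deltas = []
    for prev, curr in zip(samples, samples[1:]):
        deltas.append(curr - prev if curr >= prev else curr)
    return deltas


# Hypothetical per-thread readings at one point in time.
drops_now = total_across_threads({"W#01": 10, "W#02": 5})

# Hypothetical all-threads totals over five collection intervals,
# including one reset between the third and fourth reading.
increases = non_negative_deltas([100, 150, 150, 20, 60])

print(drops_now)
print(increases)
```

The same approach would apply to the ethtool rx/tx dropped/errors counters; how resets are handled is a design choice, and dropping the interval instead of counting from zero is just as reasonable.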