The manuals say that modern Suricata supports decoding GRE/MPLS/etc. by default. With normal traffic my configuration uses all CPU cores, but while analyzing MPLS traffic only one core is 100% busy and all the others sit at 0-1%.
The mpls rows in stats.log show 0 everywhere, and there are no drops either. I tried playing with the set-cpu-affinity setting, but nothing changed.
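For reference, this is the kind of threading block I was tuning (a simplified sketch with example core numbers, not my exact file):

threading:
  set-cpu-affinity: yes
  cpu-affinity:
    - management-cpu-set:
        cpu: [ 0 ]          # housekeeping threads pinned to core 0
    - worker-cpu-set:
        cpu: [ "1-7" ]      # packet processing threads on the remaining cores
        mode: "exclusive"
        prio:
          default: "high"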
ifconfig shows that the incoming traffic is huge, so there is definitely something to analyze. Suricata was installed from the recommended repo.
Some alerts are being generated, so they are not missing entirely. However, in most cases the reported payload looks like this:
So it seems that MPLS decoding is not working as expected and is not efficient, since only one core is loaded, but at 100%. What do I need to configure to start decoding and analyzing MPLS packets properly and efficiently?
What version are you running exactly?
Please share your suricata.yaml as well, along with the NIC used and how you run Suricata.
One core being busy while the others are not could be related to the NIC and the kernel not distributing the flows properly.
Version: This is Suricata version 6.0.8 RELEASE
Systemd starts Suricata: /usr/bin/suricata -D -c /etc/suricata/suricata.yaml --af-packet=eth1
The config YAML file is attached, maybe you can find something inefficient in it:
suricata.yaml (70.6 KB)
What NIC is used, and did you optimize anything there as well?
What kernel is used?
Some things you could try: can you generate a pcap with that MPLS traffic? I have seen issues in the past with non-RFC-compliant MPLS, or with multiple label layers.
Currently I guess that the issue is with the flow distribution to the different cores.
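For reference, in AF_PACKET mode that distribution is controlled by the kernel fanout hash and the cluster settings in the af-packet section, roughly like this (a sketch of the defaults, your attached file may differ):

af-packet:
  - interface: eth1
    cluster-id: 99
    cluster-type: cluster_flow   # kernel hashes per flow, so one flow stays on one thread
    defrag: yes
    threads: auto

If the kernel hashes all the MPLS packets to the same value, they all end up on the same thread.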
Is there any command I should run to get the NIC info you need?
The Linux kernel is 5.4.0-128-generic; it runs inside a XEN VM, with no specific optimizations on my side.
Unfortunately I can’t share a traffic sample.
runmode: autofp successfully makes all 8 cores active. Should I leave it, or is it just for testing in my case?
You can run ethtool -i eth1, but the XEN VM is a good point. Maybe something the NIC does in virtualized mode is the issue, or something in the interaction with the kernel in AF_PACKET mode inside the XEN VM.
You can keep autofp if the performance is enough for you; in general autofp is much slower compared to workers, so if you run into drops you might have to rethink.
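For reference, that is the top-level runmode setting in suricata.yaml (a sketch, not your exact file):

runmode: workers   # capture and detection in one thread per core, usually fastest
# runmode: autofp  # separate capture and detection threads; flows are re-balanced
#                  # inside Suricata, which costs throughput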
Also make sure all other stats are okay, and validate that the events look the way you expect them to.
At least it proves that it’s related to the capture method.
runmode: autofp will not work. Even with 24 cores the drop rate is now huge:
capture.kernel_packets | Total | 38754672
capture.kernel_drops | Total | 34060040
NIC info in VM:
Related NIC info outside of VM:
firmware-version: 184.108.40.206/pkg 22.21.06.80
I have no experience with those specific virtual NIC drivers. Maybe they are missing some relevant functionality.
Could you also show the output of ethtool -l and ethtool -x, and when you switch back to workers also print the output of ethtool -S eth1, so we can see how the flow distribution is done at the NIC queue level.
Another helpful output would be sudo perf top -p $(pidof suricata).
ethtool output on the host (in the VM I get an Operation not supported error):
# ethtool -l eth1
Channel parameters for eth1:
Current hardware settings:
# ethtool -x eth1
RX flow hash indirection table for eth1 with 8 RX ring(s):
0: 0 1 2 3 4 5 6 7
8: 0 1 2 3 4 5 6 7
16: 0 1 2 3 4 5 6 7
24: 0 1 2 3 4 5 6 7
32: 0 1 2 3 4 5 6 7
40: 0 1 2 3 4 5 6 7
48: 0 1 2 3 4 5 6 7
56: 0 1 2 3 4 5 6 7
64: 0 1 2 3 4 5 6 7
72: 0 1 2 3 4 5 6 7
80: 0 1 2 3 4 5 6 7
88: 0 1 2 3 4 5 6 7
96: 0 1 2 3 4 5 6 7
104: 0 1 2 3 4 5 6 7
112: 0 1 2 3 4 5 6 7
120: 0 1 2 3 4 5 6 7
RSS hash function:
When I switch back to workers and run ethtool -S eth1:
Samples: 39K of event 'cpu-clock:pppH', 4000 Hz, Event count (approx.): 7788756963 lost: 0/0 drop: 0/0
Overhead Shared Object Symbol
50.72% suricata [.] SigMatchSignaturesGetSgh
42.46% suricata [.] DetectAddressMatchIPv4
2.91% libhs.so.5.2.1 [.] avx512_hs_reset_and_expand_stream
1.97% suricata [.] DetectProtoContainsProto
0.27% suricata [.] DetectPortLookupGroup
0.26% libpthread-2.31.so [.] pthread_mutex_unlock
0.26% libpthread-2.31.so [.] pthread_mutex_trylock
0.21% suricata [.] SCHSSearch
0.10% suricata [.] SCGetProtoByName
0.07% suricata [.] StreamTcp
0.05% suricata [.] FlowGetFlowFromHash
0.05% libpthread-2.31.so [.] __pthread_mutex_lock
0.05% suricata [.] SCGetContext
0.05% suricata [.] FlowVarPrint
0.04% suricata [.] DefragTimeoutHash
0.04% libhs.so.5.2.1 [.] avx512_hs_scan
0.03% suricata [.] DetectPortHashFree
0.03% suricata [.] TmThreadsSlotVarRun
0.03% suricata [.] Prefilter
0.02% suricata [.] FlowQueuePrivateGetFromTop
0.02% suricata [.] DetectSetFastPatternAndItsId
0.02% suricata [.] OutputLoggerLog
0.01% suricata [.] FlowSetupPacket
0.01% libc-2.31.so [.] qsort
0.01% suricata [.] PacketPoolReturnPacket
0.01% suricata [.] FlowGetFromFlowKey
0.01% suricata [.] AppLayerParserTransactionsCleanup
0.01% libc-2.31.so [.] pthread_attr_setschedparam
0.01% suricata [.] Detect
0.01% libc-2.31.so [.] __nss_database_lookup
0.01% suricata [.] AppLayerProtoDetectGetProto
0.01% suricata [.] TagHandlePacket
0.01% suricata [.] 0x00000000000d6970
0.01% libc-2.31.so [.] free
0.01% suricata [.] FlowHandlePacketUpdate
0.01% suricata [.] StreamTcpPacket
0.01% suricata [.] PacketFreeOrRelease
0.01% suricata [.] DecodeIPV4
0.01% suricata [.] DecodeSll
0.01% suricata [.] StatsIncr
0.01% libc-2.31.so [.] malloc
0.01% suricata [.] FlowUpdateState
ethtool -S eth1 is not working within the VM? That’s unfortunate, because you could have seen something like this:
So in this example you can see that all 4 queues are more or less equally busy with packets.
And this would have helped with narrowing it down.
The perf output is interesting, since the overhead for DetectAddressMatchIPv4 is quite high. You could also try to play around with different cluster_ modes, but I doubt it will change much.
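Those are set via cluster-type in the af-packet section; for example (a sketch, try one at a time):

af-packet:
  - interface: eth1
    cluster-id: 99
    cluster-type: cluster_flow   # default: kernel fanout hash per flow
    # cluster-type: cluster_qm   # map packets to threads by NIC RSS queue
    # cluster-type: cluster_cpu  # keep packets on the CPU they arrived on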
It would be helpful if you could somehow recreate it in a test environment where you can also capture a pcap and share it. Otherwise we would have to build such a setup ourselves to reproduce it, or maybe someone else from the community has experience with that setup.
Maybe something is wrong in the MPLS traffic, or maybe it uses something special that is valid but not well supported.
Another option would be to play around with other tools that use AF_PACKET and could show the flow distribution.
ethtool -S eth1 is working in the VM, but the only output is:
But in another place (without MPLS) where Suricata is running, the output is the same.
I still think this is an issue in the interplay of the NIC (driver) + kernel + Suricata. I’ve seen multiple MPLS setups working, so if you are able to create a test pcap we might be able to spot the crucial difference.