So if the system runs fine with just the ET rules, it would be best to focus on your custom rules.
You can use Rule Profiling to check the impact of each rule, as described in the Suricata documentation (section 11.9, "Rule Profiling", Suricata 8.0.0-dev). Keep in mind that rule profiling has an impact on performance once it is compiled in and enabled, so don’t run it in production all the time. (Don’t confuse it with full packet profiling, which is a separate kind of profiling.)
Ideally you can narrow it down to a small number of rules that have a big impact on performance and then try to improve those rules. Without knowing the exact rules it’s hard to tell where to improve. But yes, one bad rule can kill performance.
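Once you have profiling output, ranking the rules by cost is a quick way to find the candidates. Here is a minimal sketch in Python. It assumes the profiling output was written as JSON and parsed into a dict with a "rules" list whose entries carry "signature_id", "checks" and "ticks_total". Those field names and the sample numbers are my assumptions for illustration, so check them against your own profiling log before relying on this:

```python
import json

# Hypothetical sample shaped like a JSON rule profiling dump.
# Field names and numbers are illustrative assumptions, not real output.
SAMPLE = json.loads("""
{
  "rules": [
    {"signature_id": 1000001, "checks": 120000, "ticks_total": 950000000},
    {"signature_id": 2019401, "checks": 4000,   "ticks_total": 12000000},
    {"signature_id": 1000002, "checks": 90000,  "ticks_total": 310000000}
  ]
}
""")


def top_rules(profile, n=5, key="ticks_total"):
    """Return the n most expensive rules, most expensive first."""
    rules = profile.get("rules", [])
    return sorted(rules, key=lambda r: r.get(key, 0), reverse=True)[:n]


def share_of_total(profile, key="ticks_total"):
    """Map each signature_id to its fraction of the summed cost."""
    rules = profile.get("rules", [])
    total = sum(r.get(key, 0) for r in rules)
    if total == 0:
        return {}
    return {r["signature_id"]: r.get(key, 0) / total for r in rules}


if __name__ == "__main__":
    shares = share_of_total(SAMPLE)
    for rule in top_rules(SAMPLE, n=3):
        sid = rule["signature_id"]
        print(f"sid {sid}: {rule['ticks_total']} ticks, {shares[sid]:.1%} of total")
```

If one or two custom SIDs account for most of the total, those are the rules to rewrite first.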
Mellanox cards should be fine in general. You won’t see a huge difference compared to other cards like Intel or Napatech if a signature is the root cause, since that is a burden on the CPU and less on the NIC.
Also adding more cores or more performant cores might not solve the root cause.
Well, it depends on how much work the cores have to do, and it looks like they are reaching their limits. In htop you can enable an extra column (PROCESSOR) that shows which core each process or thread last ran on, so you could check if something stands out for the process(es) running on core 11.
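If you prefer a script over htop, the same information is available on Linux in /proc. Here is a rough sketch, assuming Linux and relying on field 39 ("processor", the CPU a task last executed on) of /proc/<pid>/task/<tid>/stat. The function names are made up for this example, and the result is only a snapshot, since threads without pinned affinity can move between cores:

```python
import os


def parse_stat(line):
    """Return (thread_name, last_cpu) from one /proc/<pid>/task/<tid>/stat line."""
    # The name sits in parentheses and may itself contain spaces or ')',
    # so locate the last ')' instead of splitting the whole line on whitespace.
    lpar = line.index("(")
    rpar = line.rindex(")")
    name = line[lpar + 1:rpar]
    rest = line[rpar + 1:].split()
    # rest[0] is field 3 (state), so field 39 (processor) is rest[36].
    return name, int(rest[36])


def threads_on_cpu(pid, cpu):
    """List (tid, name) for threads of pid that last ran on the given CPU."""
    found = []
    task_dir = f"/proc/{pid}/task"
    for tid in os.listdir(task_dir):
        try:
            with open(f"{task_dir}/{tid}/stat") as fh:
                name, last_cpu = parse_stat(fh.read())
        except (FileNotFoundError, ProcessLookupError):
            continue  # thread exited while we were scanning
        if last_cpu == cpu:
            found.append((int(tid), name))
    return sorted(found)
```

Calling threads_on_cpu(pid, 11) with the PID of the Suricata process would list the threads that last ran on core 11, which you can then compare with what htop shows.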