Unable to get TCP traffic to flow between Proxmox bridges using a Suricata AF_PACKET IPS-mode bridge

Hey hey people, I’m using Suricata 7.0.2 RELEASE on Ubuntu 22.04. I have compiled suricata from source, using an automated build script I put together. The configure flags are:

--enable-lua --enable-geoip --enable-hiredis --enable-dpdk

I've also included a picture of my network topology to help show what I'm trying to do.

With that all out of the way, I have a proxmox server. On that server, I have four linux bridges:

  • vmbr0 - bridged to my physical network 10.0.0.0/24, for internet access
  • vmbr1 - not bridged to any physical Proxmox interface; assigned 172.16.20.0/24 via pfSense DHCP services
  • vmbr2 - not bridged to any physical Proxmox interface; assigned 172.16.21.0/24 via pfSense DHCP services
  • vmbr3 - not bridged to any physical Proxmox interface; also assigned 172.16.21.0/24 – more on this in a minute

The pfSense VM sits between vmbr0, vmbr1, and vmbr2, providing routing, firewall, DHCP, and DNS services.

I created an Ubuntu 22.04 VM named IPS. Its management interface (ens18), which I use for network connectivity/patching/management, is on the vmbr1 Linux bridge; through it I was able to install the OS, perform updates, and retrieve packages. The ens19 and ens20 interfaces are configured as an AF_PACKET bridge running in inline mode.
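
For reference, the netplan on the IPS VM looks roughly like this (a minimal sketch, not my exact file; the important part is that only ens18 gets an address, while ens19 and ens20 come up without any IP configuration):

network:
  version: 2
  ethernets:
    ens18:
      dhcp4: true     # management interface on vmbr1, gets its lease from pfSense
    ens19:
      dhcp4: false    # inline bridge leg, no IP address
    ens20:
      dhcp4: false    # inline bridge leg, no IP address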

Here is the af-packet.yaml I made and included in my suricata.yaml file for my suricata daemon:

%YAML 1.1
---
af-packet:
  - interface: ens19
    threads: 1
    defrag: yes
    cluster-type: cluster_flow
    cluster-id: 98
    copy-mode: ips        # inline: anything received on ens19 is copied out ens20
    copy-iface: ens20
    buffer-size: 64535
    use-mmap: yes
  - interface: ens20
    threads: 1
    cluster-id: 97        # must differ from the peer interface's cluster-id
    defrag: yes
    cluster-type: cluster_flow
    copy-mode: ips        # and the reverse direction: ens20 -> ens19
    copy-iface: ens19
    buffer-size: 64535
    use-mmap: yes

And here are the options I am using to run the Suricata daemon:

-c /usr/local/etc/suricata/suricata.yaml -D --user=suricata --af-packet -k none

ens19 and ens20 are configured with the NOARP and PROMISC flags on daemon startup via the ip link command.
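
The commands run at startup are roughly the following (ens19 shown; ens20 gets the same treatment):

ip link set dev ens19 arp off       # set the NOARP flag
ip link set dev ens19 promisc on    # set the PROMISC flag
ip link set dev ens19 up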

TCP checksum offloading is also disabled on both interfaces via ethtool before the daemon starts.
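
Again roughly, per interface:

ethtool -K ens19 rx off tx off      # disable rx/tx checksum offloading
# (gro/gso/tso offloads are often disabled as well for inline use)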

The purpose of Suricata in this network diagram is to demonstrate fail-closed networking and to provide IDS services for traffic traversing between the vmbr2 and vmbr3 bridges. This setup and configuration has worked fine on a number of other hypervisors (VirtualBox, VMware Workstation, VMware Fusion, ESXi), but for some reason it isn't working as I expect with Proxmox.

In the past, I managed to make this work, but I don't remember what I did. This setup sort-of works, and I don't know where the failure is occurring. Here are my symptoms:

  • When I place a test VM (Ubuntu 22.04) in the vmbr3 network, I can:

    • retrieve an IP address via DHCP. This wouldn't be possible if the AF_PACKET bridge weren't working.
    • resolve DNS names. For example, nslookup google.com on my test VM works fine.
    • ping DNS names. For example, ping google.com resolves to an IP address, sends the ECHO REQUEST, and gets the ECHO REPLY packet back from the internet.

Of course, this is where problems occur:

TCP-based communications fail entirely. I used tcpdump on the test VM and on my pfSense VM (the filters are sketched after this list), and I noticed:

  • The pfSense VM sees the SYN packets from the 172.16.21.11 VM (test VM), and also sees the SYN/ACK response from hosts on the internet.

  • The test VM does NOT see the SYN/ACK response. I’m completely baffled as to why that is. This is the problem I’m trying to solve.
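
For anyone who wants to reproduce the observation, the tcpdump invocations were along these lines (interface names and the remote address are placeholders):

# on the test VM: the outbound SYN shows up, but no SYN/ACK ever arrives
tcpdump -ni ens18 'host 93.184.216.34 and tcp port 443'

# on pfSense: match SYN/ACK segments explicitly; these ARE seen coming back in
tcpdump -ni em1 'tcp[tcpflags] & (tcp-syn|tcp-ack) == (tcp-syn|tcp-ack)'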

Here are things I have done already:

  • Proxmox has a firewall that can be enabled at the Datacenter, Node, and individual VM levels. I have fully disabled these firewalls to ensure that Proxmox isn't dropping the connections for some reason.

  • All three virtual machines (pfSense, IPS, Test VM) are using e1000 drivers. I noticed strange behavior when using the virtIO drivers.

  • I have turned off checksum offloading on the pfSense VM. Additionally, all three pfSense network interfaces have a basic allow any/any firewall rule configured to ensure the pfSense firewall is NOT hindering traffic, and all rules that intentionally block RFC1918 networks are disabled. Remember that DHCP to the pfSense VM, as well as DNS requests and ICMP traffic out to the internet, all work perfectly fine.

  • Some have suggested that I make double sure that both iptables and ebtables on the Proxmox host have default ACCEPT policies, and that the rules have been flushed. I can confirm via iptables -L and ebtables -L that the policies are set to ACCEPT and there are ZERO rules present.

  • Some have suggested that I change some sysctl tunables. I have modified the following (commands sketched after this list):

    • net.bridge.bridge-nf-call-arptables = 0
    • net.bridge.bridge-nf-call-iptables = 0
    • net.ipv4.ip_forward = 1 on both the Proxmox node AND the IPS VM.
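
For completeness, the sysctl changes look roughly like this (applied with sysctl -w; a /etc/sysctl.d/ drop-in would make them persistent):

# on the Proxmox node (ip_forward is also set on the IPS VM)
sysctl -w net.bridge.bridge-nf-call-arptables=0
sysctl -w net.bridge.bridge-nf-call-iptables=0
sysctl -w net.ipv4.ip_forward=1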

Here is /etc/network/interfaces of my proxmox node:

Here is ip a output from my proxmox node:

Let me know if you need anything else. Thanks to anyone who offers their assistance, and happy new year.

-Tony

Not an answer, but does changing the driver help? See: IP packet handling issues in virtio-net on certain OS/kernel versions on KVM VM - #9 by chani

sweet googly moogly, it worked!

I had all of the network cards set to e1000 already, so I set the exception-policy to ignore in suricata.yaml and it worked! I had been wracking my brain over this for the past 24 hours.
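
For anyone who lands here later, the change is the global setting in suricata.yaml:

# Suricata 7.x: don't drop packets/flows when an exception condition is hit
exception-policy: ignore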

I might sleep peacefully tonight. My sincerest thanks for your suggestion.

I did a little more experimentation after I confirmed that setting the exception policy fixed things:

  • only the inline bridge interfaces (ens19 and ens20) of the IPS VM need to use the e1000 driver
  • I tried experimenting with settings mentioned in another post to see if I could get VirtIO to work, but unfortunately those config changes, followed by switching the NICs back to VirtIO, resulted in a loss of network connectivity. No idea why VirtIO reacts so badly to AF_PACKET bridging, but here we are.

If you can manage to add the settings I added to the libvirt XML file to your Proxmox KVM guests, you may be able to pass those settings there somehow (look up the documentation for qm.conf and check the file at /etc/pve/qemu-server/VMID.conf).

See: IP packet handling issues in virtio-net on certain OS/kernel versions on KVM VM - #9 by chani

However, I am not sure exactly how to add these settings; the NIC parameter does not allow it. Maybe via “args: …”. I verified that my fix works on another host as well, but I'm using a libvirt stack instead of Proxmox.

However, there might be an option superior to the above, and that one works with Proxmox: pass the NIC through to the guest. Assuming you have a separate management link to your Proxmox host, this might even be the better way to go performance-wise.
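
Roughly something like this on the Proxmox host (VMID and PCI address are placeholders, and IOMMU/passthrough must be enabled for it to work):

# find the PCI address of the NIC to hand to the guest
lspci | grep -i ethernet

# attach it to the VM (example VMID 100, example address 01:00.0)
qm set 100 -hostpci0 0000:01:00.0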