AF_PACKET using only one capture thread

Hi all!

We are trying to set up a high-performance Suricata instance in IPS/TAP mode. We started by using AF_PACKET mode with only one thread since, as explained in the documentation, the load-balancing algorithm has to be changed in order to use multiple threads.

We are now trying to configure eBPF load balancing to be able to use multiple threads. We managed to compile Suricata 6.0.2 with eBPF support and to load the eBPF load-balancing filter in AF_PACKET mode.
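
For anyone following along, a minimal build sketch (assuming clang/LLVM and the libbpf development headers are already installed; paths may differ on other systems):

    # Building Suricata 6.0.x with eBPF support: --enable-ebpf links against
    # libbpf, and --enable-ebpf-build compiles the bundled filters (lb.bpf,
    # filter.bpf, xdp_filter.bpf) with clang.
    ./configure --prefix=/usr --sysconfdir=/etc --localstatedir=/var \
                --enable-ebpf --enable-ebpf-build
    make && sudo make install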

By checking the load spread among the worker nodes, it seems the eBPF load balancing is working. The problem is that only one capture thread is receiving packets; checking the stats confirms it.
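
For anyone wanting to reproduce the check: with per-thread stats enabled, every capture thread gets its own capture.kernel_packets counter in stats.log, so something along these lines shows the spread (a sketch, assuming the default log location):

    # Per-thread capture counters; with working load balancing every capture
    # thread, not just one, should show a growing packet count.
    # Requires 'threads: yes' under the stats outputs in suricata.yaml.
    grep capture.kernel_packets /var/log/suricata/stats.log | tail -n 16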

We also changed the runmode from workers to autofp since, from what we understand of the documentation, autofp mode should be used when the load balancing of the packets is performed by the capture threads.
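
As a side note, the runmode can also be overridden on the command line instead of editing suricata.yaml, which makes it quick to compare the two modes:

    # Quick A/B test of runmodes without touching suricata.yaml:
    suricata --af-packet --runmode workers -vvv
    suricata --af-packet --runmode autofp -vvv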

Our current AF_PACKET config:

  af-packet:
    - interface: ens192
      threads: 2
      defrag: no
      cluster-type: cluster_ebpf
      ebpf-lb-file: /usr/libexec/suricata/ebpf/lb.bpf
      cluster-id: 98
      copy-mode: tap
      copy-iface: ens224
      buffer-size: 131072
      use-mmap: yes
      mmap-locked: yes
    - interface: ens224
      threads: 2
      cluster-id: 97
      defrag: no
      cluster-type: cluster_ebpf
      ebpf-lb-file: /usr/libexec/suricata/ebpf/lb.bpf
      copy-mode: tap
      copy-iface: ens192
      buffer-size: 131072
      use-mmap: yes
      mmap-locked: yes

What does your lb.bpf look like, what does the rest of the config look like, and how do you start Suricata?

Did you try af_packet in workers mode with just cluster_flow or cluster_qm?

Hello @Andreas_Herz , thank you very much for your response.

How can I check the lb.bpf file? I have bpftool working, but it is new to me and I can't find out how to inspect the lb.bpf program with it.

-rw-r--r-- 1 root root 1.9K Mar 26 10:35 /usr/libexec/suricata/ebpf/lb.bpf
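
For reference, since the file on disk is a compiled ELF object while the loaded copy lives in the kernel, the two can be inspected separately (a sketch; the readelf part is just standard binutils, and the id is the one reported further down by prog show):

    # The on-disk file is an ELF object produced by clang; list its sections:
    readelf -S /usr/libexec/suricata/ebpf/lb.bpf
    # Once Suricata has loaded it, the live program shows up in bpftool:
    bpftool prog show
    bpftool prog dump xlated id 162   # id as reported by 'bpftool prog show'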

We start Suricata with suricata --af-packet -vvv

The af-packet config is as follows:

af-packet:
    - interface: ens192
      threads: 8
      defrag: no
      cluster-type: cluster_ebpf
      ebpf-lb-file: /usr/libexec/suricata/ebpf/lb.bpf
      cluster-id: 98
      copy-mode: tap
      tpacket-v3: no
      copy-iface: ens224
      buffer-size: 131072
      use-mmap: yes
      mmap-locked: yes
    - interface: ens224
      threads: 8
      cluster-id: 97
      defrag: no
      cluster-type: cluster_ebpf
      ebpf-lb-file: /usr/libexec/suricata/ebpf/lb.bpf
      copy-mode: tap
      tpacket-v3: no
      copy-iface: ens192
      buffer-size: 131072
      use-mmap: yes
      mmap-locked: yes

The full suricata.yaml:

  %YAML 1.1
  ---
  vars:
    address-groups:
      HOME_NET: "[172.16.9.0/24]"

      EXTERNAL_NET: "!$HOME_NET"

      HTTP_SERVERS: "$HOME_NET"
      SMTP_SERVERS: "$HOME_NET"
      SQL_SERVERS: "$HOME_NET"
      DNS_SERVERS: "$HOME_NET"
      TELNET_SERVERS: "$HOME_NET"
      AIM_SERVERS: "$EXTERNAL_NET"
      DC_SERVERS: "$HOME_NET"
      DNP3_SERVER: "$HOME_NET"
      DNP3_CLIENT: "$HOME_NET"
      MODBUS_CLIENT: "$HOME_NET"
      MODBUS_SERVER: "$HOME_NET"
      ENIP_CLIENT: "$HOME_NET"
      ENIP_SERVER: "$HOME_NET"

    port-groups:
      HTTP_PORTS: "80"
      SHELLCODE_PORTS: "!80"
      ORACLE_PORTS: 1521
      SSH_PORTS: 22
      DNP3_PORTS: 20000
      MODBUS_PORTS: 502
      FILE_DATA_PORTS: "[$HTTP_PORTS,110,143]"
      FTP_PORTS: 21
      VXLAN_PORTS: 4789
      TEREDO_PORTS: 3544
  default-log-dir: /var/log/suricata/
  stats:
    enabled: yes
    interval: 20
  runmode: workers
  outputs:
    - fast:
        enabled: no
        filename: fast-%Y-%m-%d-%H:%M:%S.log
        append: yes
        rotate-interval: 120s
    - eve-log:
        enabled: yes
        filetype: regular #regular|syslog|unix_dgram|unix_stream|redis
        filename: eve-%Y-%m-%d-%H:%M:%S.json
        pcap-file: false
        community-id: false
        community-id-seed: 0
        rotate-interval: 600s
        xff:
          enabled: no
          mode: extra-data
          deployment: reverse
          header: X-Forwarded-For

        types:
          - alert:
              tagged-packets: yes
          - anomaly:
              enabled: yes
              types:
          - http:
              extended: no     # enable this for extended logging information
          - dns:
          - tls:
              extended: yes     # enable this for extended logging information
          - files:
              force-magic: no   # force logging magic on all logged files
          - smtp:
          - ftp
          - nfs
          - smb
          - tftp
          - ikev2
          - krb5
          - snmp
          - dhcp:
              enabled: yes
              extended: no
          - ssh
          - stats:
              totals: yes      # stats for all threads merged together
              threads: no       # per thread stats
              deltas: no        # include delta values
          #- flow
          #- netflow
    - unified2-alert:
        enabled: no
    - http-log:
        enabled: no
        filename: http.log
        append: yes
    - tls-log:
        enabled: no  # Log TLS connections.
        filename: tls.log # File to store TLS logs.
        append: yes
    - tls-store:
        enabled: no
    - pcap-log:
        enabled: no
        filename: log.pcap
        limit: 1000mb
        max-files: 2000
        compression: none

        mode: normal # normal, multi or sguil.
        use-stream-depth: no #If set to "yes" packets seen after reaching stream inspection depth are ignored. "no" logs all packets
        honor-pass-rules: no # If set to "yes", flows in which a pass rule matched will stop being logged.
    - alert-debug:
        enabled: no
        filename: alert-debug.log
        append: yes
    - alert-prelude:
        enabled: no
        profile: suricata
        log-packet-content: no
        log-packet-header: yes
    - stats:
        enabled: no
        filename: stats.log
        append: yes       # append to file (yes) or overwrite it (no)
        totals: yes       # stats for all threads merged together
        threads: no       # per thread stats
        rotate-interval: 14400s
    - syslog:
        enabled: no
        facility: local5
    - drop:
        enabled: no
    - file-store:
        version: 2
        enabled: no
        xff:
          enabled: no
          mode: extra-data
          deployment: reverse
          header: X-Forwarded-For
    - file-store:
        enabled: no
    - tcp-data:
        enabled: no
        type: file
        filename: tcp-data.log
    - http-body-data:
        enabled: no
        type: file
        filename: http-data.log
    - lua:
        enabled: no
        scripts:
  logging:
    default-log-level: notice
    default-output-filter:
    outputs:
    - console:
        enabled: yes
    - file:
        enabled: yes
        level: info
        filename: suricata.log
    - syslog:
        enabled: no
        facility: local5
        format: "[%i] <%d> -- "
  af-packet:
    - interface: ens192
      threads: 8
      defrag: no
      cluster-type: cluster_ebpf
      ebpf-lb-file: /usr/libexec/suricata/ebpf/lb.bpf
      cluster-id: 98
      copy-mode: tap
      tpacket-v3: no
      copy-iface: ens224
      buffer-size: 131072
      use-mmap: yes
      mmap-locked: yes
      #ring-size: 100000
    - interface: ens224
      threads: 8
      cluster-id: 97
      defrag: no
      cluster-type: cluster_ebpf
      ebpf-lb-file: /usr/libexec/suricata/ebpf/lb.bpf
      copy-mode: tap
      tpacket-v3: no
      copy-iface: ens192
      buffer-size: 131072
      use-mmap: yes
      mmap-locked: yes
      #ring-size: 100000
  pcap:
    - interface: eth0
    - interface: default
  pcap-file:
    checksum-checks: auto
  app-layer:
    protocols:
      krb5:
        enabled: yes
      snmp:
        enabled: yes
      ikev2:
        enabled: yes
      tls:
        enabled: yes
        detection-ports:
          dp: 443

      dcerpc:
        enabled: yes
      ftp:
        enabled: yes
      rdp:
      ssh:
        enabled: yes
      smtp:
        enabled: yes
        raw-extraction: no
        mime:
          decode-mime: yes
          decode-base64: yes
          decode-quoted-printable: yes
          header-value-depth: 2000
          extract-urls: yes
          body-md5: no
        inspected-tracker:
          content-limit: 100000
          content-inspect-min-size: 32768
          content-inspect-window: 4096
      imap:
        enabled: detection-only
      smb:
        enabled: yes
        detection-ports:
          dp: 139, 445

      nfs:
        enabled: yes
      tftp:
        enabled: yes
      dns:

        tcp:
          enabled: yes
          detection-ports:
            dp: 53
        udp:
          enabled: yes
          detection-ports:
            dp: 53
      http:
        enabled: yes
        libhtp:
          default-config:
            personality: IDS
            request-body-limit: 100kb
            response-body-limit: 100kb
            request-body-minimal-inspect-size: 32kb
            request-body-inspect-window: 4kb
            response-body-minimal-inspect-size: 40kb
            response-body-inspect-window: 16kb
            response-body-decompress-layer-limit: 2
            http-body-inline: auto
            swf-decompression:
              enabled: yes
              type: both
              compress-depth: 0
              decompress-depth: 0
            double-decode-path: no
            double-decode-query: no

          server-config:
      modbus:

        enabled: no
        detection-ports:
          dp: 502
        stream-depth: 0
      dnp3:
        enabled: no
        detection-ports:
          dp: 20000
      enip:
        enabled: no
        detection-ports:
          dp: 44818
          sp: 44818

      ntp:
        enabled: yes

      dhcp:
        enabled: yes
      sip:
  asn1-max-frames: 256

  coredump:
    max-dump: unlimited
  host-mode: auto
  unix-command:
    enabled: auto

  legacy:
    uricontent: enabled
  engine-analysis:
    rules-fast-pattern: yes
    rules: yes
  pcre:
    match-limit: 3500
    match-limit-recursion: 1500
  host-os-policy:
    windows: [0.0.0.0/0]
    bsd: []
    bsd-right: []
    old-linux: []
    linux: []
    old-solaris: []
    solaris: []
    hpux10: []
    hpux11: []
    irix: []
    macos: []
    vista: []
    windows2k3: []

  defrag:
    memcap: 32mb
    hash-size: 65536
    trackers: 65535 # number of defragmented flows to follow
    max-frags: 65535 # number of fragments to keep (higher than trackers)
    prealloc: yes
    timeout: 60

  flow:
    memcap: 1024mb
    hash-size: 65536
    prealloc: 10000
    emergency-recovery: 30
  vlan:
    use-for-tracking: true

  flow-timeouts:

    default:
      new: 30
      established: 300
      closed: 0
      bypassed: 100
      emergency-new: 10
      emergency-established: 100
      emergency-closed: 0
      emergency-bypassed: 50
    tcp:
      new: 60
      established: 600
      closed: 60
      bypassed: 100
      emergency-new: 5
      emergency-established: 100
      emergency-closed: 10
      emergency-bypassed: 50
    udp:
      new: 30
      established: 300
      bypassed: 100
      emergency-new: 10
      emergency-established: 100
      emergency-bypassed: 50
    icmp:
      new: 30
      established: 300
      bypassed: 100
      emergency-new: 10
      emergency-established: 100
      emergency-bypassed: 50
  stream:
    memcap: 1024mb
    checksum-validation: yes      # reject wrong csums
    inline: auto                  # auto will use inline mode in IPS mode, yes or no set it statically
    reassembly:
      memcap: 256mb
      depth: 1mb                  # reassemble 1mb into a stream
      toserver-chunk-size: 2560
      toclient-chunk-size: 2560
      randomize-chunk-size: yes
  host:
    hash-size: 4096
    prealloc: 1000
    memcap: 32mb

  decoder:
    teredo:
      enabled: true
      ports: $TEREDO_PORTS # syntax: '[3544, 1234]' or '3533' or 'any'.
    vxlan:
      enabled: true
      ports: $VXLAN_PORTS # syntax: '8472, 4789'
    erspan:
      typeI:
        enabled: false
  detect:
    profile: high
    sgh-mpm-context: auto
    inspection-recursion-limit: 3000

    prefilter:
      default: mpm
    grouping:

    profiling:
      grouping:
        dump-to-disk: false
        include-rules: false      # very verbose
        include-mpm-stats: false

  mpm-algo: hs #ac hs

  spm-algo: hs #bm hs
  threading:
    set-cpu-affinity: yes
    cpu-affinity:
      - management-cpu-set:
          cpu: [ 0 ]  # include only these CPUs in affinity settings
      - receive-cpu-set:
          cpu: [ 1-3 ]  # include only these CPUs in affinity settings
      - worker-cpu-set:
          cpu: [ 1-7 ]
    detect-thread-ratio: 1

  luajit:
    states: 128
  profiling:
    rules:
      enabled: yes
      filename: rule_perf.log
      append: yes
      limit: 10
      json: yes
    keywords:
      enabled: yes
      filename: keyword_perf.log
      append: yes

    prefilter:
      enabled: yes
      filename: prefilter_perf.log
      append: yes
    rulegroups:
      enabled: yes
      filename: rule_group_perf.log
      append: yes
    packets:
      enabled: yes
      filename: packet_stats.log
      append: yes
      csv:
        enabled: no
        filename: packet_stats.csv
    locks:
      enabled: no
      filename: lock_stats.log
      append: yes

    pcap-log:
      enabled: no
      filename: pcaplog_stats.log
      append: yes
  nfq:
  nflog:
    - group: 2
      buffer-size: 18432
    - group: default
      qthreshold: 1
      qtimeout: 100
      max-size: 20000
  capture:
  netmap:
  - interface: eth2
  - interface: default
  pfring:
    - interface: eth0
      threads: auto
      cluster-id: 99
      cluster-type: cluster_flow
    - interface: default
  ipfw:


  napatech:
      streams: ["0-3"]
      auto-config: yes
      ports: [all]
      hashmode: hash5tuplesorted

  default-rule-path: /var/lib/suricata/rules

  rule-files:
    - suricata.rules

  classification-file: /etc/suricata/classification.config
  reference-config-file: /etc/suricata/reference.config

  max-pending-packets: 10000

I managed to get some more info with bpftool. It seems some BPF programs are loaded automatically at startup, and that was driving me crazy:

2: cgroup_skb  tag 7be49e3934a125ba  gpl
        loaded_at 2021-04-05T17:22:22+0200  uid 0
        xlated 296B  jited 200B  memlock 4096B  map_ids 2,3
3: cgroup_skb  tag 2a142ef67aaad174  gpl
        loaded_at 2021-04-05T17:22:22+0200  uid 0
        xlated 296B  jited 200B  memlock 4096B  map_ids 2,3
4: cgroup_skb  tag 7be49e3934a125ba  gpl
        loaded_at 2021-04-05T17:22:22+0200  uid 0
        xlated 296B  jited 200B  memlock 4096B  map_ids 4,5
5: cgroup_skb  tag 2a142ef67aaad174  gpl
        loaded_at 2021-04-05T17:22:22+0200  uid 0
        xlated 296B  jited 200B  memlock 4096B  map_ids 4,5
6: cgroup_skb  tag 7be49e3934a125ba  gpl
        loaded_at 2021-04-05T17:22:26+0200  uid 0
        xlated 296B  jited 200B  memlock 4096B  map_ids 6,7
7: cgroup_skb  tag 2a142ef67aaad174  gpl
        loaded_at 2021-04-05T17:22:26+0200  uid 0
        xlated 296B  jited 200B  memlock 4096B  map_ids 6,7
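
Those cgroup_skb entries turned out not to be Suricata's. A quick way to cut through that noise is to filter on the program type, since Suricata's lb program is a socket_filter:

    # Show only socket-filter programs; -A2 keeps the two detail lines
    # that follow each entry in the 'bpftool prog show' output.
    bpftool prog show | grep -A2 socket_filter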

After some digging, I managed to find out that the eBPF programs initiated by Suricata are indeed being loaded:

162: socket_filter  name lb  tag aea71c9d2ee3224c  gpl
        loaded_at 2021-04-05T17:36:01+0200  uid 0
        xlated 1792B  jited 972B  memlock 4096B
166: socket_filter  name lb  tag aea71c9d2ee3224c  gpl
        loaded_at 2021-04-05T17:36:02+0200  uid 0
        xlated 1792B  jited 972B  memlock 4096B

PS: We are running Suricata within Kubernetes on Ubuntu 18.04 with the latest kernel:

5.4.0-70-generic #78~18.04.1-Ubuntu SMP Sat Mar 20 14:10:07 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux
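
If it helps for comparison, the BPF-related options the kernel was built with can be double-checked against its config (a sketch for Ubuntu, where the config ships under /boot):

    # Confirm the running kernel was built with the BPF features needed here:
    grep -E 'CONFIG_BPF(_SYSCALL|_JIT|_EVENTS)?=' /boot/config-"$(uname -r)"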