AF_XDP cannot run normally after building the Suricata source code with the XDP flag

Hello Suricata developers,

I found an issue when starting Suricata after building the source code with libbpf support. The build command, configuration, and resulting error output are below.

  • Build configure command:
./configure --with-libhtp-includes=/disk-01/workspacecpp/gdrel-rengine/3rd-libs/libhtp/build/include --with-libhtp-libraries=/disk-01/workspacecpp/gdrel-rengine/3rd-libs/libhtp/build/lib/ --prefix=/disk-01/workspacecpp/gdrel-rengine/build --enable-debug --enable-ebpf --enable-ebpf-build --with-clang=/usr/bin/clang 

  • Relevant sections of suricata.yaml:
af-packet:
  - interface: eth1
    # Number of receive threads. "auto" uses the number of cores
    threads: 1
    # Default clusterid. AF_PACKET will load balance packets based on flow.
    cluster-id: 99
    # Default AF_PACKET cluster type. AF_PACKET can load balance per flow or per hash.
    # This is only supported for Linux kernel > 3.1
    # possible values are:
    #  * cluster_flow: all packets of a given flow are sent to the same socket
    #  * cluster_cpu: all packets treated in kernel by a CPU are sent to the same socket
    #  * cluster_qm: all packets linked by network card to a RSS queue are sent to the same
    #  socket. Requires at least Linux 3.14.
    #  * cluster_ebpf: eBPF file load balancing. See doc/userguide/capture-hardware/ebpf-xdp.rst for
    #  more info.
    # Recommended modes are cluster_flow on most boxes and cluster_cpu or cluster_qm on system
    # with capture card using RSS (requires cpu affinity tuning and system IRQ tuning)
    cluster-type: cluster_qm
    # In some fragmentation cases, the hash can not be computed. If "defrag" is set
    # to yes, the kernel will do the needed defragmentation before sending the packets.
    defrag: yes
    bypass: yes
    ebpf-filter-file: /disk-01/workspacecpp/gdrel-rengine/ebpf/xdp_filter.bpf
    # To use the ring feature of AF_PACKET, set 'use-mmap' to yes
    use-mmap: yes
    # Lock memory map to avoid it being swapped. Be careful that over
    # subscribing could lock your system
    #mmap-locked: yes
    # Use tpacket_v3 capture mode, only active if use-mmap is true
    # Don't use it in IPS or TAP mode as it causes severe latency
    #tpacket-v3: yes
    # Ring size will be computed with respect to "max-pending-packets" and number
    # of threads. You can set manually the ring size in number of packets by setting
    # the following value. If you are using the flow "cluster-type" and have a really
    # network-intensive single flow, you may want to set the "ring-size" independently
    # of the number of threads:
    ring-size: 2048
    # Block size is used by tpacket_v3 only. It should be set to a value high enough to contain
    # a decent number of packets. Size is in bytes so please consider your MTU. It should be
    # a power of 2 and it must be multiple of page size (usually 4096).
    #block-size: 32768
    # tpacket_v3 block timeout: an open block is passed to userspace if it is not
    # filled after block-timeout milliseconds.
    #block-timeout: 10
    # On busy systems, set it to yes to help recover from a packet drop
    # phase. This will result in some packets (at max a ring flush) not being inspected.
    #use-emergency-flush: yes
    # recv buffer size, increased value could improve performance
    # buffer-size: 32768
    # Set to yes to disable promiscuous mode
    # disable-promisc: no
    # Choose checksum verification mode for the interface. At the moment
    # of the capture, some packets may have an invalid checksum due to
    # the checksum computation being offloaded to the network card.
    # Possible values are:
    #  - kernel: use indication sent by kernel for each packet (default)
    #  - yes: checksum validation is forced
    #  - no: checksum validation is disabled
    #  - auto: Suricata uses a statistical approach to detect when
    #  checksum off-loading is used.
    # Warning: 'capture.checksum-validation' must be set to yes to have any validation
    #checksum-checks: kernel
    # BPF filter to apply to this interface. The pcap filter syntax applies here.
    #bpf-filter: port 80 or udp
    # You can use the following variables to activate AF_PACKET tap or IPS mode.
    # If copy-mode is set to ips or tap, the traffic coming to the current
    # interface will be copied to the copy-iface interface. If 'tap' is set, the
    # copy is complete. If 'ips' is set, the packet matching a 'drop' action
    # will not be copied.
    #copy-mode: ips
    #copy-iface: eth1
    #  For eBPF and XDP setup including bypass, filter and load balancing, please
    #  see doc/userguide/capture-hardware/ebpf-xdp.rst for more info.

  # Put default values here. These will be used for an interface that is not
  # in the list above.
  - interface: eth1
    #threads: auto
    #use-mmap: no
    #tpacket-v3: yes

# Linux high speed af-xdp capture support         
af-xdp:
  - interface: eth1
    # Number of receive threads. "auto" uses the lesser of the number
    # of cores and RX queues
    threads: 1
    #disable-promisc: false
    # XDP_DRV mode can be chosen when the driver supports XDP
    # XDP_SKB mode can be chosen when the driver does not support XDP
    # Possible values are:
    #  - drv: enable XDP_DRV mode
    #  - skb: enable XDP_SKB mode
    #  - none: disable (kernel in charge of applying mode)
    force-xdp-mode: drv
    # During socket binding the kernel will attempt zero-copy, if this
    # fails it will fallback to copy. If this fails, the bind fails.
    # The bind can be explicitly configured using the option below.
    # If configured, the bind will fail if not successful (no fallback).
    # Possible values are:
    #  - zero: enable zero-copy mode
    #  - copy: enable copy mode
    #  - none: disable (kernel in charge of applying mode)
    force-bind-mode: zero
    # Memory alignment mode can vary between two modes, aligned and
    # unaligned chunk modes. By default, aligned chunk mode is selected.
    # select 'yes' to enable unaligned chunk mode.
    # Note: unaligned chunk mode uses hugepages, so the required number
    # of pages must be available.
    #mem-unaligned: no
    # The following options configure the prefer-busy-polling socket
    # options. The polling time and budget can be edited here.
    # Possible values are:
    #  - yes: enable (default)
    #  - no: disable
    #enable-busy-poll: yes
    # busy-poll-time sets the approximate time in microseconds to busy
    # poll on a blocking receive when there is no data.
    #busy-poll-time: 20
    # busy-poll-budget is the budget allowed for packet batches
    #busy-poll-budget: 64
    # These two tunables are used to configure the Linux OS's NAPI
    # context. Their purpose is to defer enabling of interrupts and
    # instead schedule the NAPI context from a watchdog timer.
    # The softirq NAPI will exit early, allowing busy polling to be
    # performed. Successfully setting these tunables alongside busy-polling
    # should improve performance.
    # Defaults are:
    #gro-flush-timeout: 2000000
    #napi-defer-hard-irq: 2


  • Error output after running `sudo gdb build/bin/suricata -c suricata.yaml -I eth1`:
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
17/12/2022 -- 00:49:37 - <Notice> - This is Suricata version 7.0.0-beta1 RELEASE running in SYSTEM mode
17/12/2022 -- 00:49:37 - <Notice> - Protocol detector and parser disabled for SSH.
17/12/2022 -- 00:49:37 - <Warning> - [ERRCODE: SC_ERR_FOPEN(44)] - Error opening file: "/disk-01/workspacecpp/gdrel-rengine/build/etc/suricata//threshold.config": No such file or directory
libbpf: elf: legacy map definitions in 'maps' section are not supported by libbpf v1.0+
17/12/2022 -- 00:49:37 - <Error> - [ERRCODE: SC_ERR_INVALID_VALUE(130)] - Unable to load eBPF objects in '/disk-01/workspacecpp/gdrel-rengine/ebpf/xdp_filter.bpf': Operation not supported
17/12/2022 -- 00:49:37 - <Warning> - [ERRCODE: SC_ERR_INVALID_VALUE(130)] - Error when loading eBPF filter file
[New Thread 0x7ffff68c0700 (LWP 100162)]
17/12/2022 -- 00:49:37 - <Error> - [ERRCODE: SC_ERR_INVALID_VALUE(130)] - Can't find eBPF map fd for 'flow_table_v6'
[New Thread 0x7ffff60bf700 (LWP 100163)]
[New Thread 0x7ffff58be700 (LWP 100164)]
[New Thread 0x7ffff50bd700 (LWP 100165)]
[New Thread 0x7ffff48bc700 (LWP 100166)]
[New Thread 0x7fffe7fff700 (LWP 100167)]
17/12/2022 -- 00:49:37 - <Notice> - Threads created -> W: 1 FM: 1 FR: 1   Engine started.

Can you open a ticket for this? It seems some of the eBPF code needs updating.

By the way, this is still about AF_PACKET, not AF_XDP (which, confusingly, is something different from AF_PACKET + XDP).

Hello @vjulien,

I found that the issue comes from the legacy "maps" section declaration when building the Suricata source code against libbpf 1.0.x or newer. After switching the Suricata eBPF sources to the new BPF map structure declaration shown in the libbpf migration example [1], they compile successfully.

[1] Libbpf: the road to v1.0 · libbpf/libbpf Wiki · GitHub


Please open a ticket at https://redmine.openinfosecfoundation.org/ as a bug, with all the details.