SIGABRT w/ Suricata 7.0.4/7.0.5 in af-packet mode

Hi,

Since 7.0.4, I have seen multiple core dumps caused by SIGABRT. This happened on different hosts (all running Debian 12.5), on different interfaces, and with different numbers of threads and CPU cores.

Core was generated by `/usr/bin/suricata -c /etc/suricata/suricata.yaml --pidfile /var/run/suricata.'.
Program terminated with signal SIGABRT, Aborted.
#0  __pthread_kill_implementation (threadid=<optimized out>, 
    signo=signo@entry=6, no_tid=no_tid@entry=0) at ./nptl/pthread_kill.c:44
44	./nptl/pthread_kill.c: No such file or directory.
[Current thread is 1 (Thread 0x7faa1f7ff6c0 (LWP 64476))]
(gdb) bt
#0  __pthread_kill_implementation (threadid=<optimized out>, 
    signo=signo@entry=6, no_tid=no_tid@entry=0) at ./nptl/pthread_kill.c:44
#1  0x00007fab2acf2e8f in __pthread_kill_internal (signo=6, 
    threadid=<optimized out>) at ./nptl/pthread_kill.c:78
#2  0x00007fab2aca3fb2 in __GI_raise (sig=sig@entry=6)
    at ../sysdeps/posix/raise.c:26
#3  0x00007fab2ac8e472 in __GI_abort () at ./stdlib/abort.c:79
#4  0x00007fab2ace7430 in __libc_message (action=action@entry=do_abort, 
    fmt=fmt@entry=0x7fab2ae01459 "%s\n") at ../sysdeps/posix/libc_fatal.c:155
#5  0x00007fab2acfc7aa in malloc_printerr (
    str=str@entry=0x7fab2ae04590 "malloc(): invalid next size (unsorted)")
    at ./malloc/malloc.c:5660
#6  0x00007fab2acff8e4 in _int_malloc (av=av@entry=0x7fab18000030, 
    bytes=bytes@entry=36865) at ./malloc/malloc.c:4001
#7  0x00007fab2ad00362 in _int_realloc (av=av@entry=0x7fab18000030, 
    oldp=oldp@entry=0x7f8cf4ffd710, oldsize=oldsize@entry=34832, 
    nb=nb@entry=36880) at ./malloc/malloc.c:4874
#8  0x00007fab2ad0120f in __GI___libc_realloc (
    oldmem=oldmem@entry=0x7f8cf4ffd720, bytes=bytes@entry=36864)
    at ./malloc/malloc.c:3489
#9  0x000056029421562e in SCReallocFunc (ptr=ptr@entry=0x7f8cf4ffd720, 
    size=size@entry=36864) at ./src/util-mem.c:46
#10 0x00005602942f1c45 in StreamTcpReassembleRealloc (optr=0x7f8cf4ffd720, 
    orig_size=34816, size=36864) at ./src/stream-tcp-reassemble.c:236
#11 0x000056029432149b in GrowRegionToSize (sb=0x7faa06beb3c8, size=35046, 
    region=0x7faa06beb3c8, cfg=0x560294af2198 <stream_config+56>)
    at ./src/util-streaming-buffer.c:722
#12 GrowToSize (size=35046, cfg=0x560294af2198 <stream_config+56>, 
    sb=0x7faa06beb3c8) at ./src/util-streaming-buffer.c:746
#13 StreamingBufferInsertAt (sb=sb@entry=0x7faa06beb3c8, 
    cfg=0x560294af2198 <stream_config+56>, seg=seg@entry=0x7faa12710ba4, 
    data=0x7f9cf7604000 "", data_len=<optimized out>, offset=<optimized out>)
    at ./src/util-streaming-buffer.c:1526
#14 0x00005602942f0d99 in InsertSegmentDataCustom (data_len=1298, 
    data=0x7f9cf7604000 "", seg=0x7faa12710b80, stream=0x7faa06beb390)
    at ./src/stream-tcp-list.c:99
#15 StreamTcpReassembleInsertSegment (tv=tv@entry=0x56036d7d8090, 
    ra_ctx=ra_ctx@entry=0x7faa11b7f450, stream=stream@entry=0x7faa06beb390, 
    seg=0x7faa12710b80, p=p@entry=0x7faa11b46580, pkt_seq=<optimized out>, 
    pkt_data=<optimized out>, pkt_datalen=<optimized out>)
    at ./src/stream-tcp-list.c:654
#16 0x00005602942f38a1 in StreamTcpReassembleHandleSegmentHandleData (
    tv=tv@entry=0x56036d7d8090, ra_ctx=ra_ctx@entry=0x7faa11b7f450, 
    ssn=ssn@entry=0x7faa06beb380, stream=stream@entry=0x7faa06beb390, 
    p=p@entry=0x7faa11b46580)
    at /usr/include/x86_64-linux-gnu/bits/byteswap.h:52
#17 0x00005602942f3aba in StreamTcpReassembleHandleSegment (
    tv=tv@entry=0x56036d7d8090, ra_ctx=0x7faa11b7f450, 
    ssn=ssn@entry=0x7faa06beb380, stream=0x7faa06beb390, 
    p=p@entry=0x7faa11b46580) at ./src/stream-tcp-reassemble.c:2016
#18 0x00005602942ea17e in HandleEstablishedPacketToClient (
    stt=<optimized out>, p=<optimized out>, ssn=<optimized out>, 
    tv=<optimized out>) at ./src/stream-tcp.c:2777
#19 StreamTcpPacketStateEstablished (tv=0x56036d7d8090, p=0x7faa11b46580, 
    stt=stt@entry=0x7faa11b7f1a0, ssn=0x7faa06beb380)
    at ./src/stream-tcp.c:3223
#20 0x00005602942ec598 in StreamTcpStateDispatch (tv=tv@entry=0x56036d7d8090, 
    p=p@entry=0x7faa11b46580, stt=stt@entry=0x7faa11b7f1a0, 
    ssn=ssn@entry=0x7faa06beb380, state=<optimized out>)
    at ./src/stream-tcp.c:5236
#21 0x00005602942edeb6 in StreamTcpPacket (tv=tv@entry=0x56036d7d8090, 
    p=p@entry=0x7faa11b46580, stt=stt@entry=0x7faa11b7f1a0, 
    pq=pq@entry=0x7faa11b652a0) at ./src/stream-tcp.c:5433
#22 0x00005602942ef0f9 in StreamTcp (tv=tv@entry=0x56036d7d8090, 
    p=p@entry=0x7faa11b46580, data=0x7faa11b7f1a0, pq=pq@entry=0x7faa11b652a0)
    at ./src/stream-tcp.c:5745
#23 0x00005602942ac575 in FlowWorkerStreamTCPUpdate (
    tv=tv@entry=0x56036d7d8090, fw=fw@entry=0x7faa11b65270, 
    p=p@entry=0x7faa11b46580, 
    detect_thread=detect_thread@entry=0x7faa06def450, 
    timeout=timeout@entry=false) at ./src/flow-worker.c:391
#24 0x00005602942acae7 in FlowWorker (tv=0x56036d7d8090, p=0x7faa11b46580, 
    data=0x7faa11b65270) at ./src/flow-worker.c:619
#25 0x000056029420295f in TmThreadsSlotVarRun (tv=tv@entry=0x56036d7d8090, 
    p=p@entry=0x7faa11b46580, slot=<optimized out>) at ./src/tm-threads.c:135
#26 0x00005602942d9f24 in TmThreadsSlotProcessPkt (tv=0x56036d7d8090, 
    s=<optimized out>, p=0x7faa11b46580) at ./src/tm-threads.h:200
#27 0x00005602942da262 in AFPParsePacketV3 (pbd=0x7f9cf7600000, 
    ppd=0x7f9cf7603f68, ptv=0x7faa11b46fc0) at ./src/source-af-packet.c:1013
#28 AFPWalkBlock (pbd=0x7f9cf7600000, ptv=0x7faa11b46fc0)
    at ./src/source-af-packet.c:1032
#29 AFPReadFromRingV3 (ptv=ptv@entry=0x7faa11b46fc0)
    at ./src/source-af-packet.c:1079
#30 0x00005602942dade4 in ReceiveAFPLoop (tv=0x56036d7d8090, 
    data=0x7faa11b46fc0, slot=<optimized out>) at ./src/source-af-packet.c:1431
#31 0x00005602942040b4 in TmThreadsSlotPktAcqLoop (td=0x56036d7d8090)
    at ./src/tm-threads.c:318
#32 0x00007fab2acf1134 in start_thread (arg=<optimized out>)
    at ./nptl/pthread_create.c:442
#33 0x00007fab2ad717dc in clone3 ()
    at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81

I usually compile Suricata from the Debian source package with some modifications, but these are completely unrelated to the code in the backtrace above. The configure options are:

CONFIGURE_ARGS = --enable-af-packet --enable-xdp --enable-dpdk \
	--enable-gccprotect --disable-gccmarch-native \
	--with-libnss-includes=/usr/include/nss --with-libnss-libraries=/usr/lib/$(DEB_HOST_MULTIARCH) \
	--with-libnspr-includes=/usr/include/nspr --with-libnspr-libraries=/usr/lib/$(DEB_HOST_MULTIARCH) \
	--with-libevent-includes=/usr/include --with-libevent-libraries=/usr/lib/$(DEB_HOST_MULTIARCH) \
	--disable-coccinelle \
	--enable-geoip --enable-hiredis \
	--enable-non-bundled-htp \
	--disable-suricata-update \
	$(ENABLE_LUAJIT) \
	$(ENABLE_HYPERSCAN) \
	$(ENABLE_UNITTESTS) \
	$(ENABLE_EBPF)

Has anyone else experienced this, and what can I do to investigate it further?

Thanks, maja

Are you able to rebuild with address sanitizer enabled? Looks like a memory corruption based on the bt. ASAN usually catches things a bit earlier.

Thanks, I recompiled it w/ "--enable-asan".

Unfortunately, it takes a few days until a core dump occurs.
I will report back if I have one.

Ok, --enable-asan did not have any effect. I had to set the CFLAGS to -fsanitize=address,undefined -Wformat -Werror=format-security -Werror=array-bounds -g -O0.

With that, the address sanitizer now complains about a heap buffer overflow. However, it is in code we added ourselves, which base64-encodes and dumps the first few bytes of HTTP request and response headers.

Thus, even if this is not solved in the narrow sense, I will mark the topic as solved.

But thanks anyway for pointing me at the ASAN feature of gcc.

After running 7.0.6 for about 4 weeks without any problems, at multiple gigabits per second and with more than 28 cores in af-packet mode, I experienced some more core dumps. Some were due to SIGABRT and others due to SIGSEGV.

Unfortunately, I cannot run this with additional debugging measures, since it takes so long until a core dump occurs. I do not expect anyone to be able to fix this just by looking at the backtrace, but I cannot provide more at the moment.

This is a gdb BT for a coredump due to a SIGABRT.

[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Core was generated by `/usr/bin/suricata -c /etc/suricata/suricata.yaml --pidfile /var/run/suricata.pid'.
Program terminated with signal SIGABRT, Aborted.
#0  __pthread_kill_implementation (threadid=<optimized out>, 
    signo=signo@entry=6, no_tid=no_tid@entry=0) at ./nptl/pthread_kill.c:44
44    ./nptl/pthread_kill.c: No such file or directory.
[Current thread is 1 (Thread 0x7f8cfa8e06c0 (LWP 4148687))]
(gdb) bt
#0  __pthread_kill_implementation (threadid=<optimized out>, 
    signo=signo@entry=6, no_tid=no_tid@entry=0) at ./nptl/pthread_kill.c:44
#1  0x00007f8d07af2e8f in __pthread_kill_internal (signo=6, 
    threadid=<optimized out>) at ./nptl/pthread_kill.c:78
#2  0x00007f8d07aa3fb2 in __GI_raise (sig=sig@entry=6)
    at ../sysdeps/posix/raise.c:26
#3  0x00007f8d07a8e472 in __GI_abort () at ./stdlib/abort.c:79
#4  0x00007f8d07ae7430 in __libc_message (action=action@entry=do_abort, 
    fmt=fmt@entry=0x7f8d07c01459 "%s\n") at ../sysdeps/posix/libc_fatal.c:155
#5  0x00007f8d07afc7aa in malloc_printerr (
    str=str@entry=0x7f8d07c04590 "malloc(): invalid next size (unsorted)")
    at ./malloc/malloc.c:5660
#6  0x00007f8d07aff8e4 in _int_malloc (av=av@entry=0x7f8cf4000030, 
    bytes=bytes@entry=4097) at ./malloc/malloc.c:4001
#7  0x00007f8d07b00362 in _int_realloc (av=av@entry=0x7f8cf4000030, 
    oldp=oldp@entry=0x7f7c8e0efeb0, oldsize=oldsize@entry=2064, 
    nb=nb@entry=4112) at ./malloc/malloc.c:4874
#8  0x00007f8d07b0120f in __GI___libc_realloc (oldmem=0x7f7c8e0efec0, 
    bytes=4096) at ./malloc/malloc.c:3489
#9  0x0000563407b4162e in SCReallocFunc ()
#10 0x0000563407c1e915 in StreamTcpReassembleRealloc ()
#11 0x0000563407c4ddf8 in StreamingBufferInsertAt ()
#12 0x0000563407c1da69 in StreamTcpReassembleInsertSegment ()
#13 0x0000563407c20571 in StreamTcpReassembleHandleSegmentHandleData ()
#14 0x0000563407c2078a in StreamTcpReassembleHandleSegment ()
#15 0x0000563407c16e4e in ?? ()
#16 0x0000563407c19268 in ?? ()
#17 0x0000563407c1ab86 in StreamTcpPacket ()
#18 0x0000563407c1bdc9 in StreamTcp ()
#19 0x0000563407bd91a5 in ?? ()
#20 0x0000563407bd9717 in ?? ()
#21 0x0000563407b2e90f in TmThreadsSlotVarRun ()
#22 0x0000563407c06bf4 in ?? ()
#23 0x0000563407c06f32 in ?? ()
#24 0x0000563407c07ab4 in ?? ()
#25 0x0000563407b30064 in ?? ()
#26 0x00007f8d07af1134 in start_thread (arg=<optimized out>)
    at ./nptl/pthread_create.c:442
#27 0x00007f8d07b717dc in clone3 ()
    at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81

This one was due to a SIGSEGV:

[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Core was generated by `/usr/bin/suricata -c /etc/suricata/suricata.yaml --pidfile /var/run/suricata.pid'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  _int_malloc (av=av@entry=0x7f1fb8000030, bytes=bytes@entry=4096)
    at ./malloc/malloc.c:4129
4129    ./malloc/malloc.c: No such file or directory.
[Current thread is 1 (Thread 0x7f1f2d7ff6c0 (LWP 3696699))]
(gdb) bt
#0  _int_malloc (av=av@entry=0x7f1fb8000030, bytes=bytes@entry=4096)
    at ./malloc/malloc.c:4129
#1  0x00007f1ffc61f8f9 in __GI___libc_malloc (bytes=4096)
    at ./malloc/malloc.c:3323
#2  0x000055f7f40f93b7 in ?? ()
#3  0x000055f7f40f9cd1 in alloc::string::String::try_reserve::h70d7f11456287616
    ()
#4  0x000055f7f3e01a23 in suricata::jsonbuilder::JsonBuilder::try_new_object_with_capacity::hf6e50a27e0b3d6d4 ()
#5  0x000055f7f3e08816 in jb_new_object ()
#6  0x000055f7f3c80d6e in CreateEveHeader ()
#7  0x000055f7f3c81121 in CreateEveHeaderWithTxId ()
#8  0x000055f7f3c8d9dd in ?? ()
#9  0x000055f7f3c95c98 in ?? ()
#10 0x000055f7f3c78904 in OutputLoggerLog ()
#11 0x000055f7f3c73217 in ?? ()
#12 0x000055f7f3c73717 in ?? ()
#13 0x000055f7f3bc890f in TmThreadsSlotVarRun ()
#14 0x000055f7f3ca0bf4 in ?? ()
#15 0x000055f7f3ca0f32 in ?? ()
#16 0x000055f7f3ca1ab4 in ?? ()
#17 0x000055f7f3bca064 in ?? ()
#18 0x00007f1ffc610134 in start_thread (arg=<optimized out>)
    at ./nptl/pthread_create.c:442
#19 0x00007f1ffc6907dc in clone3 ()
    at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81

Could you also add the suricata --build-info and stats.log and suricata.yaml?

Yes, but I would rather not post them here. May I send them to you privately?

You can send it to me via a DM here or just remove the sensitive data from those files.

Problem still exists.

I was wondering whether an overflow could occur during calculation of the “diff” variable in the GrowRegionToSize() function in util-streaming-buffer.c so that the memset may crash:

    /* for safe printing and general caution, lets memset the
     * new data to 0 */
    size_t diff = grow - region->buf_size;
    void *new_mem = ((char *)ptr) + region->buf_size;
    memset(new_mem, 0, diff);

Another try. Still an occasional SEGFAULT in util-streaming-buffer.c

Core was generated by `/usr/bin/suricata -c /etc/suricata/suricata.yaml --pidfile /var/log/suricata/s'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  __memset_evex_unaligned_erms ()
    at ../sysdeps/x86_64/multiarch/memset-vec-unaligned-erms.S:236
236	../sysdeps/x86_64/multiarch/memset-vec-unaligned-erms.S: No such file or directory.

Backtrace gives:

#2  GrowRegionToSize (sb=0x7f2a1f50f4b0, size=<optimized out>, 
    region=0x7f2a1f50f4b0, cfg=0x55ef1cbd5378 <stream_config+56>)
    at ./src/util-streaming-buffer.c:736
warning: Source file is more recent than executable.
736	        
(gdb) p grow
$1 = 3055616
(gdb) p region->buf_size
$2 = 3057664
(gdb) p grow - region->buf_size
$3 = 4294965248

Doesn’t grow need to be bigger than region->buf_size?
It is supposed to be a multiple of it, and it is determined in ToNextMultipleOf().

If it is not, the unsigned subtraction for the diff variable wraps around to a huge value. Am I correct?

Could you file a full report for that segfault in our Redmine at https://redmine.openinfosecfoundation.org/? Feel free to link it here afterwards.