Custom Content Detection

Hello there,

I’ve already asked this question on discord, but thought this might be a better place.
I was wondering if there’s any future plans to be able to register a custom C (or Rust) content matcher.

The current plugin support is quite amazing, StreamingLogger, TXLoggers, Eveloggers, but its missing a custom matcher for (sticky) buffers and payloads like currently possible in Lua. Unfortunately it’s missing some useful keywords, for example websocket.payload (?) The current matcher
Additionally, Lua is not the most performant and given that the only (afaik) workaround to load some external matcher (from C) will be disabled in 8.0.0, it would be beneficial to have a more efficient and flexible solution built directly into the system.

Thank you for your time and consideration.

Hi there,

while this wasn’t answered here, maybe the upcoming Suricata community brainstorm could be a good moment to bring this up again :wink:

Suricata Roadmap Community Brainstorm virtual sessions - pre-SuriCon 2024

We are currently running a small (and dirty) patch in the end of the DetectEngineContentInspectionInternal function that bypasses the BUG_ON(1); and instead calls sigmatch_table[smd->type].Transform(&inspectBuffer, smd->ctx);. (In case .Transform is not NULL).
We also required access to the packet (& flow) inside our custom plugin Transformer so we also added a Packet* to the InspectionBuffer.
It was convenient to use the Transform for both the Packet contents and StickyBuffers.
Quite some patches required to get the same functionality as the slow lua sadly (for now at least).

During this, we also noticed that Flowvars allocated by ScFlowvarSet / SCFlowvarSet in Lua do not get their name value free’d, while they are malloced in that function, resulting in a potential memory leak, we thought this was worth mentioning. The same happens for the values, but they are free’d on flowvar free.

As another thing we had to debug through, the sigmatch_table is only allocated after plugins are loaded.
We were dynamically adding sigmatch keywords. However, the sigmatch_table would be NULL, so when we called DetectHelperKeywordRegister it would allocate a size 256 table, while our new keyword idx would be DETECT_TBLSIZE_IDX >= 256 therefore doing a heap OOB, then on the second keyword registration it would reallocate to 2 * 256, enough to fit everything, but miss the first dynamically registered keyword. We currently fixed it with the following in our plugin registration:

static void InitPlugin(void)
{
    /* Copied from SigTableSetup
     * It would do a heap overflow if not for this..
     */
    if (sigmatch_table == NULL) {
        DETECT_TBLSIZE = DETECT_TBLSIZE_STATIC + DETECT_TBLSIZE_STEP;
        sigmatch_table = SCCalloc(DETECT_TBLSIZE, sizeof(SigTableElmt));
        if (sigmatch_table == NULL) {
            DETECT_TBLSIZE = 0;
            FatalError("Could not allocate sigmatch_table");
            return;
        }
    }
    // .. register plugins here

(We also found Bug #7285: Websocket compression mishandling - Suricata - Open Information Security Foundation & added the dirty pcap-over-ip patch Feature #5499: PCAP-over-IP client - Suricata - Open Information Security Foundation)

Thanks for the bug report on Redmine, and for sharing a patch, as Victor mentioned on the Redmine ticket, would you be interested in submitting a cleaned up version of the patch to the project on Github?

Potentially,
I am currently more interested in porting this custom content detection into Suricata, having to update the patches for newer commits can be a bit annoying.

As I’ve explained before, we added a patch to allow our custom Transformers access to the packet* (More interesting the Flow*), we would also like to discuss how we could potentially add this into suricata.

Now for (another) bug we came across, lets say I want to use a custom logger to log a TCP stream, the timestamp (I get from the flow or the packet upon reassembly) does not correspond with the actual timestamp of the packet. This makes sense because the TCP reassembly is presumably dependent on receiving an ACK from the server and only when this ACK has been received it will accept this chunk into its reassembly.

Now this is a clear problem, because TCP can have selective ACKs, which means that an acknowledgement might not be immediately sent upon receiving a packet, the first ‘ACK’ it will then receive might be a data packet from the server.

At this point, both packets will be sent into reassembly and flow->lastts and packet->ts will be incorrect for the client->server packet in the TCP reassembly handler, so the client->server packet time will be the same as the server->client packet time. Now when we log these packets to the database and then order by time, the order might be wrong as they have the same time, which can cause our post traffic analysis tools to return undefined behaviour.

Apart from this, we also noticed another bug, when stream.checksum-validation is set to false, apkt will get the PKT_IGNORE_CHECKSUM flag.

Now, lets say I still want to detect invalid checksums despite this validation being disabled, this is currently impossible without patches because when this PKT_IGNORE_CHECKSUM flag is set and therefore, the checksum detection just returns the cd->valid, which is taken from rule and not the packet. This makes all checksum rules useless when stream.checksum-validation is set to false.

Also lets say I do enable this, then all packets with an invalid checksum are not streamed through the tcp reassembler, so this doesn’t help me either.
A case where this might happen is when a router strips tcp options from a syn packet and therefore the checksum might be incorrect, which we want to detect, but not drop, which is impossible.
Simply, I want all packets to pass the tcp reassembler, but still be able to detect wrong checksums.
This was easily done by commenting out all the:

// if (p->flags & PKT_IGNORE_CHECKSUM) {
//     return cd->valid;
// }

in detect-csum.c