Another case of high capture.kernel_drops count

Hi,

I’m trying to use Suricata to capture and parse HTTP requests & responses (on a public cloud, CentOS 7), and then use a Lua script (since Suricata supports it) to send the request & response content elsewhere. The network load is about 35 Mbps.

I understand that the parsing may be expensive, so it may impact performance. However, the CPU is only about 30% busy and memory is not fully used, yet capture.kernel_drops is about 20% of packets and in some cases even up to 80%. Here is my configuration file:
suricata.yaml (71.0 KB), and the Suricata version is 6.0.3. In the configuration file I basically changed some settings in the af-packet section, plus some other performance tuning items from https://suricata.readthedocs.io/en/latest/performance/index.html

Is there any clue or possible reason that could cause this problem? Since the behavior is so abnormal, I suspect I have some config items wrong. I’ve tried the same steps as described in another thread about high kernel_drops, but they don’t seem to help: https://forum.suricata.io/t/suricata-high-capture-kernel-drops-count/465

Hi,

  1. Can you also post a stats.log output?
  2. Does it also happen when the lua script is not active? I guess that would be the first test to try.
  3. Can you share (parts of) the script?
  4. Only 1 thread for capture is not much, but with low traffic it should be okay.

Hi Herz,

I never thought I would get a reply so fast. Thanks!

  1. Can you also post a stats.log output?
    Here is the stats.log:
------------------------------------------------------------------------------------
Counter                                       | TM Name                   | Value
------------------------------------------------------------------------------------
capture.kernel_packets                        | Total                     | 44341814
capture.kernel_drops                          | Total                     | 12786240
decoder.pkts                                  | Total                     | 30768952
decoder.bytes                                 | Total                     | 14930538253
decoder.ipv4                                  | Total                     | 30768952
decoder.ethernet                              | Total                     | 30768952
decoder.tcp                                   | Total                     | 30768952
decoder.avg_pkt_size                          | Total                     | 485
decoder.max_pkt_size                          | Total                     | 1478
flow.tcp                                      | Total                     | 122601
flow.tcp_reuse                                | Total                     | 2
flow.wrk.spare_sync_avg                       | Total                     | 100
flow.wrk.spare_sync                           | Total                     | 845
flow.wrk.flows_evicted_needs_work             | Total                     | 37068
flow.wrk.flows_evicted_pkt_inject             | Total                     | 64124
flow.wrk.flows_evicted                        | Total                     | 1287
flow.wrk.flows_injected                       | Total                     | 36832
tcp.sessions                                  | Total                     | 89524
tcp.syn                                       | Total                     | 89530
tcp.synack                                    | Total                     | 89439
tcp.rst                                       | Total                     | 26629
tcp.reassembly_gap                            | Total                     | 1705971
tcp.overlap                                   | Total                     | 737
app_layer.flow.http                           | Total                     | 84828
app_layer.tx.http                             | Total                     | 5876683
flow.mgr.full_hash_pass                       | Total                     | 39
flow.spare                                    | Total                     | 10334
flow.mgr.rows_maxlen                          | Total                     | 3
flow.mgr.flows_checked                        | Total                     | 145449
flow.mgr.flows_notimeout                      | Total                     | 26532
flow.mgr.flows_timeout                        | Total                     | 118917
flow.mgr.flows_evicted                        | Total                     | 118917
flow.mgr.flows_evicted_needs_work             | Total                     | 37583
tcp.memuse                                    | Total                     | 912520
tcp.reassembly_memuse                         | Total                     | 79440616
http.memuse                                   | Total                     | 37409050
flow.memuse                                   | Total                     | 8802304
  2. Does it also happen when the lua script is not active? I guess that would be the first test to try.
    No, it doesn’t happen when the lua script is disabled. So I guess it’s the lua script that causes the problem? I’m trying to use coroutines to see if they help.
  3. Can you share (parts of) the script?
local uuid = require "uuid"
local cjson = require "cjson.safe"
local string = require "string"
local http = require "socket_http"

function AssembledHTTPReqHeaders()
    local rl = HttpGetRequestLine()
    local header = HttpGetRawRequestHeaders()
    return rl .. "\r\n" .. header
end

function AssembledHTTPRspHeaders()
    local rl = HttpGetResponseLine()

    -- Parse status code
    local rsp_code = -1
    local pre_spc = string.find(rl, " ", 1)
    local post_spc = -1
    if pre_spc ~= nil then
        post_spc = string.find(rl, " ", pre_spc + 1)
        if post_spc == nil then
            post_spc = string.len(rl)
        end
        rsp_code = string.sub(rl, pre_spc + 1, post_spc - 1)
    end

    -- Response headers to JSON
    local header = cjson.encode(HttpGetResponseHeaders())
    return rsp_code, header
end

function AssembledHTTPReqBody()
    local reqb = ''
    local reqb_arr = {}
    reqb_arr, _, _ = HttpGetRequestBody()
    if reqb_arr ~= nil then
        for n, v in pairs(reqb_arr) do
            reqb = reqb .. v
        end
    end
    return reqb
end

function AssembledHTTPRspBody()
    local rspb = ''
    local rspb_arr = {}
    rspb_arr, _, _ = HttpGetResponseBody()
    if rspb_arr ~= nil then
        for n, v in pairs(rspb_arr) do
            rspb = rspb .. v
        end
    end
    return rspb
end

function init (args)
    local needs = {}
    needs["protocol"] = "http"
    return needs
end

function setup (args)
--    filename = SCLogPath() .. "http_out_by_lua.log"
--    file = assert(io.open(filename, "a"))
--    SCLogInfo("HTTP Log Filename " .. filename)
--    http_cnt = 0
end

function log(args)
    local req = {}
    local rsp = {}
    local status_code = 200
    local code = 200
    local xxx = ""
    local post_rsp = {}

    local id = uuid.new() .. "_" .. HttpGetRequestHost()

    -- Construct request information for reporting
    req["id"] = id
    req["header"] = AssembledHTTPReqHeaders()
    req["body"] = AssembledHTTPReqBody()

    local req_json = cjson.encode(req)
    local req_url = ""
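    -- Assuming socket_http is a blocking (luasocket-style) client, this
    -- POST stalls the worker thread for a full network round trip on
    -- every logged HTTP transaction.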
    _, status_code, _, post_rsp = http.post(req_url, req_json)
    if status_code ~= 200 then
        SCLogError("Request id: " .. id .. ", sending request and the returned status code is: " .. status_code .. ", the response body is: " .. post_rsp[1])
        return
    end

    -- Construct response information for reporting, based on the reply to the request report above
    local rsp_body = cjson.decode(post_rsp[1])
    if rsp_body and rsp_body ~= cjson.null and rsp_body ~= "" then
        code = rsp_body["code"]
        if code == nil or code == 0 then
            -- Do not report response information when 'code' in the reply is missing or 0
            return
        end

        local rsp_body_data = rsp_body["data"]
        if rsp_body_data and rsp_body_data ~= cjson.null then
            -- Adding 'xxx' from request reporting to response reporting
            xxx = rsp_body_data["xxx"]
        end
    else
        SCLogError("Return body is empty")
        return
    end

    local rsp_status_code, headers = AssembledHTTPRspHeaders()
    if rsp_status_code == -1 then
        SCLogError("Response line parse failed, unable to get the status code: " .. headers)
    end
    rsp["id"] = id
    rsp["header"] = headers
    rsp["code"] = tonumber(code)
    rsp["body"] = AssembledHTTPRspBody()
    rsp["xxx"] = xxx
    rsp["status_code"] = tonumber(rsp_status_code)

    local rsp_json = cjson.encode(rsp)
    local rsp_url = ""
    _, status_code, _, post_rsp = http.post(rsp_url, rsp_json) 
    if status_code ~= 200 then
        SCLogError("Request id: " .. id .. ", sending response and the returned status code is: " .. status_code .. ", the response body is: " .. post_rsp[1])
        return
    end
end

function deinit (args)
--    SCLogInfo ("HTTP transactions logged: " .. http_cnt);
--    file:close(file)
end
  4. Only 1 thread for capture is not much, but with low traffic it should be okay.
    Yes, there is only one worker thread at present.

I need to wrap the HTTP request content and send it to a server, and then decide, based on the server’s response, whether to wrap and send the corresponding HTTP response. So I didn’t use a coroutine for sending each of them.

It seems I can’t use Lua coroutines directly, since the interface function (log) is called once per request & response?
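One way around that would be to make log() do no network I/O at all: only spool the encoded transaction to a local file and let a separate process perform the HTTP POSTs. A minimal sketch (reusing the Assembled* helpers above; the spool path and the external sender process are assumptions, not something I have running):

local cjson = require "cjson.safe"

local spool

function init (args)
    local needs = {}
    needs["protocol"] = "http"
    return needs
end

function setup (args)
    -- Placeholder path; a separate process tails this file and performs the
    -- HTTP POSTs, so the Suricata worker thread never blocks on the network.
    spool = assert(io.open("/var/spool/suricata/http_tx.jsonl", "a"))
end

function log (args)
    local tx = {}
    tx["host"] = HttpGetRequestHost()
    tx["req_header"] = AssembledHTTPReqHeaders()
    tx["req_body"] = AssembledHTTPReqBody()
    tx["rsp_body"] = AssembledHTTPRspBody()
    spool:write(cjson.encode(tx) .. "\n")
    spool:flush()
end

function deinit (args)
    if spool then spool:close() end
end

The catch is that deciding whether to report the response currently depends on the server’s reply to the request POST, so that logic would have to move into the external sender as well.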

Did you try to install Hyperscan? It made a huge difference for me on AWS.
Note that I had to reinstall Suricata, configured with the following line:

sudo ./configure --prefix=/usr --sysconfdir=/etc --localstatedir=/var --enable-nfqueue --enable-lua --with-libhs-includes=/usr/local/include/hs/ --with-libhs-libraries=/usr/local/lib/
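
You can check whether the rebuilt binary actually picked up Hyperscan with:

suricata --build-info | grep -i hyperscan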

Not yet. I mainly use the packet capture and filtering functions of Suricata. I thought Hyperscan mainly helps in the rule matching phase, since that is generally regex matching. But would it also help performance in my case? I will give it a try.

Correct, so you can use the free CPU capacity for something else.

Have you checked the IRQ spread of your capturing interfaces, or tried setting the core affinity?
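For example (the interface name and IRQ number are placeholders):

grep eth0 /proc/interrupts
cat /proc/irq/<N>/smp_affinity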