Is there a generic way in Suricata to match on values loaded from a file? As far as I know, it’s possible for IPs and FQDNs.
I have multiple files containing indicators, as follows:
Domains:
How do I get alerted on domains from a file, matching on a field (eg: dns_query) with any subdomain? For example if I have google.com listed in the file, can I get alerts on images.google.com in dns_query field?
Example file domains.txt would contain:
google.com
facebook.com
apple.com
URLs:
Same as for the domains above, but for URLs or URIs. How do I get alerted on them in HTTP traffic, for example?
Example file urls.txt would contain:
/example/url.php
/example2/script.js
Any guidance is appreciated. Thanks a lot for your help.
That looks like an exact match, but how do I use a data entry (e.g. google.com.uk) from a dataset in my rule and manipulate it to match random subdomains (e.g. random.subdomain.google.com.uk)?
Ah, I see. I should have read your question more carefully.
Indeed, generic subdomain matching won’t work with datasets, only if you can enumerate all potential ‘interesting’ subdomains and explicitly list them in the dataset as well.
Otherwise, if you only have a few thousand domains, you could autogenerate rules that use each domain to match against the dns.query buffer with an endswith keyword.
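The autogeneration idea above could be sketched roughly as follows. This is only an illustration: the function name build_rules, the msg text and the starting sid are made up, and it assumes a Suricata version that supports the dotprefix transform (which prepends a "." to the buffer, so that ".google.com" with endswith matches google.com and its subdomains but not notgoogle.com). Without dotprefix, drop the transform and the leading dot and accept the looser match.

```python
def build_rules(domains, base_sid=1000000):
    """Return one dns.query rule string per non-empty domain.

    base_sid is a hypothetical starting sid; pick a range that is
    free in your own rule set.
    """
    rules = []
    sid = base_sid
    for raw in domains:
        # Normalise: trim whitespace, lowercase, drop leading/trailing dots.
        domain = raw.strip().lower().strip(".")
        if not domain:
            continue
        rules.append(
            'alert dns any any -> any any ('
            f'msg:"DNS IOC domain {domain}"; '
            'dns.query; dotprefix; '
            f'content:".{domain}"; endswith; nocase; '
            f'sid:{sid}; rev:1;)'
        )
        sid += 1
    return rules


# Demo with the sample entries from domains.txt in the question.
for rule in build_rules(["google.com", "facebook.com", "apple.com"]):
    print(rule)
```

In practice you would read the lines of domains.txt, pass them to build_rules, and write the result to a rule file that Suricata loads.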
dataset.rules: alert http any any -> any any (msg:"HTTP URI in http-uri IOC list"; http.uri; dataset:isset,http-uri; classtype:external-ip-check; sid:696974;)
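For the rule above to match anything, the http-uri dataset has to be populated. As far as I know, string datasets loaded from a file expect one base64-encoded value per line, so a plain urls.txt needs converting first. A minimal sketch, with to_dataset_lines being a made-up helper name:

```python
import base64


def to_dataset_lines(values):
    """Base64-encode each non-empty value, one output line per value."""
    lines = []
    for value in values:
        value = value.strip()
        if value:
            encoded = base64.b64encode(value.encode("utf-8"))
            lines.append(encoded.decode("ascii"))
    return lines


# Demo with the sample entries from urls.txt in the question.
for line in to_dataset_lines(["/example/url.php", "/example2/script.js"]):
    print(line)
```

The output would go into whatever file the dataset is configured to load (in suricata.yaml or in the rule itself). Note that isset is an exact match on the whole buffer, so a request for /example/url.php?id=1 would not hit an entry for /example/url.php.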
You can do this by applying a pcrexform transformation to the sticky buffer you want to alert on (e.g. dns.query, tls.sni, etc.) and then matching the result against the IOC dataset.
The pcrexform transformation applies a regex that contains a capture group; the output of the first capture group is what gets compared against the values in your IOC dataset.
So the easiest thing to do is to extract the last portion of the domain using a regex, for example:
alert dns any any -> any any (msg:"dns IOCs match"; dns.query; pcrexform:".+\.([-a-zA-Z0-9]+\.[-a-zA-Z0-9]+\.[-a-zA-Z0-9]+)"; dataset:isset,dns-iocs; sid:123; rev:1;)
The above regex was a quick one; you can find more robust implementations on the internet, but it should work.
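To see what value would actually be looked up in the dataset, the capture can be tried out offline. The sketch below uses Python's re module as a stand-in for PCRE, with the character class written as 0-9; for a pattern this simple the two should behave the same, but that is an assumption. The transform function is a made-up helper, and what Suricata does with the buffer when the regex does not match is not modelled here.

```python
import re

# Same expression as in the pcrexform rule, with the digit range as 0-9.
PATTERN = re.compile(r".+\.([-a-zA-Z0-9]+\.[-a-zA-Z0-9]+\.[-a-zA-Z0-9]+)")


def transform(query):
    """Return the first capture group, or None if the regex does not match."""
    match = PATTERN.search(query)
    return match.group(1) if match else None


for query in ("random.subdomain.google.com.uk",
              "images.google.com",
              "google.com.uk"):
    print(query, "->", transform(query))
```

With this pattern, random.subdomain.google.com.uk yields google.com.uk, while images.google.com and a bare google.com.uk do not match at all, because the regex requires at least one extra label in front of exactly three captured labels. So it suits three-label IOCs such as google.com.uk; two-label entries such as google.com would need a different capture expression.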
Interesting. I was not aware of the pcrexform keyword.
Do you know how bad the performance hit is? I would imagine it hurts quite a bit since you would skip MPM and the PCRE would run on each dns query? Would be interesting to compare to say 10k dns.query rules with non-matching text in the content keyword.