Add domain name patterns to the rule language #7

Merged
13 commits merged into from
Jan 19, 2016



@ghost ghost commented Jan 14, 2016

This pull request implements specialized support for matching domain names in the rule language.

The motivation for this addition comes from repeatedly seeing domain name matching done with regular expressions, which can be tedious and error-prone. These changes do not address matching URLs based on their hostnames, which also seems to happen quite a bit.

Basics

A simple example of a rule with a domain name pattern is "host in domain.example". This rule would match events whose host key has a value that looks like the domain name domain.example or any of its subdomains. Therefore events such as host=domain.example and host=deep.sub.domain.example would match.

A domain name is composed of labels. Each label in a pattern must be a valid domain name label (for further elaboration, have a look at the doctests in abusehelper.core.rules._domainname).

One notable, and intentional, limitation is that a domain name pattern or an event value has to contain at least two labels to be taken into consideration. This avoids treating every single event value (e.g. "malware") as a potential domain name that has to be run through the whole normalization and matching dance. In practice this means that the domain name pattern "host in com" is not valid and the event host=com will not match any domain name pattern.
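The matching described above can be sketched in a few lines of Python. This is only an illustration of the suffix-matching idea and the two-label minimum, not the code from abusehelper.core.rules._domainname; the names split_labels and matches_domain are hypothetical, and label validation is left out.

```python
def split_labels(name):
    """Split a dotted name into lowercase labels.

    Returns None when the name has fewer than two labels or an empty
    label, mirroring the "at least two labels" limitation.
    (Hypothetical helper, not part of abusehelper.)
    """
    labels = name.lower().split(".")
    if len(labels) < 2 or any(not label for label in labels):
        return None
    return labels


def matches_domain(pattern, value):
    """Return True if value equals the pattern or is one of its subdomains."""
    pattern_labels = split_labels(pattern)
    value_labels = split_labels(value)
    if pattern_labels is None or value_labels is None:
        return False
    # Compare whole labels from the right, so "notdomain.example"
    # is not mistaken for a subdomain of "domain.example".
    return value_labels[-len(pattern_labels):] == pattern_labels
```

Comparing whole labels instead of raw string suffixes is the kind of detail that is easy to get wrong with a regular expression.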

Wildcards

There is a special case, though. A domain name pattern can begin with zero or more wildcard labels (*). A wildcard label matches any label, but there has to be a label to match. For example, the rule "host in *.*.com" matches the events host=domain.example.com and host=sub.domain.example.com, but not host=example.com.

Some limitations apply. A wildcard label can contain only the single character *, so patterns like "*test*.example" or "**.example" are not allowed. Wildcards can appear only at the beginning of the pattern (so no "test.*.example"), and a pattern must contain at least one non-wildcard label (so no "*.*.*").
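A rough Python sketch of the wildcard rules, under the assumption that a pattern is checked once and then compared label by label from the right. The functions parse_pattern and matches are hypothetical names for illustration and are not taken from the pull request.

```python
def parse_pattern(pattern):
    """Return (wildcard_count, fixed_labels) or raise ValueError.

    Hypothetical helper: leading wildcard labels are counted, the rest
    must be plain labels with no wildcard character in them.
    """
    labels = pattern.lower().split(".")
    if len(labels) < 2:
        raise ValueError("pattern needs at least two labels")
    wildcards = 0
    while wildcards < len(labels) and labels[wildcards] == "*":
        wildcards += 1
    fixed = labels[wildcards:]
    if not fixed:
        raise ValueError("pattern needs at least one non-wildcard label")
    if any("*" in label or not label for label in fixed):
        raise ValueError("a wildcard must be a whole label at the start")
    return wildcards, fixed


def matches(pattern, value):
    """Match value against a pattern with optional leading wildcards."""
    wildcards, fixed = parse_pattern(pattern)
    labels = value.lower().split(".")
    # Every wildcard consumes exactly one label; any extra labels on the
    # left make the value a subdomain, which also matches.
    if len(labels) < 2 or len(labels) < wildcards + len(fixed):
        return False
    return labels[-len(fixed):] == fixed
```

Because each wildcard must consume a label, only the number of leading wildcards matters; the check reduces to a minimum label count plus a suffix comparison.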

IDNA support & normalization

Patterns and event values can contain internationalized domain names such as äää.example.com. The IDNA-encoded form xn--4caaa.example.com is considered equal to it. In addition, domain name matching is case-insensitive, so ÄÄÄ.example.COM is considered equal as well.
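One way to picture the normalization is to bring every name to a lowercase, IDNA-encoded ASCII form before comparing. The sketch below assumes Python 3 and its built-in "idna" codec (which implements IDNA 2003); the pull request does not say which mechanism it uses, and the function name normalize is hypothetical.

```python
def normalize(name):
    """Lowercase a domain name and convert it to its IDNA ASCII form.

    Assumption: Python 3 with the standard library "idna" codec.
    Lowercasing first matters because the codec leaves pure-ASCII
    labels such as "COM" untouched.
    """
    return name.lower().encode("idna").decode("ascii")
```

With this helper, normalize("äää.example.com"), normalize("xn--4caaa.example.com") and normalize("ÄÄÄ.example.COM") all yield the same string, so a plain equality or label comparison can follow.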

Usage

A domain name pattern can be used in the same contexts where IP blocks currently can. So rules like "* in domain.example" and "host not in domain.example", as well as the fuzzy rule "*.sub.domain.example", all work.

@ghost ghost added bug enhancement and removed bug labels Jan 14, 2016
@ghost ghost self-assigned this Jan 14, 2016
@ghost ghost changed the title [WIP] Add domain name patterns to the rule language Add domain name patterns to the rule language Jan 14, 2016
@ghost ghost added this to the 3.0.0 milestone Jan 14, 2016
ghost pushed a commit that referenced this pull request Jan 19, 2016
Add domain name patterns to the rule language
@ghost ghost merged commit b8a7135 into master Jan 19, 2016
ghost pushed a commit that referenced this pull request Jan 19, 2016
@ghost ghost deleted the feature-domainname-rules branch January 19, 2016 16:33
This pull request was closed.