Add domain name patterns to the rule language #7
Merged
Add abusehelper.core.rules._domainname module, collecting domain name and domain name pattern handling utilities.
Domain names are now expressed just as sequences (lists, tuples) of string labels.
ghost self-assigned this on Jan 14, 2016
ghost changed the title from "[WIP] Add domain name patterns to the rule language" to "Add domain name patterns to the rule language" on Jan 14, 2016
ghost added this to the 3.0.0 milestone on Jan 14, 2016
ghost pushed a commit that referenced this pull request on Jan 19, 2016: Add domain name patterns to the rule language
ghost pushed a commit that referenced this pull request on Jan 19, 2016
ghost deleted the feature-domainname-rules branch on January 19, 2016 16:33
This pull request was closed.
This pull request implements specialized support for matching domain names in the rule language.
The motivation for this addition comes from repeatedly seeing domain name matching done with regular expressions, which can be tedious and error-prone. These changes do not address matching URLs based on their hostnames, which also seems to happen quite a bit.
Basics
A simple example of a rule with a domain name pattern is "host in domain.example". This rule matches events whose host key has a value that looks like the domain name domain.example or any of its subdomains. Therefore events such as host=domain.example and host=deep.sub.domain.example would match.
A domain name is composed of labels. Each label in a pattern must be a valid domain name label (for further elaboration have a look at the doctests in abusehelper.core.rules._domainname).
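The matching described above can be sketched roughly as follows. This is only an illustration in Python 3 and not the code from abusehelper.core.rules._domainname; the function names split_labels and matches_domain are hypothetical. It treats a domain name as a tuple of labels and checks whether the event value ends with all of the pattern's labels.

```python
def split_labels(name):
    """Split a dotted name into a tuple of labels (hypothetical helper)."""
    return tuple(name.split("."))


def matches_domain(pattern, value):
    """Return True if value is the pattern domain or one of its subdomains."""
    p = split_labels(pattern)
    v = split_labels(value)
    # The value has to end with every label of the pattern, compared
    # label by label rather than character by character.
    return len(v) >= len(p) and v[-len(p):] == p


print(matches_domain("domain.example", "domain.example"))           # True
print(matches_domain("domain.example", "deep.sub.domain.example"))  # True
print(matches_domain("domain.example", "notdomain.example"))        # False
```

The last call shows the kind of case that a naive "ends with" regular expression tends to get wrong: notdomain.example ends with the characters "domain.example" but is not a subdomain of it.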
One notable (and intentional) limitation is that a domain name pattern or an event value has to contain at least two labels to be taken into consideration. This avoids treating every single event value (e.g. "malware") as a potential domain name to run through the whole normalization and matching dance. In practice this means that the domain name pattern "host in com" is not valid, and the event host=com will not match any domain name pattern.
Wildcards
There is a special case though. A domain name pattern can begin with zero or more wildcard labels (*). A wildcard label matches any label, but there has to be a label to match. For example, the rule "host in *.*.com" matches events host=domain.example.com and host=sub.domain.example.com, but not host=example.com.
Some limitations apply. A wildcard label can only consist of a single *, so patterns like "test*.example" or "**.example" are not allowed. Wildcards can only be located at the beginning of the pattern (so no "test.*.example"), and a pattern must contain at least one non-wildcard label (so no "*.*.*").
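As a rough sketch of the wildcard rules above, the following Python 3 code parses a pattern into a count of leading wildcard labels plus the fixed labels, and then matches event values against it. The names parse_pattern and matches_pattern are hypothetical and this is not the abusehelper implementation. It assumes that wildcard labels count toward the two-label minimum, and it leaves out the validity checks for individual labels.

```python
def parse_pattern(text):
    """Return (wildcard_count, fixed_labels) or raise ValueError."""
    labels = text.split(".")
    if len(labels) < 2:
        raise ValueError("a pattern needs at least two labels")
    # Count the wildcard labels at the beginning of the pattern.
    count = 0
    while count < len(labels) and labels[count] == "*":
        count += 1
    fixed = tuple(labels[count:])
    if not fixed:
        raise ValueError("a pattern needs at least one non-wildcard label")
    # Any "*" left over is either mid-pattern or mixed into a label.
    if any("*" in label for label in fixed):
        raise ValueError("wildcards must be whole labels at the beginning")
    return count, fixed


def matches_pattern(pattern, value):
    """Return True if the event value matches the domain name pattern."""
    count, fixed = parse_pattern(pattern)
    labels = tuple(value.split("."))
    if len(labels) < 2:
        return False  # single-label values are never considered
    # Every wildcard needs an actual label to match.
    if len(labels) < len(fixed) + count:
        return False
    return labels[-len(fixed):] == fixed


print(matches_pattern("*.*.com", "domain.example.com"))      # True
print(matches_pattern("*.*.com", "sub.domain.example.com"))  # True
print(matches_pattern("*.*.com", "example.com"))             # False
```

Patterns such as "com", "test.*.example", "**.example" and "*.*.*" make parse_pattern raise ValueError in this sketch, mirroring the limitations listed above.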
IDNA support & normalization
Patterns and event values can contain internationalized domain names such as äää.example.com. Its IDNA encoded form xn--4caaa.example.com will be considered equal. In addition, domain name matching is case-insensitive, so ÄÄÄ.example.COM will be considered equal as well.
Usage
A domain name pattern can be used in the same contexts where IP blocks currently can. So rules like "* in domain.example" and "host not in domain.example" and the fuzzy rule "*.sub.domain.example" all work.
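To illustrate the normalization described under "IDNA support & normalization", here is a minimal Python 3 sketch. The function name normalize is hypothetical, and the pull request does not say how abusehelper performs this step; the sketch simply lowercases the name and runs it through Python's built-in "idna" codec (which implements IDNA 2003), so that the Unicode, IDNA encoded and upper-case spellings all end up as the same tuple of labels.

```python
def normalize(name):
    """Return the name as a tuple of lowercase, IDNA encoded labels."""
    # Lowercase first: the codec passes pure-ASCII labels such as "COM"
    # through unchanged, so it does not fold their case on its own.
    encoded = name.lower().encode("idna").decode("ascii")
    return tuple(encoded.split("."))


print(normalize("äää.example.com"))        # ('xn--4caaa', 'example', 'com')
print(normalize("xn--4caaa.example.com"))  # ('xn--4caaa', 'example', 'com')
print(normalize("ÄÄÄ.example.COM"))        # ('xn--4caaa', 'example', 'com')
```

Comparing the normalized tuples instead of the raw strings is what makes the three spellings equal for matching purposes.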