Sherlock is an anomaly detection service built on top of Druid. It leverages EGADS (Extensible Generic Anomaly Detection System) to detect anomalies in time-series data. Users can schedule jobs on an hourly, daily, weekly, or monthly basis, view anomaly reports from Sherlock's interface, or receive them via email.
Time-series generation is the first phase of Sherlock's anomaly detection. The user inputs a full Druid JSON query with a metric name and group-by dimensions. Sherlock validates the query, adjusts the time interval and granularity based on the EGADS config, and makes a call to Druid. Druid responds with an array of time-series, which are parsed into EGADS time-series.
Sample Druid query:

    {
      "metric": "metric(metric1/metric2)",
      "aggregations": [
        {
          "filter": {
            "fields": [
              {
                "type": "selector",
                "dimension": "dim1",
                "value": "value1"
              }
            ],
            "type": "or"
          },
          "aggregator": {
            "fieldName": "metric2",
            "type": "longSum",
            "name": "metric2"
          },
          "type": "filtered"
        }
      ],
      "dimension": "groupByDimension",
      "intervals": "2017-09-10T00:00:01+00:00/2017-10-12T00:00:01+00:00",
      "dataSource": "source1",
      "granularity": {
        "timeZone": "UTC",
        "type": "period",
        "period": "P1D"
      },
      "threshold": 50,
      "postAggregations": [
        {
          "fields": [
            {
              "fieldName": "metric1",
              "type": "fieldAccess",
              "name": "metric1"
            }
          ],
          "type": "arithmetic",
          "name": "metric(metric1/metric2)",
          "fn": "/"
        }
      ],
      "queryType": "topN"
    }
Sample Druid response:

    [
      {
        "timestamp": "2017-10-11T00:00:00.000Z",
        "result": [
          {
            "groupByDimension": "dim1",
            "metric(metric1/metric2)": 8,
            "metric1": 128,
            "metric2": 16
          },
          {
            "groupByDimension": "dim2",
            "metric(metric1/metric2)": 4.5,
            "metric1": 42,
            "metric2": 9.33
          }
        ]
      },
      {
        "timestamp": "2017-10-12T00:00:00.000Z",
        "result": [
          {
            "groupByDimension": "dim1",
            "metric(metric1/metric2)": 9,
            "metric1": 180,
            "metric2": 20
          },
          {
            "groupByDimension": "dim2",
            "metric(metric1/metric2)": 5.5,
            "metric1": 95,
            "metric2": 17.27
          }
        ]
      }
    ]
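To illustrate how a topN response like the sample above can be turned into one time-series per group-by value, here is a minimal Python sketch. It is not Sherlock's parser (Sherlock is a Java application that builds EGADS time-series objects); the function name `group_topn_response` and the abbreviated sample payload are assumptions made for illustration.

```python
import json
from collections import defaultdict


def group_topn_response(response, dimension, metric):
    """Group a Druid topN response into one series per dimension value.

    Returns a dict mapping each dimension value to a list of
    (timestamp, metric value) pairs in response order.
    `group_topn_response` is a hypothetical helper, not a Sherlock API.
    """
    series = defaultdict(list)
    for bucket in response:
        for row in bucket["result"]:
            series[row[dimension]].append((bucket["timestamp"], row[metric]))
    return dict(series)


# Abbreviated copy of the sample response (metric1/metric2 columns omitted).
sample = json.loads("""
[
  {"timestamp": "2017-10-11T00:00:00.000Z",
   "result": [
     {"groupByDimension": "dim1", "metric(metric1/metric2)": 8},
     {"groupByDimension": "dim2", "metric(metric1/metric2)": 4.5}]},
  {"timestamp": "2017-10-12T00:00:00.000Z",
   "result": [
     {"groupByDimension": "dim1", "metric(metric1/metric2)": 9},
     {"groupByDimension": "dim2", "metric(metric1/metric2)": 5.5}]}
]
""")

series = group_topn_response(sample, "groupByDimension", "metric(metric1/metric2)")
for name, points in series.items():
    print(name, points)
```

Each resulting series corresponds to one value of the group-by dimension, which matches the "array of time-series" described above.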
Sherlock calls the user-configured EGADS API for each generated time-series, generates anomaly reports from the response, and stores these reports in a database. Users may also elect to receive anomaly reports by email.
Sherlock uses a Redis backend to store job metadata, generated anomaly reports, and other information, and as a persistent job queue. Report keys have a retention policy: hourly job reports are retained for 14 days, and daily, weekly, and monthly job reports are retained for 1 year.
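The retention rule can be expressed as a time-to-live in seconds, which is the unit the Redis `EXPIRE` command accepts. The sketch below is only an illustration: the granularity keys, the function name `report_ttl_seconds`, and the choice of 365 days for "1 year" are assumptions, not Sherlock's actual code.

```python
from datetime import timedelta

# Retention periods as stated in the text. The keys are illustrative names,
# and "1 year" is assumed to mean 365 days.
REPORT_RETENTION = {
    "hour": timedelta(days=14),
    "day": timedelta(days=365),
    "week": timedelta(days=365),
    "month": timedelta(days=365),
}


def report_ttl_seconds(granularity):
    """Return the report retention for a job granularity, in seconds."""
    return int(REPORT_RETENTION[granularity].total_seconds())


print(report_ttl_seconds("hour"))
print(report_ttl_seconds("day"))
```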
Sherlock's user interface is built with the Spark web framework. The UI enables users to submit instant anomaly analyses, create and launch detection jobs, and view anomalies on a heatmap and on a graph.
A Makefile is provided with all build targets.

    make jar

This creates sherlock.jar in the target/ directory.
Sherlock is run from the command line with config arguments.

    java -Dlog4j.configuration=file:${path_to_log4j}/log4j.properties \
        -jar ${path_to_jar}/sherlock.jar \
        --version $(VERSION) \
        --project-name $(PROJECT_NAME) \
        --port $(PORT) \
        --enable-email \
        --failure-email $(FAILURE_EMAIL) \
        --from-mail $(FROM_MAIL) \
        --reply-to $(REPLY_TO) \
        --smtp-host $(SMTP_HOST) \
        --interval-minutes $(INTERVAL_MINUTES) \
        --interval-hours $(INTERVAL_HOURS) \
        --interval-days $(INTERVAL_DAYS) \
        --interval-weeks $(INTERVAL_WEEKS) \
        --interval-months $(INTERVAL_MONTHS) \
        --egads-config-filename $(EGADS_CONFIG_FILENAME) \
        --redis-host $(REDIS_HOSTNAME) \
        --redis-port $(REDIS_PORT) \
        --execution-delay $(EXECUTION_DELAY) \
        --timeseries-completeness $(TIMESERIES_COMPLETENESS)
args | required | default | description |
---|---|---|---|
--help | - | false | help |
--config | - | null | config |
--version | - | v0.0.0 | version |
--egads-config-filename | - | provided | egads-config-filename |
--port | - | 4080 | port |
--interval-minutes | - | 180 | interval-minutes |
--interval-hours | - | 672 | interval-hours |
--interval-days | - | 28 | interval-days |
--interval-weeks | - | 12 | interval-weeks |
--interval-months | - | 6 | interval-months |
--enable-email | - | false | enable-email |
--from-mail | if email enabled | | from-mail |
--reply-to | if email enabled | | reply-to |
--smtp-host | if email enabled | | smtp-host |
--smtp-port | - | 25 | smtp-port |
--smtp-user | - | | smtp-user |
--smtp-password | - | | smtp-password |
--failure-email | if email enabled | | failure-email |
--execution-delay | - | 30 | execution-delay |
--valid-domains | - | null | valid-domains |
--redis-host | - | 127.0.0.1 | redis-host |
--redis-port | - | 6379 | redis-port |
--redis-ssl | - | false | redis-ssl |
--redis-timeout | - | 5000 | redis-timeout |
--redis-password | - | - | redis-password |
--redis-clustered | - | false | redis-clustered |
--project-name | - | - | project-name |
--external-file-path | - | - | external-file-path |
--debug-mode | - | false | debug-mode |
--timeseries-completeness | - | 60 | timeseries-completeness |
--http-client-timeout | - | 20000 | http-client-timeout |
--nodata-on-failure | - | false | nodata-on-failure |
--backup-redis-db-path | - | null | backup-redis-db-path |
--druid-brokers-list-file | - | null | druid-brokers-list-file |
--truststore-path | - | null | truststore-path |
--truststore-type | - | jks | truststore-type |
--truststore-password | - | null | truststore-password |
--keystore-path | - | null | keystore-path |
--keystore-type | - | jks | keystore-type |
--keystore-password | - | null | keystore-password |
--key-dir | - | null | key-dir |
--cert-dir | - | null | cert-dir |
--https-hostname-verification | - | true | https-hostname-verification |
--custom-ssl-context-provider-class | - | DefaultSslContextProvider | custom-ssl-context-provider-class |
--custom-secret-provider-class | - | DefaultSecretProvider | custom-secret-provider-class |
--help: Prints the command-line argument help message.
--config: Path to a Sherlock configuration file in which the arguments above may be specified. Config arguments in the file override command-line arguments.
--version: Version of sherlock.jar to display on the UI.
--egads-config-filename: Path to a custom EGADS configuration file. If none is specified, the default configuration is used.
--port: Port on which to host the Spark application.
--interval-minutes: Number of historic data points to use for detection on per-minute time-series.
--interval-hours: Number of historic data points to use for detection on hourly time-series.
--interval-days: Number of historic data points to use for detection on daily time-series.
--interval-weeks: Number of historic data points to use for detection on weekly time-series.
--interval-months: Number of historic data points to use for detection on monthly time-series.
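As a rough illustration of how the --interval-* values translate into a Druid query interval, the Python sketch below computes a look-back window from a number of data points and a granularity. It assumes one data point per granularity step and is not taken from Sherlock's source; `lookback_interval`, `STEP`, and `DEFAULT_POINTS` are hypothetical names, and monthly granularity is left out because calendar months have no fixed length.

```python
from datetime import datetime, timedelta, timezone

# Length of one data point per granularity (months omitted: variable length).
STEP = {
    "minute": timedelta(minutes=1),
    "hour": timedelta(hours=1),
    "day": timedelta(days=1),
    "week": timedelta(weeks=1),
}

# Default point counts from the argument table.
DEFAULT_POINTS = {"minute": 180, "hour": 672, "day": 28, "week": 12}


def lookback_interval(end, granularity, points=None):
    """Return an ISO-8601 'start/end' interval covering `points` steps before `end`."""
    n = DEFAULT_POINTS[granularity] if points is None else points
    start = end - n * STEP[granularity]
    return f"{start.isoformat()}/{end.isoformat()}"


end = datetime(2017, 10, 12, tzinfo=timezone.utc)
print(lookback_interval(end, "day"))
print(lookback_interval(end, "hour"))
```

With the defaults, the hourly window (672 points) and the daily window (28 points) both span 28 days.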
--enable-email: Enables the email service, which lets users receive anomaly report notifications by email.
--from-mail: The handle's FROM email address, displayed to email recipients.
--reply-to: The handle's REPLY TO email address, where replies will be sent.
--smtp-host: The email service's SMTP host.
--smtp-port: The email service's SMTP port. The default value is 25.
--smtp-user: The email service's SMTP user.
--smtp-password: The email service's SMTP password.
--failure-email: A dedicated email address which may be set to receive job failure notifications.
--execution-delay: Sherlock periodically pings Redis to check scheduled jobs. This sets the ping delay in seconds. Jobs are scheduled with a precision of one minute.
--valid-domains: A comma-separated list of valid domains to receive emails, e.g. 'yahoo,gmail,hotmail'. If specified, Sherlock restricts email recipients to these domains.
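The domain restriction can be pictured with the following Python sketch. How Sherlock actually matches an address against the list is not stated here, so the rule used below (compare the first label of the address's host part, e.g. `yahoo` in `user@yahoo.com`) is an assumption, as are the helper names `parse_valid_domains` and `is_allowed_recipient`.

```python
def parse_valid_domains(arg):
    """Parse a comma-separated --valid-domains value into a set of lowercase names."""
    if not arg:
        return set()
    return {d.strip().lower() for d in arg.split(",") if d.strip()}


def is_allowed_recipient(email, valid_domains):
    """Return True if `email` may receive reports under the given restriction.

    Assumption: an empty set means no restriction, and a match is made on
    the first label of the host part of the address.
    """
    if not valid_domains:
        return True
    _, sep, host = email.rpartition("@")
    if not sep or not host:
        return False  # not a usable address
    return host.lower().split(".")[0] in valid_domains


domains = parse_valid_domains("yahoo,gmail,hotmail")
print(is_allowed_recipient("user@yahoo.com", domains))
print(is_allowed_recipient("user@example.org", domains))
```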
--redis-host: The Redis backend hostname.
--redis-port: The Redis backend port.
--redis-ssl: Whether Sherlock should connect to Redis via SSL.
--redis-timeout: The Redis connection timeout.
--redis-password: The password to use when authenticating to Redis.
--redis-clustered: Whether the Redis backend is a cluster.
--project-name: Name of the project to display on the UI.
--external-file-path: Path to external files for the Spark framework.
--debug-mode: Debug mode enables debug routes, e.g. '/DatabaseJson' (shows Redis data as a JSON dump). See com.yahoo.sherlock.App for more details.
--timeseries-completeness: Minimum percentage of data points that must be present in a time-series for it to be considered valid; otherwise Sherlock ignores the time-series. (default 60, i.e. a fraction of 0.6)
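A completeness check of this kind can be sketched in a few lines of Python. This is an illustration under assumptions rather than Sherlock's implementation: missing points are represented as `None`, the threshold comparison is taken to be inclusive, and `is_complete` is a hypothetical name.

```python
def is_complete(values, expected_points, min_percent=60):
    """Return True if enough of the expected data points are present.

    `values` holds one entry per expected point, with None marking a
    missing point. Integer arithmetic avoids floating-point rounding.
    """
    if expected_points <= 0:
        return False
    present = sum(1 for v in values if v is not None)
    return present * 100 >= min_percent * expected_points


# 28 expected daily points: 17 present passes (about 60.7%), 16 does not (about 57.1%).
print(is_complete([1.0] * 17 + [None] * 11, expected_points=28))
print(is_complete([1.0] * 16 + [None] * 12, expected_points=28))
```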
--http-client-timeout: HTTP client timeout in milliseconds. (default 20000)
--nodata-on-failure: Whether a job should be set to NODATA rather than ERROR following a Druid query failure. (default false)
--backup-redis-db-path: File path at which to back up the Redis DB as a JSON dump of indices and objects. The backup runs daily at midnight. The default is null, i.e. no backup; however, the BGSAVE command is still run at midnight to save a local Redis dump.
--druid-brokers-list-file: Path to an access control list file of Druid broker hosts permitted for querying. Format: <host1>:<port>,<host2>:<port>... (default null, i.e. any host is allowed)
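The broker list format `<host1>:<port>,<host2>:<port>...` is simple enough to parse by hand. The Python sketch below shows one way to read such a list and check a broker against it; the function names and the example hostnames are hypothetical, and treating `None` as "any host is allowed" mirrors the documented default rather than Sherlock's actual code.

```python
def parse_broker_list(text):
    """Parse 'host1:port,host2:port...' into a set of (host, port) pairs."""
    allowed = set()
    for entry in text.split(","):
        entry = entry.strip()
        if not entry:
            continue  # tolerate trailing commas and newlines
        host, sep, port = entry.rpartition(":")
        if not sep or not host or not port.isdigit():
            raise ValueError(f"malformed broker entry: {entry!r}")
        allowed.add((host.lower(), int(port)))
    return allowed


def is_broker_allowed(host, port, allowed):
    """Return True if the broker may be queried; None means no restriction."""
    if allowed is None:
        return True
    return (host.lower(), port) in allowed


# Example hostnames are placeholders.
allowed = parse_broker_list("broker1.example.com:8082, broker2.example.com:8082\n")
print(is_broker_allowed("broker1.example.com", 8082, allowed))
print(is_broker_allowed("other.example.com", 8082, allowed))
```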
--truststore-path: Path to the truststore for mTLS connections. (default null)
--truststore-type: Truststore type for mTLS connections. (default jks)
--truststore-password: Truststore password for mTLS connections. (default null)
--keystore-path: Path to the keystore for mTLS connections. (default null)
--keystore-type: Keystore type for mTLS connections. (default jks)
--keystore-password: Keystore password for mTLS connections. (default null)
--key-dir: Directory containing multiple keys (for different clusters) for mTLS connections. (default null) This is used when Principal Name is given in the Druid cluster form. Sherlock looks for a filename containing the Principal Name under this directory. If the --key-dir and --cert-dir values are the same, the filename should also contain the identifier key for a private key file and cert for a public key file.
--cert-dir: Directory containing multiple certs (for different clusters) for mTLS connections. (default null) This is used when Principal Name is given in the Druid cluster form. Sherlock looks for a filename containing the Principal Name under this directory. If the --key-dir and --cert-dir values are the same, the filename should also contain the identifier key for a private key file and cert for a public key file.
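The file lookup described for --key-dir and --cert-dir can be sketched as a directory scan. The Python below is an illustration of the naming rule only, not Sherlock's code: `find_credential_file` is a hypothetical helper, and the example filenames and the principal name `cluster1` are made up.

```python
import tempfile
from pathlib import Path


def find_credential_file(directory, principal, identifier=None):
    """Return the first file whose name contains `principal`.

    When keys and certs share a directory, pass identifier="key" or
    identifier="cert" so that the name must contain that marker too.
    Returns None if nothing matches.
    """
    for path in sorted(Path(directory).iterdir()):
        if not path.is_file() or principal not in path.name:
            continue
        if identifier is None or identifier in path.name:
            return path
    return None


# Demo with a shared key/cert directory (filenames are placeholders).
with tempfile.TemporaryDirectory() as shared_dir:
    for name in ("cluster1.key.pem", "cluster1.cert.pem", "cluster2.key.pem"):
        (Path(shared_dir) / name).touch()
    key_name = find_credential_file(shared_dir, "cluster1", "key").name
    cert_name = find_credential_file(shared_dir, "cluster1", "cert").name
    missing = find_credential_file(shared_dir, "cluster3", "key")

print(key_name, cert_name, missing)
```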
--https-hostname-verification: Enables or disables HTTPS hostname verification for mTLS connections. (default true, i.e. hostname verification enabled)
--custom-ssl-context-provider-class: Custom SSL context provider class for mTLS connections. (default com.yahoo.sherlock.utils.DefaultSslContextProvider, which returns an SSLContext with validation)
--custom-secret-provider-class: Custom secret provider class for passwords. (default com.yahoo.sherlock.utils.DefaultSecretProvider, which returns secrets specified in CLISettings)
Jigar Patel
Jeff Niu
Josh Walters
Stephan Stiefel, Stephan3555
Code licensed under the GPL v3 License. See LICENSE file for terms.