This plugin for Fluentd allows extracting a single key from an existing record and re-parsing it with a supplied format. A new event is then emitted, with the record modified by the now-decoded key's value.
out_burrow is designed to allow post-facto re-parsing of nested key elements.
For example, lets say your source application writes to syslog, but instead of plain string messages, it writes JSON encoded data. /var/log/syslog contains the following entry:
Jun 17 21:16:22 app1 5012162: {"event":"csrf_failure","msg":"Invalid transient key for System.","username":"System","userid":"1","ip":"192.34.93.74","method":"GET","domain":"http://timgunter.ca","path":"/dashboard/settings","tags":["csrf","failure"],"accountid":5009392,"siteid":5012162}
In td-agent.conf, you might have something like this to read this event:
<source>
type syslog
port 5140
bind 127.0.0.1
tag raw.app.vanilla.events
</source>
Unfortunately, in_syslog does not understand that the message
field is encoded with JSON, so it escapes all the data
and makes it unusable down the line. If we piped these events to a file, we would see something like this:
2014-06-17T21:16:22Z raw.app.vanilla.events.local0.err {"host":"app1","ident":"5012162","message":"{\"event\":\"csrf_failure\",\"msg\":\"Invalid transient key for System.\",\"username\":\"System\",\"userid\":\"1\",\"ip\":\"192.34.93.74\",\"method\":\"GET\",\"domain\":\"http://timgunter.ca\",\"path\":\"/dashboard/authentication\",\"tags\":[\"csrf\",\"failure\"],\"accountid\":5009392,\"siteid\":5012162}"}
Note how the message
field has been escaped. This means that when this event eventually makes its way to a file, or
another system (like elasticsearch for example), it will not be ready for consumption. That's where out_burrow
comes in.
Adding the following match
block to td-agent.conf allows us to intercept the raw syslog events and re-parse the
message field as JSON:
<match raw.app.vanilla.events.**>
type burrow
key_name message
action inplace
remove_prefix raw
format json
</match>
There are several components to this rule, but for now lets look at the output:
2014-06-17T21:16:23Z app.vanilla.events.local0.err {"host":"app1","ident":"5012162","message":{"event":"csrf_failure","msg":"Invalid transient key for System.","username":"System","userid":"1","ip":"192.34.93.74","method":"GET","domain":"http://timgunter.ca","path":"/dashboard/settings/mobilethemes","tags":["csrf","failure"],"accountid":5009392,"siteid":5012162}}
Now the JSON is no longer escaped, and can be easily parsed by both fluentd and elasticsearch.
required
This is the name of the key we want to examine and re-parse, and is required.
required
This is format that Fluentd should expect the key_name
field to be encoded with. out_burrow supports the same built-in
format as Fluent::TextParser (and in_tail):
- apache
- apache2
- nginx
- syslog
- json
- csv
- tsv
- ltsv
optional
When this event is re-emitted, change its tag to this setting's value.
optional
When this event is re-emitted, remove this prefix from the source tag and use the resulting string as the new event's
tag. This setting automatically adds a trailing period .
to its value before stripping.
optional
When this event is re-emitted, prepend this prefix to the source tag and use the resulting string as the new event's tag.
This setting automatically adds a trailing period .
to its value before prepending.
One of the tag
, remove_prefix
, or add_prefix
settings is required.
remove_prefix
and add_prefix
can co-exist together.
optional
and defaults to inplace
The value of this setting determines how the new event will be constructed. There are three distinct options here:
- inplace
Perform decoding 'in place'. When the key_name
field is successfully parsed, its contents will be written back to its
original key in the original record, which will then be re-emitted.
- overlay
Overlay decoded data on top of original record, and re-emit. In our example above, if 'overlay' was used instead of 'inplace', the resulting record would have been:
{
"host":"app1",
"ident":"5012162",
"event":"csrf_failure",
"msg":"Invalid transient key for System.",
"username":"System",
"userid":"1",
"ip":"192.34.93.74",
"method":"GET",
"domain":"http://timgunter.ca",
"path":"/dashboard/settings",
"tags":["csrf","failure"],
"accountid":5009392,
"siteid":5012162
}
- replace
Replace the original entirely with the contents of the decoded field. In our example above, if 'replace' was used instead of 'inplace', the resulting record would have been:
{
"event":"csrf_failure",
"msg":"Invalid transient key for System.",
"username":"System",
"userid":"1",
"ip":"192.34.93.74",
"method":"GET",
"domain":"http://timgunter.ca",
"path":"/dashboard/settings",
"tags":["csrf","failure"],
"accountid":5009392,
"siteid":5012162
}
- prefix
Insert the decoded data in the original record, using the data_prefix
key. In our example above, if 'prefix' was used
instead of 'inplace' with 'data_prefix' = 'data', the resulting record would have been:
{
"host":"app1",
"ident":"5012162",
"data": {
"event":"csrf_failure",
"msg":"Invalid transient key for System.",
"username":"System",
"userid":"1",
"ip":"192.34.93.74",
"method":"GET",
"domain":"http://timgunter.ca",
"path":"/dashboard/settings",
"tags":["csrf","failure"],
"accountid":5009392,
"siteid":5012162
}
}
required
if you use 'action' = 'prefix' (defaults to nil
)
The prefix used to insert the decoded data in the message.
optional
and defaults to false
Keep original source key (only valid with 'overlay' and 'replace' actions). When this is true
, the original encoded
source key is retained in the output.
optional
and defaults to false
Keep the original record's "time" key. If the original top level record contains a
optional
and defaults to time
If keep_time
is true
, this field specifies the key that contains the original records's time. The value of this key
will be copied into the new record after it has been parsed.
optional
and defaults to time
When the key_name
field's value is being parsed, look for this key and interpret it as the record's time
key.
optional
and defaults to nil
When parsing the key_name
field's value and if time_key
is set, this field denotes the format to expect the time
to be in.
- Fork it
- Create your feature branch (
git checkout -b my-new-feature
) - Commit your changes (
git commit -am 'Add some feature'
) - Push to the branch (
git push origin my-new-feature
) - Create new Pull Request
If you have a question, open an Issue.