Merge pull request #81 from yaauie/encode-performance
Improve encoding performance
yaauie committed Apr 9, 2020
2 parents ca80e3e + 9b74838 commit 2a8edcd
Showing 3 changed files with 32 additions and 35 deletions.
1 change: 1 addition & 0 deletions CHANGELOG.md
@@ -1,4 +1,5 @@
 ## 6.1.1
+  - Improved encoding performance, especially when encoding many extension fields [#81](https://github.com/logstash-plugins/logstash-codec-cef/pull/81)
   - Fixed CEF short to long name translation for ahost/agentHostName field, according to documentation [#75](https://github.com/logstash-plugins/logstash-codec-cef/pull/75)

 ## 6.0.1
64 changes: 30 additions & 34 deletions lib/logstash/codecs/cef.rb
@@ -215,6 +215,28 @@ class LogStash::Codecs::CEF < LogStash::Codecs::Base
   # Cache of a scanner pattern that _captures_ extension field key/value pairs
   EXTENSION_KEY_VALUE_SCANNER = /(#{EXTENSION_KEY_PATTERN})=(#{EXTENSION_VALUE_PATTERN})\s*/

+  ##
+  # @see CEF#sanitize_header_field
+  HEADER_FIELD_SANITIZER_MAPPING = {
+    "\\" => "\\\\",
+    "|" => "\\|",
+    "\n" => " ",
+    "\r" => " ",
+  }
+  HEADER_FIELD_SANITIZER_PATTERN = Regexp.union(HEADER_FIELD_SANITIZER_MAPPING.keys)
+  private_constant :HEADER_FIELD_SANITIZER_MAPPING, :HEADER_FIELD_SANITIZER_PATTERN
+
+  ##
+  # @see CEF#sanitize_extension_val
+  EXTENSION_VALUE_SANITIZER_MAPPING = {
+    "\\" => "\\\\",
+    "=" => "\\=",
+    "\n" => "\\n",
+    "\r" => "\\n",
+  }
+  EXTENSION_VALUE_SANITIZER_PATTERN = Regexp.union(EXTENSION_VALUE_SANITIZER_MAPPING.keys)
+  private_constant :EXTENSION_VALUE_SANITIZER_MAPPING, :EXTENSION_VALUE_SANITIZER_PATTERN
+
   CEF_PREFIX = 'CEF:'.freeze

   public
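The constants in this hunk pair a replacement table with a pattern built from the table's keys. As a rough, standalone sketch of how that pairing behaves (the names `MAPPING`, `PATTERN`, and `sanitize_header` are hypothetical stand-ins, not part of the plugin), `Regexp.union` escapes each string key so that `|` and `\` match literally, and `String#gsub` with a Hash argument looks up each match in the table:

```ruby
# Hypothetical stand-ins that mirror the header-field constants in this hunk.
MAPPING = {
  "\\" => "\\\\", # backslash -> escaped backslash
  "|"  => "\\|",  # pipe      -> escaped pipe
  "\n" => " ",    # newlines are not allowed in header fields
  "\r" => " ",
}.freeze

# Regexp.union escapes string arguments, so "|" is a literal pipe here,
# not the regexp alternation operator.
PATTERN = Regexp.union(MAPPING.keys)

# Illustrative helper only; every match is replaced by its table entry.
def sanitize_header(value)
  value.to_s.gsub(PATTERN, MAPPING)
end

puts sanitize_header("a|b\\c\nd")
```

With a Hash as the second argument, the looked-up value is inserted as-is, so the doubled backslashes in the table are not reinterpreted as back-references.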
@@ -348,51 +370,25 @@ def encode(event)
   # Escape pipes and backslashes in the header. Equal signs are ok.
   # Newlines are forbidden.
   def sanitize_header_field(value)
-    output = String.new
-
-    value = value.to_s.gsub(/\r\n/, "\n")
-
-    value.each_char{|c|
-      case c
-      when "\\", "|"
-        output << "\\#{c}"
-      when "\n", "\r"
-        output << " "
-      else
-        output << c
-      end
-    }
-
-    return output
+    value.to_s
+         .gsub("\r\n", "\n")
+         .gsub(HEADER_FIELD_SANITIZER_PATTERN, HEADER_FIELD_SANITIZER_MAPPING)
   end

   # Keys must be made up of a single word, with no spaces
   # must be alphanumeric
   def sanitize_extension_key(value)
-    value = value.to_s.gsub(/[^a-zA-Z0-9]/, "")
-    return value
+    value.to_s
+         .gsub(/[^a-zA-Z0-9]/, "")
   end

   # Escape equal signs in the extensions. Canonicalize newlines.
   # CEF spec leaves it up to us to choose \r or \n for newline.
   # We choose \n as the default.
   def sanitize_extension_val(value)
-    output = String.new
-
-    value = value.to_s.gsub(/\r\n/, "\n")
-
-    value.each_char{|c|
-      case c
-      when "\\", "="
-        output << "\\#{c}"
-      when "\n", "\r"
-        output << "\\n"
-      else
-        output << c
-      end
-    }
-
-    return output
+    value.to_s
+         .gsub("\r\n", "\n")
+         .gsub(EXTENSION_VALUE_SANITIZER_PATTERN, EXTENSION_VALUE_SANITIZER_MAPPING)
   end

   def get_value(fieldname, event)
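This hunk swaps a per-character loop for two chained `gsub` calls. A small sketch, assuming standalone copies of both versions of the extension-value sanitizer, can check that they agree on a few inputs. The names `EXT_MAPPING`, `EXT_PATTERN`, `sanitize_ext_loop`, and `sanitize_ext_gsub` are hypothetical and exist only for this comparison, and the sample list is illustrative rather than exhaustive:

```ruby
# Hypothetical copies of the extension-value table and pattern from this commit.
EXT_MAPPING = {
  "\\" => "\\\\",
  "="  => "\\=",
  "\n" => "\\n",
  "\r" => "\\n", # a lone CR is canonicalized to an escaped \n
}.freeze
EXT_PATTERN = Regexp.union(EXT_MAPPING.keys)

# Pre-change approach: append to the output one character at a time.
def sanitize_ext_loop(value)
  output = String.new
  value.to_s.gsub(/\r\n/, "\n").each_char do |c|
    case c
    when "\\", "=" then output << "\\#{c}"
    when "\n", "\r" then output << "\\n"
    else output << c
    end
  end
  output
end

# Post-change approach: collapse CRLF first, then one table-driven gsub pass.
def sanitize_ext_gsub(value)
  value.to_s.gsub("\r\n", "\n").gsub(EXT_PATTERN, EXT_MAPPING)
end

# ASCII-only samples, including non-string inputs handled via to_s.
samples = ["a=b", "path\\to", "line1\r\nline2", "lone\rcr", 42, nil, ""]
samples.each do |s|
  unless sanitize_ext_loop(s) == sanitize_ext_gsub(s)
    raise "mismatch for #{s.inspect}"
  end
end

puts sanitize_ext_gsub("k=v\r\nnext")
```

Collapsing `\r\n` before the table pass matters in both versions: without it, a CRLF pair would produce two escaped newlines instead of one.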
2 changes: 1 addition & 1 deletion logstash-codec-cef.gemspec
@@ -1,7 +1,7 @@
 Gem::Specification.new do |s|

   s.name = 'logstash-codec-cef'
-  s.version = '6.1.0'
+  s.version = '6.1.1'
   s.platform = 'java'
   s.licenses = ['Apache License (2.0)']
   s.summary = "Reads the ArcSight Common Event Format (CEF)."
