feat: development of Loggly logging plugin #6113
Conversation
apisix/plugins/slslog/rfc5424.lua
Outdated
local _M = { version = 0.1 }
local _M = {
    version = 0.1,
    severity = Severity
why we need to update this code?
One PR for one thing, so I think we need to revert this change.
Hi, thanks for the code review and the concern. The current PR needs access to the rfc5424 severity map to decode the alert status into an integer:
local message = {
-- facility LOG_USER - random user level message
"<".. tostring(8 + severity[conf.severity]) .. ">1", -- 1
timestamp, -- timestamp
ctx.var.host or "-", -- hostname
"apisix", -- appname
ctx.var.pid, -- proc-id
"-", -- msgid
"[" .. conf.customer_token .. "@41058 " .. tab_concat(taglist, " ") .. "]",
json_str
}
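For context, the PRI field in that header is derived per RFC 5424 as facility * 8 + severity, and facility LOG_USER (1) gives the `8 + severity[...]` seen above. A minimal sketch, assuming a local copy of the severity map (names follow RFC 5424; `pri` is a hypothetical helper, not code from the PR):

```lua
-- RFC 5424 severity values (0 = most severe)
local severity = {
    emerg = 0, alert = 1, crit = 2, err = 3,
    warning = 4, notice = 5, info = 6, debug = 7,
}

-- PRI = facility * 8 + severity; facility LOG_USER (1) yields
-- the "8 + severity" term used in the message header above
local function pri(level)
    return "<" .. tostring(8 + severity[level]) .. ">1"
end

print(pri("warning"))  -- <12>1
```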
I had exported that field from the module for this. Okay, I will revert the change and keep a local copy of the severity map inside the plugin instead. Thank you.
Signed-off-by: Bisakh Mondal <[email protected]>
Force-pushed from 6b8fe38 to a3e4559
            "debug", "info", "notice", "warning", "err", "crit", "alert", "emegr"},
        description = "base severity log level",
    },
    include_req_body = {type = "boolean", default = false},
Could we refactor it to avoid adding the same group of fields to each logger?
And you missed calling log_util.collect_body(conf, ctx).
Could we refactor it to avoid adding the same group of fields to each logger?
Yes, similar to what we did to batchprocessor. Let's handle that in the next PR.
RE collect_body: Good catch. Adding, Thanks
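Such a refactor could look roughly like this (a sketch only, not the actual follow-up; `common_fields` and `with_common` are hypothetical names):

```lua
-- Hypothetical shared table of schema fields that every logger repeats
local common_fields = {
    include_req_body = {type = "boolean", default = false},
    include_resp_body = {type = "boolean", default = false},
}

-- Merge the shared fields into a logger-specific properties table
local function with_common(properties)
    for k, v in pairs(common_fields) do
        properties[k] = v
    end
    return properties
end

-- e.g. the Loggly schema would then only declare its own fields
local loggly_properties = with_common({
    customer_token = {type = "string"},
})
```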
apisix/plugins/loggly.lua
Outdated
    entry = log_util.get_full_log(ngx, conf)
end

if conf.prefer_name then
It seems we don't need this as we support customizing format?
Ack.
local metadata = plugin.plugin_metadata(plugin_name)
local entry

if metadata and metadata.value.log_format
Let's refactor it in the next PR
yes
Yes, definitely. Do you want this handled in this PR, or as usual in a small follow-up PR with a readme update?
apisix/plugins/loggly.lua
Outdated
local function handle_log(entries)
    local ok, err
    for i = 1, #entries do
        ok, err = send_data_over_udp(entries[i])
do we need to check the return value?
Yes. If the batch processor fails to process the entries (a network error or something like that), the batch will be retried according to the retry policy.
ref:
apisix/apisix/utils/batch-processor.lua
Lines 66 to 89 in e9fab8b
function execute_func(premature, self, batch)
    if premature then
        return
    end

    local ok, err = self.func(batch.entries, self.batch_max_size)
    if not ok then
        core.log.error("Batch Processor[", self.name,
                       "] failed to process entries: ", err)
        batch.retry_count = batch.retry_count + 1
        if batch.retry_count <= self.max_retry_count then
            schedule_func_exec(self, self.retry_delay,
                               batch)
        else
            core.log.error("Batch Processor[", self.name, "] exceeded ",
                           "the max_retry_count[", batch.retry_count,
                           "] dropping the entries")
        end
        return
    end

    core.log.debug("Batch Processor[", self.name,
                   "] successfully processed the entries")
end
However, seeing your comment makes me think that we could migrate to a fail-fast approach. It has other benefits:
for i = 1, #entries do
local ok, err = send_data_over_udp(entries[i])
if not ok then
return false, err
end
end
return true
Thank you.
Looks like we need to remove the sent entries? Otherwise, they will be retried.
Yes, that's an issue here. The way the batch processor is written, it is meant to consume each batch in one go (transactionally). But in the case of Loggly, only bulk events can be sent through HTTP endpoints. So I can think of three potential resolutions:
- if err and i == 1 -> return false, else true | if some entries have been sent, that's okay; we lose the rest [some entries will be dropped]
- Keep the current behaviour, where some entries might be retried again [duplication]
- Update the batch processor to have this fine-grained control:
apisix/apisix/utils/batch-processor.lua
Lines 66 to 89 in a813b5a
(the same execute_func shown above)
update
local ok, err = self.func(batch.entries, self.batch_max_size)
-- to this
local ok, err, n = self.func(batch.entries, self.batch_max_size)
-- shorten self.entries based on n
which one do you think would be good? cc @spacewander
here n denotes the number of processed entries, and the rest of the entries should be retried.
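Under that third option, the batch processor would drop the first `n` entries before scheduling a retry. A sketch of that truncation (`drop_processed` is a hypothetical helper, not part of the current batch-processor code):

```lua
-- Keep only the unsent tail of a batch so a retry does not resend
-- entries that already reached Loggly
local function drop_processed(entries, n)
    local rest = {}
    for i = n + 1, #entries do
        rest[#rest + 1] = entries[i]
    end
    return rest
end

local rest = drop_processed({"e1", "e2", "e3"}, 1)
-- rest now holds only the entries that still need to be sent
```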
cc @spacewander
I suggest doing it in the next PR.
--- grep_error_log eval
qr/message received: [ -~]+/
--- grep_error_log_out eval
qr/message received: <10>1 [\d\-T:.]+Z [\d.]+ apisix [\d]+ - \[token-1\@41058 ] \{"apisix_latency":[\d.]*,"client_ip":"127\.0\.0\.1","latency":[\d.]*,"request":\{"headers":\{"content-type":"application\/x-www-form-urlencoded","host":"127\.0\.0\.1:1984","user-agent":"lua-resty-http\/[\d.]* \(Lua\) ngx_lua\/[\d]*"\},"method":"GET","querystring":\{\},"size":[\d]+,"uri":"\/opentracing","url":"http:\/\/127\.0\.0\.1:1984\/opentracing"\},"response":\{"headers":\{"connection":"close","content-type":"text\/plain","server":"APISIX\/[\d.]+","transfer-encoding":"chunked"\},"size":[\d]*,"status":200\},"route_id":"1","server":\{"hostname":"[ -~]*","version":"[\d.]+"\},"service_id":"","start_time":[\d]*,"upstream":"127\.0\.0\.1:1982","upstream_latency":[\d]*\}/
We need to configure include_resp_body_expr & include_resp_body to get the body.
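A route-level plugin configuration along those lines might look like this (a sketch; the token value and the expression are illustrative, using the list form APISIX takes for `*_expr` fields):

```json
{
    "plugins": {
        "loggly": {
            "customer_token": "token-1",
            "include_resp_body": true,
            "include_resp_body_expr": [
                ["http_content_type", "==", "text/plain"]
            ]
        }
    }
}
```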
Done. Also added a case for metadata.log_format.
Signed-off-by: Bisakh Mondal [email protected]
What this PR does / why we need it:
#6112
Pre-submission checklist: