-
Notifications
You must be signed in to change notification settings - Fork 370
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add ovs meter metric #5165
Add ovs meter metric #5165
Conversation
@mengdie-song please add unit test for this PR, the patch unit test coverage is quite low. |
bf22d1e
to
8ee19cb
Compare
8ee19cb
to
257cab9
Compare
6690e28
to
63bdabb
Compare
63bdabb
to
f7ee739
Compare
test/e2e/prometheus_test.go
Outdated
@@ -44,6 +44,7 @@ var antreaAgentMetrics = []string{ | |||
"antrea_agent_ovs_flow_ops_error_count", | |||
"antrea_agent_ovs_flow_ops_latency_milliseconds", | |||
"antrea_agent_ovs_total_flow_count", | |||
"antrea_agent_ovs_ovs_meter_packet_count", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it this correct? two "ovs"?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice catch, thanks. It seems that prometheus e2e test is disabled by default on CI. I will double check this one and update.
f7ee739
to
79b6a38
Compare
79b6a38
to
a05c1fc
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Go CI jobs are failing because of build errors
a05c1fc
to
598c43b
Compare
handleMeterStatsReply := func(packetCounts map[int]int64) { | ||
for k, v := range packetCounts { | ||
switch k { | ||
case 1: | ||
metrics.OVSMeterPacketDroppedCount.WithLabelValues("PacketInMeterNetworkPolicy").Set(float64(v)) | ||
case 2: | ||
metrics.OVSMeterPacketDroppedCount.WithLabelValues("PacketInMeterTraceflow").Set(float64(v)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't understand why we are using Prometheus metrics at all in this unit test. This handleMeterStatsReply
callback function does not need to use metrics. This file is testing the pkg/ovs/openflow
package, which has nothing to do with metrics. Instead you could use a map protected by a mutex, or a channel.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I removed the part of checking Prometheus metrics and only checked the packet count in the latest change. Could you help take a look at the latest version?
598c43b
to
7d5b606
Compare
7d5b606
to
8f43cda
Compare
8f43cda
to
f15d5f7
Compare
} | ||
b.mpReplyChsMutex.RUnlock() | ||
b.MultipartReply(sw, mpMeterStatsReply) | ||
time.Sleep(time.Second) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is nondeterministic and may be flaky. And having the assertion in the callback can't guarantee it will be called. Even if the tested function does nothing, the test would't fail.
Perhaps you can use a map to collect the received stats and use assert.Eventually
to check whether expected stats are received in a reasonable time.
f15d5f7
to
259977e
Compare
packetCounts := make(map[int]int64) | ||
checkMeterStatsReply := func(meterId int, packetCount int64) { | ||
packetCounts[meterId] = packetCount | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This needs mutex or you can use:
var receivedCount1 int64
checkMeterStatsReply := func(meterID int, packetCount int64) {
assert.Equal(t, 1, meterID)
atomic.StoreInt64(&receivedCount1, packetCount)
}
...
assert.Eventually(t, func() bool {
return int64(100) == atomic.LoadInt64(&receivedCount1)
}, ...
Note it's not expected to receive other meterID 2 given your code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks! I would like to check both meters and have added the mutex.
259977e
to
c481770
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
/test-all |
No comment from my side, thanks. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
We have implemented rate-limiting for packet-in messages on NetworkPolicy audit logging and Traceflow. This change adds a metric to show the packet count which is got from meter statistics. A separate goroutine is used here to get the statistics every 30 seconds and collect the metric. The value more than 0 indicates that current rate exceeds predefined limit(100 per second). Fixes: antrea-io#5037 Signed-off-by: Mengdie Song <[email protected]>
c481770
to
417dd54
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
/test-all |
As @antoninbas's last comment has been addressed, I'm going to merge the PR given the tight schedule. If there is any new comment after it's merged, we can follow up with new PR. |
We have implemented rate-limiting for packet-in messages on NetworkPolicy audit logging and Traceflow.
This change adds a metric to show the packet count which is got from meter statistics. A separate goroutine is used here to get the statistics every 30 seconds and collect the metric. The value more than 0 indicates that current rate exceeds predefined limit(100 per second).
Fixes: #5037
Signed-off-by: Mengdie Song [email protected]