feat: add perf-counter for backup request #419
src/dist/replication/lib/replica.cpp
Outdated
@@ -166,6 +170,8 @@ void replica::on_client_read(dsn::message_ex *request)
response_client_read(request, ERR_INVALID_STATE);
return;
}
} else {
_counter_table_level_backup_request_qps->increment();
Does this not (or can it not) distinguish request types (get, multi_get, ...)?
If we need to distinguish get/multi_get, we should add the perf-counter in pegasus, not rdsn. But in pegasus, we can't tell whether a request is a backup request or not.
There is still one way to implement it, the same way the table-level latency counters do it. I think it is too complicated.
Well, if it can be implemented, we should consider whether to distinguish request types, and I think it is necessary.
Why is it necessary? I think what we need is the total QPS of backup requests (which represents the added server load); the request type is not very important.
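The single-counter position in this thread can be illustrated with a minimal sketch. Every name below (`read_type`, `read_request`, `table_counters`) is a hypothetical stand-in and not an rdsn or pegasus type; the sketch only assumes that the backup flag is visible at the layer where the counter lives, and it keeps a raw count rather than a rate.

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical stand-ins for illustration; these are not the rdsn types.
enum class read_type { get, multi_get };

struct read_request {
    read_type type;
    bool is_backup; // assumed to be known only at the replication layer
};

struct table_counters {
    // Total number of backup reads seen for one table.
    std::atomic<std::uint64_t> backup_requests{0};

    // Count every backup read once, regardless of its request type.
    void on_read(const read_request &req)
    {
        if (req.is_backup) {
            backup_requests.fetch_add(1, std::memory_order_relaxed);
        }
    }
};
```

Splitting the count by request type would need one counter per `read_type` value at a layer that can see both the type and the backup flag, which is the complication raised above.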
src/dist/replication/test/replica_test/unit_test/replica_test.cpp
Outdated
Co-Authored-By: Wu Tao <[email protected]>
This new counter is still not enough. If the purpose is to prevent ... So on the collector, we can aggregate the backup request QPS of all servers. This QPS can be compared with the read QPS on the primary to calculate the "backup request ratio". If the ratio is larger than 50%, we can suggest that the user reduce the delay time. For now, the collector collects metrics only on primaries. We still need some work on the collector.
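The ratio described above could be computed on the collector roughly as follows. This is a sketch under stated assumptions: `server_sample` and the function names are hypothetical and not part of the collector, and the only value taken from the comment is the 50% threshold.

```cpp
#include <numeric>
#include <vector>

// Hypothetical per-server sample; the field name is not from rdsn.
struct server_sample {
    double backup_request_qps; // value of the new counter reported by one server
};

// Sum the backup-request QPS reported by every server hosting the table.
inline double total_backup_request_qps(const std::vector<server_sample> &samples)
{
    return std::accumulate(samples.begin(), samples.end(), 0.0,
                           [](double acc, const server_sample &s) {
                               return acc + s.backup_request_qps;
                           });
}

// Backup-request QPS divided by read QPS on the primary.
// Returns 0 when the primary reports no reads, to avoid dividing by zero.
inline double backup_request_ratio(double backup_qps, double primary_read_qps)
{
    return primary_read_qps > 0.0 ? backup_qps / primary_read_qps : 0.0;
}

// Threshold from the comment above: flag the table when the ratio exceeds 50%.
inline bool exceeds_ratio_threshold(double ratio) { return ratio > 0.5; }
```

For example, three servers reporting 100, 50, and 150 backup requests per second against 400 primary reads per second give a ratio of 0.75, which is above the threshold.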
Can falcon do this work?
@levy5307 No such method currently.
Add a perf-counter to record the QPS of backup requests for each app.
Related issue: apache/incubator-pegasus#251