feat: add perf-counter for backup request #419

levy5307 · 2020-03-18T05:44:03Z

Add a perf-counter that tracks the QPS of backup requests for each app.

Related issue: apache/incubator-pegasus#251

foreverneverer · 2020-03-18T06:29:51Z

src/dist/replication/lib/replica.cpp

@@ -166,6 +170,8 @@ void replica::on_client_read(dsn::message_ex *request)
            response_client_read(request, ERR_INVALID_STATE);
            return;
        }
+    } else {
+        _counter_table_level_backup_request_qps->increment();


Does this not (or can it not) distinguish request types (get, multi_get, ...)?

If we need to distinguish get/multi_get, the perf-counter has to be added in pegasus, not rdsn. But pegasus cannot tell whether a request is a backup request or not.
There is still one way to implement it, similar to how the table-level latency counters work, but I think it is too complicated.

Well, if it can be implemented, we should consider whether to distinguish request types, and I think it is necessary.

Why is it necessary? I think what we need is the total QPS of backup requests (that is, the added server load); the request type is not very important.
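The approach discussed in this thread (one table-level counter for all backup requests, independent of request type) can be sketched as follows. This is a minimal illustration, not the rdsn implementation: `rate_counter`, `read_type`, and `on_read` are hypothetical names, and the real perf-counter API may differ.

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical stand-in for a rate-type perf-counter (not the rdsn API).
class rate_counter
{
public:
    void increment() { _count.fetch_add(1, std::memory_order_relaxed); }

    // Returns events per second over the elapsed window and resets the count.
    double take_rate(double elapsed_seconds)
    {
        const std::uint64_t n = _count.exchange(0, std::memory_order_relaxed);
        return elapsed_seconds > 0 ? static_cast<double>(n) / elapsed_seconds : 0.0;
    }

private:
    std::atomic<std::uint64_t> _count{0};
};

// Hypothetical request kinds, present only to show that the counter ignores them.
enum class read_type { get, multi_get, scan };

// Hypothetical read handler: a single counter is bumped for every backup
// request, whatever its read_type.
inline void on_read(read_type /*type*/, bool is_backup_request, rate_counter &backup_qps)
{
    if (is_backup_request) {
        backup_qps.increment();
    }
    // ... serve the read ...
}
```

Splitting the counter per request type would need one `rate_counter` per `read_type` value, which is the extra complexity the thread weighs against the benefit.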

src/dist/replication/test/replica_test/unit_test/replica_test.cpp

Co-Authored-By: Wu Tao <[email protected]>

neverchanje · 2020-03-19T02:50:26Z

This new counter is still not enough. If the purpose is to prevent backup_request_delay_ms from being configured too low to actually reduce tail latency, we need to observe the "additional load" at the table level.

So on the collector, we can aggregate the backup request QPS of all servers. This QPS can be compared with the read QPS on the primary to calculate the "backup request ratio". If the ratio is larger than 50%, we can suggest the user reduce the delay time.

For now, the collector collects metrics only on primaries, so the collector still needs some work.
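The "backup request ratio" described above can be sketched as a small calculation. This is an illustration under assumptions: the function names and the way per-server values reach the collector are hypothetical, and only the 50% threshold comes from the comment.

```cpp
#include <numeric>
#include <vector>

// Sum of the backup-request QPS reported by every server for one table.
inline double total_backup_qps(const std::vector<double> &per_server_backup_qps)
{
    return std::accumulate(per_server_backup_qps.begin(), per_server_backup_qps.end(), 0.0);
}

// Ratio of aggregated backup-request QPS to the read QPS on the primary.
// Returns 0 when there is no read traffic, to avoid dividing by zero.
inline double backup_request_ratio(const std::vector<double> &per_server_backup_qps,
                                   double primary_read_qps)
{
    if (primary_read_qps <= 0.0) {
        return 0.0;
    }
    return total_backup_qps(per_server_backup_qps) / primary_read_qps;
}

// True when the ratio exceeds the threshold (50% in the comment above), which
// signals that backup_request_delay_ms deserves a second look.
inline bool ratio_exceeds_threshold(double ratio, double threshold = 0.5)
{
    return ratio > threshold;
}
```

For example, three servers reporting 100, 100, and 50 backup requests per second against 1000 reads per second on the primary give a ratio of 0.25, which stays under the threshold.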

src/dist/replication/lib/replica.cpp

levy5307 · 2020-03-19T03:02:19Z

> This new counter is still not enough. If the purpose is to prevent backup_request_delay_ms from being configured too low to actually reduce tail latency, we need to observe the "additional load" at the table level.
>
> So on the collector, we can aggregate the backup request QPS of all servers. This QPS can be compared with the read QPS on the primary to calculate the "backup request ratio". If the ratio is larger than 50%, we can suggest the user reduce the delay time.
>
> For now, the collector collects metrics only on primaries, so the collector still needs some work.

Can Falcon do this work?

neverchanje · 2020-03-19T05:01:14Z

@levy5307 No, there is currently no way to do that.

zhaoliwei added 3 commits March 18, 2020 12:42

add perf counter

8859f61

fix

a9792c5

Merge branch 'master' into backup-request-perf-counter

61e27d2

foreverneverer reviewed Mar 18, 2020


neverchanje reviewed Mar 19, 2020


src/dist/replication/test/replica_test/unit_test/replica_test.cpp (outdated)

levy5307 and others added 2 commits March 19, 2020 09:59

Update src/dist/replication/test/replica_test/unit_test/replica_test.cpp

0fd9eaa

Co-Authored-By: Wu Tao <[email protected]>

fix

56a7e05

neverchanje reviewed Mar 19, 2020


src/dist/replication/lib/replica.cpp (outdated)

fix

7070f8f

Merge branch 'master' into backup-request-perf-counter

3f6d870

acelyc111 approved these changes Mar 19, 2020


neverchanje approved these changes Mar 19, 2020


levy5307 merged commit 2ed6d36 into XiaoMi:master Mar 19, 2020

neverchanje mentioned this pull request May 14, 2020

Release 2.0.0 apache/incubator-pegasus#536

Closed

neverchanje added the type/perf-counter label May 14, 2020

levy5307 deleted the backup-request-perf-counter branch May 26, 2020 06:03

neverchanje added the 2.0.0 label Jun 5, 2020

neverchanje mentioned this pull request Jun 10, 2020

Release 1.12.4 apache/incubator-pegasus#547

Closed
