
Unclear how to use drop_result. #22

Open
ovanes opened this issue Mar 19, 2019 · 13 comments

Comments

@ovanes

ovanes commented Mar 19, 2019

The documentation states that it's possible to use drop_result as part of the parsing policy: https://github.com/basiliscos/cpp-bredis#parse_result_titerator-policy

However, it's pretty unclear how to use it with the Connection object.

I read through the source code, and Connection seems to have keep_result hard-coded as its policy type. How would I use drop_result with a Connection object?

Do I understand the intention of drop_result correctly: it causes the parser to verify that the response is not an error, while the payload itself is dropped?

@basiliscos
Owner

Thank you for your query.

Indeed, it seems the drop_result policy wasn't properly exposed in the API.

What is your use case for discarding the result?

@basiliscos
Owner

PS: I'll try to fix it over the weekend.

@ovanes
Author

ovanes commented Mar 19, 2019

Thanks for the quick reply! My use case is pretty simple: I'd like to performance-test Redis in a similar way to redis-benchmark, but with more customized control. Obviously, I'd like the client to lose as little time on parsing as possible, i.e. just verify that the response that came back is «valid» (i.e. its type is retrievable) and drop the remaining data. After my investigation, I see bredis as a perfect fit.

@basiliscos
Owner

Could you please try PR #24?

Basic usage:

using Policy = r::parsing_policy::drop_result;
...
c.async_read(rx_buff, read_callback, count, Policy{}); 

The last two parameters are mandatory for your purpose; count can be equal to 1, as in the defaults.

Also, please share your benchmark results.

I'm still not completely convinced that the PR should be merged, but I see another use case for the drop_result policy: a background ping / Redis keep-alive thread.

@ovanes
Author

ovanes commented Mar 25, 2019

Thanks a lot for the incredible devotion and quick implementation. I gave the implementation a try; here are my thoughts:

Generally, I'd like to skip the entire result payload but still see what the "high-level result" was. That makes it possible to draw a conclusion about the response, e.g. nil_t -> miss in the case of a GET command. Right now it is not even possible to tell whether the command being processed was of the correct type. I just tried "HGETALL" on a simple string-typed key/value. As a response I received an instance of the specialized positive_parse_result with the field consumed containing 68. I understand that Redis encoded a WRONGTYPE error there, but there is no way to see that the response was an error at all. IMO the response type should be part of the specialized positive_parse_result instance.

@basiliscos
Owner

What you are asking for is a kind of "partial result drop", whereas the drop_result policy discards the result completely, i.e. you either do not care about it at all or are completely sure about it (as with PING).

HGETALL, in your example, returns a non-trivial result (https://redis.io/commands/hgetall).

So, for your purposes, I'd suggest not extracting the results, but scanning the existing markers, like this:

template <typename Iterator>
class not_error : public boost::static_visitor<bool> {
  public:
    // accept any marker type...
    template <typename T> bool operator()(const T &value) const {
        return true;
    }
    // ...except the error marker
    bool operator()(const markers::error_t<Iterator> &value) const {
        return false;
    }
};
...

c.async_read(rx_buff, [&](const auto &error_code, result_t &&r) {
    auto success = boost::apply_visitor(not_error<Iterator>(), r.result);
    if (!success) std::abort();
});

The markers will still be allocated, but they are quite light-weight.

I'll think about the possibility of injecting a custom on-the-fly parsing policy, but that will surely be non-trivial.

@ovanes
Author

ovanes commented Mar 25, 2019

Ivan, thanks a lot for your explanation.

This is exactly what I was looking for. Over the next few days I'll gather some data to give you insight into the performance benefits (if there are any). I'll post it in this thread.

Regarding HGETALL: I probably explained it the wrong way. I used this command on a string-typed Redis key/value as a test to understand how error_t is reported when using the drop_result policy.

@basiliscos
Owner

@ovanes any news so far?

I have updated the performance tests against Redis, and here are my results:

   bredis (commands/s) | bredis(*) (commands/s) | redox (commands/s)
  ---------------------+------------------------+--------------------
        1.59325e+06    |      2.50826e+06       |    0.999375e+06

where (*) are the results with the drop_result policy.

@ovanes
Author

ovanes commented Apr 14, 2019

Sorry for the delay.

Below are my findings:

Test Setup

  • All tests were run on an AWS c4.8xlarge instance.
  • To minimize network latency, Redis and the Redis load-testing application were run on the same host.
  • Redis and the load-testing application were run as Docker containers.
  • All Docker containers had a fixed number of CPUs assigned and were run in real-time mode to prevent the OS scheduler from assigning their CPU cores to other processes.
  • Docker's "host networking" was used to avoid any kind of overlay-networking latency.
  • The tests were also compared with Redis Benchmark.
  • For all Redis tools (redis-benchmark, redis-cli), including the Redis server, the official Docker image v5.0.4 was used: https://hub.docker.com/_/redis
  • The default Amazon Linux AMI was used as the operating system.
  • The OS host was provisioned with a Redis-friendly config:
# echo never > /sys/kernel/mm/transparent_hugepage/enabled
# sysctl vm.overcommit_memory=1
  • Redis was started with the following configuration:
docker run -d --sysctl net.core.somaxconn=4096 --ulimit nofile=10032:10032 --net=host --cap-add=sys_nice --ulimit rtprio=100  --cpuset-cpus 0,1,2,3 --name=redis-server redis:5.0.4 --bind 127.0.0.1 --port 6379 --protected-mode no --save "" --appendonly no --dir ./ --daemonize no --dbfilename ""
  • Redis logs indicated no warnings:
[ec2-user@ip-10-0-0-223 ~]$ docker logs -f redis-server
1:C 14 Apr 2019 17:05:35.998 # oO0OoO0OoO0Oo Redis is starting oO0OoO0OoO0Oo
1:C 14 Apr 2019 17:05:35.998 # Redis version=5.0.4, bits=64, commit=00000000, modified=0, pid=1, just started
1:C 14 Apr 2019 17:05:35.998 # Configuration loaded
1:M 14 Apr 2019 17:05:35.999 * Running mode=standalone, port=6379.
1:M 14 Apr 2019 17:05:35.999 # Server initialized
1:M 14 Apr 2019 17:05:35.999 * Ready to accept connections
  • All tests were run with coroutine-enabled code, but single-threaded, to allow comparison with Redis Benchmark.
  • All tests were pure GET command tests on a prefilled database with 1M keys and payload size of 200 bytes (0% miss rate).
  • There were 2 types of tests run:
    • without pipelines
    • with pipelines (32 commands per pipeline)

Test Results

Notes:

  • TPS stands for transactions per second
  • pN stands for the Nth-percentile latency
  • Pipeline latency was measured for the entire pipeline, not for a single command
  • Redis Benchmark does not provide fixed latency statistics, so latencies matching the table's percentiles were taken where possible
  • Redis Benchmark in the official Docker container v5.0.4 did not provide sub-millisecond latency values.
  • Redis Benchmark commands used for the tests:
# no pipelining -> adjust connection count
docker run -ti --rm --net=host --cap-add=sys_nice --ulimit rtprio=100  --cpuset-cpus 4,5 redis:5.0.4 redis-benchmark  -t get -r 1000000 -n 10000000 -h 127.0.0.1 -c 1 -P 1

# pipelining -> adjust connection count
docker run -ti --rm --net=host --cap-add=sys_nice --ulimit rtprio=100  --cpuset-cpus 4,5 redis:5.0.4 redis-benchmark  -t get -r 1000000 -n 10000000 -h 127.0.0.1 -c 1 -P 2
  • The proprietary tool, written in C++ using bredis, had a similar setup, but additionally a customizable parsing policy.

Actual Results:

  • 1 connection / loopback / 30 seconds / pipeline size: 1
TPS Diff to drop [%] p0 [us] p10 [us] p50 [us] p90 [us] p99 [us] p99.9 [us] p99.99 [us] p100 [us]
bredis: drop 29568 0.00 24 29 32 34 45 52 135
bredis: header parse 29144 -1.43 22 30 33 35 45 53 129
bredis: full parse 28323 -4.21 24 30 33 35 46 55 143
redis benchmark 35845 21.23 0
  • 10 connections / loopback / 30 seconds / pipeline size: 1
TPS Diff to drop [%] p0 [us] p10 [us] p50 [us] p90 [us] p99 [us] p99.9 [us] p99.99 [us] p100 [us]
bredis: drop 97985 0.00 48 96 99 111 126 139 213
bredis: header parse 93941 -4.13 43 101 103 115 129 144 1583
bredis: full parse 87556 -10.64 45 107 110 122 136 150 253
redis benchmark 100055 2.11 0
  • 50 connections / loopback / 30 seconds / pipeline size: 1
TPS Diff to drop [%] p0 [us] p10 [us] p50 [us] p90 [us] p99 [us] p99.9 [us] p99.99 [us] p100 [us]
bredis: drop 93396 0.00 501 512 531 560 588 610 656
bredis: header parse 92817 -0.62 504 515 535 563 589 626 677
bredis: full parse 85235 -8.74 544 560 583 613 643 672 746
redis benchmark 101672 8.86 1000
  • 100 connections / loopback / 30 seconds / pipeline size: 1
TPS Diff to drop [%] p0 [us] p10 [us] p50 [us] p90 [us] p99 [us] p99.9 [us] p99.99 [us] p100 [us]
bredis: drop 91197 0.00 1028 1060 1092 1133 1167 1197 1435
bredis: header parse 87045 -4.55 1069 1106 1145 1192 1232 1270 1368
bredis: full parse 82596 -9.43 1121 1171 1206 1253 1297 1329 1417
redis benchmark 99661 9.28 1000 1000
  • 200 connections / loopback / 30 seconds / pipeline size: 1
TPS Diff to drop [%] p0 [us] p10 [us] p50 [us] p90 [us] p99 [us] p99.9 [us] p99.99 [us] p100 [us]
bredis: drop 87715 0.00 2154 2231 2274 2332 2393 2507 2615
bredis: header parse 85981 -1.98 2181 2266 2321 2390 2456 2507 2579
bredis: full parse 81932 -6.59 2290 2387 2437 2495 2555 2647 2812
redis benchmark 95858 9.28 2000 2000
  • 400 connections / loopback / 30 seconds / pipeline size: 1
TPS Diff to drop [%] p0 [us] p10 [us] p50 [us] p90 [us] p99 [us] p99.9 [us] p99.99 [us] p100 [us]
bredis: drop 84914 0.00 4469 4632 4700 4798 4907 5070 6876
bredis: header parse 83089 -2.15 4518 4719 4802 4917 5052 5366 7928
bredis: full parse 81146 -4.44 4627 4823 4918 5042 5177 5470 7294
redis benchmark 94638 11.45 1000 3000 4000
  • 1 connection / loopback / 30 seconds / pipeline size: 32
TPS Diff to drop [%] p0 [us] p10 [us] p50 [us] p90 [us] p99 [us] p99.9 [us] p99.99 [us] p100 [us]
bredis: drop 305413 0.00 88 1450 1450 1450 1450 1450 1450
bredis: header parse 294051 -3.72 90 100 104 115 122 147 447
bredis: full parse 264363 -13.44 93 101 105 116 122 148 209
redis benchmark 360880 18.16 0
  • 10 connections / loopback / 30 seconds / pipeline size: 32
TPS Diff to drop [%] p0 [us] p10 [us] p50 [us] p90 [us] p99 [us] p99.9 [us] p99.99 [us] p100 [us]
bredis: drop 526654 0.00 358 908 908 908 908 908 908
bredis: header parse 527584 0.18 379 473 599 724 762 818 948
bredis: full parse 524862 -0.34 333 413 592 772 810 851 928
redis benchmark 519379 -1.38 1000 1000
  • 50 connections / loopback / 30 seconds / pipeline size: 32
TPS Diff to drop [%] p0 [us] p10 [us] p50 [us] p90 [us] p99 [us] p99.9 [us] p99.99 [us] p100 [us]
bredis: drop 532309 0.00 1392 4611 4611 4611 4611 4611 4611
bredis: header parse 529194 -0.59 1458 2377 3017 3667 3838 3929 4625
bredis: full parse 525472 -1.28 1339 2188 2994 3910 4141 4254 4964
redis benchmark 532453 0.03 1000 4000 4000
  • 100 connections / loopback / 30 seconds / pipeline size: 32
TPS Diff to drop [%] p0 [us] p10 [us] p50 [us] p90 [us] p99 [us] p99.9 [us] p99.99 [us] p100 [us]
bredis: drop 537409 0.00 3733 8504 8504 8504 8504 8504 8504
bredis: header parse 535626 -0.33 3615 4714 5950 7331 7713 7869 8535
bredis: full parse 532882 -0.84 2735 4340 5901 7730 8198 8406 9325
redis benchmark 536394 -0.19 6000 7000 8000
  • 200 connections / loopback / 30 seconds / pipeline size: 32
TPS Diff to drop [%] p0 [us] p10 [us] p50 [us] p90 [us] p99 [us] p99.9 [us] p99.99 [us] p100 [us]
bredis: drop 535314 0.00 8006 16723 16723 16723 16723 16723 16723
bredis: header parse 529886 -1.01 7775 9559 11874 14911 15559 16032 17020
bredis: full parse 532803 -0.47 6123 8648 11653 15613 16464 16941 18322
redis benchmark 537634 0.43 12000 14000 19000
  • 400 connections / loopback / 30 seconds / pipeline size: 32
TPS Diff to drop [%] p0 [us] p10 [us] p50 [us] p90 [us] p99 [us] p99.9 [us] p99.99 [us] p100 [us]
bredis: drop 521525 0.00 17603 34790 34790 34790 34790 34790 34790
bredis: header parse 518396 -0.60 16911 19850 24502 30372 32328 33816 36015
bredis: full parse 514251 -1.39 14166 17718 24240 32288 34516 35760 37683
redis benchmark 529913 1.61 4000 21000 23000 26000 27000 32000 40000 40000

Some Notes

When running the tests without pipelining and with a low number of connections, it is clearly observable that Redis CPU utilization stays under 90%, which lets the performance and efficiency of the benchmarking tools compete. With a higher number of connections or bigger pipelines, Redis CPU utilization reaches 100%. Given that, there is no real competition (or only a minimal one) between the benchmarking tools; it becomes more a matter of which tool is luckier in getting a faster response from Redis.

Maybe it'd be a good idea for the benchmarking tool to have a test that repeatedly reads the same key. Doing so would put that key into the cache and make Redis serve it in the fastest possible way. Finally, it could be even more advantageous to avoid real TCP sockets and use Unix domain sockets instead, which can result in much higher throughput and lower latency.

@basiliscos
Owner

@ovanes Thanks a lot for sharing the results. Let's keep this page, as it might be interesting to other people.

I also have a few ideas on how to improve performance even further.

@ovanes
Author

ovanes commented Apr 15, 2019

I put more thought into the test result interpretation...

@basiliscos
Owner

Yes, please go ahead.

In the current implementation, it performs double parsing: a first pass to determine the end of the expected reply (i.e. with the drop policy), and a second pass to deliver the reply to the client code.

I'm also interested in how you got the numbers for the redis benchmark rows.

@ovanes
Author

ovanes commented Apr 16, 2019

@basiliscos Unfortunately, I don't fully understand the question:

It also interesting, how you get the numbers for redis benchmark row.

IMO the redis-benchmark commands that were used are in the test description above:

# no pipelining -> adjust connection count
docker run -ti --rm --net=host --cap-add=sys_nice --ulimit rtprio=100  --cpuset-cpus 4,5 redis:5.0.4 redis-benchmark  -t get -r 1000000 -n 10000000 -h 127.0.0.1 -c 1 -P 1

# pipelining -> adjust connection count
docker run -ti --rm --net=host --cap-add=sys_nice --ulimit rtprio=100  --cpuset-cpus 4,5 redis:5.0.4 redis-benchmark  -t get -r 1000000 -n 10000000 -h 127.0.0.1 -c 1 -P 2

Just replace the values of the -c and -P parameters with the corresponding number of connections and commands per pipeline. redis-benchmark is a highly optimized tool; I had to do a lot of tweaking to get a close comparison (initially my tests performed about 30% to 40% worse). I might have a few more optimization ideas, but IMO they would improve TPS by only 1% to 2%.
