Introduce reusable query buffer for client reads #337

Open · wants to merge 20 commits into base: unstable

Conversation


@sundb sundb commented Aug 22, 2024

This PR is based on the commits from PR valkey-io/valkey#258, valkey-io/valkey#593, valkey-io/valkey#639

This PR optimizes client query buffer handling in Redis by introducing
a reusable query buffer that is used by default for client reads. This
reduces memory usage by ~20KB per client by avoiding allocations for
most clients using short (<16KB) complete commands. For larger or
partial commands, the client still gets its own private buffer.

The primary changes are:

  • Adding a reusable query buffer thread_shared_qb that clients use by default.
  • Modifying client querybuf initialization and reset logic.
  • Freeing idle clients' private query buffers when they are empty, so those clients fall back to the reusable query buffer.
  • Keeping master client query buffers private, since their contents must be preserved for the replication stream.
  • When nested commands are executed, only the first client uses the reusable buffer; subsequent clients still use private buffers. (A rough sketch of the borrow path follows below.)
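
Below is a minimal sketch of the borrow path referenced in the last bullet. It assumes the thread-local variables and the CLIENT_SHARED_QUERYBUFFER flag that appear in the diffs further down; the helper name setupClientQueryBuf and the simplified control flow are illustrative, not the PR's actual code.

```c
/* Sketch only: simplified borrow logic, relying on Redis internals from
 * server.h (client, sds, PROTO_IOBUF_LEN, CLIENT_MASTER). */
#include "server.h"

__thread sds thread_shared_qb = NULL;   /* per-thread reusable query buffer */
__thread int thread_shared_qb_used = 0; /* set while a client is borrowing it */

/* Called before reading from the socket: make sure the client has a buffer. */
static void setupClientQueryBuf(client *c) {
    if (c->querybuf) return; /* client already owns a private buffer */
    if (!(c->flags & CLIENT_MASTER) && !thread_shared_qb_used) {
        /* Lazily create the ~16KB reusable buffer and lend it to this client. */
        if (!thread_shared_qb) {
            thread_shared_qb = sdsnewlen(NULL, PROTO_IOBUF_LEN);
            sdsclear(thread_shared_qb);
        }
        c->querybuf = thread_shared_qb;
        c->flags |= CLIENT_SHARED_QUERYBUFFER;
        thread_shared_qb_used = 1;
    } else {
        /* Master clients, and clients reading while the buffer is already
         * borrowed (nested execution), fall back to a private buffer. */
        c->querybuf = sdsempty();
    }
}
```

The point of the design is that the common case, a regular client issuing short complete commands, never allocates a private query buffer at all.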

In addition to the memory savings, this change shows a 3% improvement in
latency and throughput when running with 1000 active clients.

The memory reduction may also help reduce the need to evict clients when
reaching max memory limit, as the query buffer is the main memory
consumer per client.

This PR differs from valkey-io/valkey#258 in two ways:

  1. While a client holds the reusable buffer (between acquiring and returning it), we never update or replace the reusable query buffer mid-stream, regardless of whether the client's query buffer has changed (e.g. been expanded). Only at the end do we either let the client keep the buffer (if it was expanded or still has data remaining) or reset it for the next user (sketched below).
  2. Adding a new thread-local variable thread_shared_qb_used to prevent multiple clients from acquiring the reusable query buffer at the same time.
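
A rough sketch of point 1: the end-of-read decision. The condition mirrors the resetSharedQueryBuf() hunk quoted later in this review; the function name returnReusableQueryBuf and the surrounding details are assumptions made for illustration.

```c
/* Sketch: run once the current read/parse cycle for the borrowing client is
 * done. Nothing touches the reusable buffer before this point. */
static void returnReusableQueryBuf(client *c) {
    if (!(c->flags & CLIENT_SHARED_QUERYBUFFER)) return;
    if (c->querybuf != thread_shared_qb || sdslen(c->querybuf) > c->qb_pos) {
        /* The buffer was reallocated (moved) or still holds unprocessed data:
         * the client keeps it as its private buffer, and the thread-local slot
         * is forgotten so a fresh reusable buffer is created on next use. */
        thread_shared_qb = NULL;
    } else {
        /* Fully consumed: clear it and keep it around for the next client. */
        sdsclear(thread_shared_qb);
        c->querybuf = NULL;
    }
    c->flags &= ~CLIENT_SHARED_QUERYBUFFER;
    thread_shared_qb_used = 0;
}
```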

Signed-off-by: Uri Yagelnik [email protected]
Signed-off-by: Madelyn Olson [email protected]
Co-authored-by: Uri Yagelnik [email protected]
Co-authored-by: Madelyn Olson [email protected]

uriyage and others added 2 commits August 20, 2024 15:14
This PR optimizes client query buffer handling in Valkey by introducing
a shared query buffer that is used by default for client reads. This
reduces memory usage by ~20KB per client by avoiding allocations for
most clients using short (<16KB) complete commands. For larger or
partial commands, the client still gets its own private buffer.

The primary changes are:

* Adding a shared query buffer `shared_qb` that clients use by default
* Modifying client querybuf initialization and reset logic
* Copying any partial query from shared to private buffer before command
execution
* Freeing idle client query buffers when empty to allow reuse of shared
buffer
* Master client query buffers are kept private as their contents need to
be preserved for replication stream

In addition to the memory savings, this change shows a 3% improvement in
latency and throughput when running with 1000 active clients.

The memory reduction may also help reduce the need to evict clients when
reaching max memory limit, as the query buffer is the main memory
consumer per client.

---------

Signed-off-by: Uri Yagelnik <[email protected]>
Signed-off-by: Madelyn Olson <[email protected]>
Co-authored-by: Madelyn Olson <[email protected]>
redis#593)

Test `query buffer resized correctly` started to fail
(https://github.com/valkey-io/valkey/actions/runs/9278013807) with
non-jemalloc allocators after the valkey-io/valkey#258 PR.

With jemalloc we allocate ~20KB for the query buffer. In the test we read
1 byte in the first read; in the second read we make sure we have at
least 16KB of free space in the query buffer, which holds because jemalloc
allocated 20KB. But with non-jemalloc allocators the first read allocates
exactly 16KB, so in the second read we find we don't have 16KB of free
space (we already read 1 byte) and reallocate, this time greedily
(2x the requested size of 16KB+1), so the test condition that the
querybuf size is < 32KB is no longer true.

The `query buffer resized correctly` test starts
[failing](https://github.com/valkey-io/valkey/actions/runs/9278013807)
with non-jemalloc allocators after PR #258.

With jemalloc, we allocate ~20KB for the query buffer. In the test, we
read 1 byte initially and then ensure there is at least 16KB of free
space in the buffer for the second read, which is satisfied by
jemalloc's 20KB allocation. However, with non-jemalloc allocators, the
first read allocates exactly 16KB. When we check again, we don't have
16KB free due to the 1 byte already read. This triggers a greedy
reallocation (doubling the requested size of 16KB+1), causing the query
buffer size to exceed the 32KB limit, thus failing the test condition.

This PR adjusts the test's query buffer upper limit to 32KB + 2.
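
For reference, the 32KB + 2 figure falls out of the sds greedy-growth rule (sdsMakeRoomFor doubles the new length while it stays below SDS_MAX_PREALLOC, which is 1MB). The snippet below is a standalone arithmetic illustration, not Redis code:

```c
#include <stdio.h>

int main(void) {
    size_t max_prealloc = 1024 * 1024;      /* SDS_MAX_PREALLOC */
    size_t len = 1;                         /* 1 byte already in the buffer */
    size_t addlen = 16 * 1024;              /* we ask for 16KB of free space */
    size_t newlen = len + addlen;           /* 16KB + 1 */
    if (newlen < max_prealloc) newlen *= 2; /* greedy doubling -> 32KB + 2 */
    printf("grown buffer length: %zu bytes\n", newlen); /* prints 32770 */
    return 0;
}
```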

Signed-off-by: Uri Yagelnik <[email protected]>

@oranagra oranagra left a comment

did you look for any followup commits fixing bugs introduced by the one you cherry picked? maybe comparing these areas in the code with the latest branch?

src/networking.c Outdated
Comment on lines 30 to 32
__thread sds thread_shared_qb = NULL;
__thread int thread_shared_qb_used = 0; /* Avoid multiple clients using shared query
* buffer due to nested command execution. */

only now (when looking at the code for the first time) i realize the buffer isn't shared. it's re-usable.
i.e. not serving multiple clients at the same time.

that said, i suppose we won't want to rename it...

Owner Author

yeah, it's more of a public query buffer.
do you mean it should be thread_querybuffer or thread_querybuffer_reusable?

in my eyes it's a bad term for this purpose. i'd think that a shared buffer is one that is used by multiple entities at the same time. i'd just replace the term "shared" with "reusable", in both variable names and comments.
but considering we might wanna cherry pick later fixes from valkey, this rename might be unproductive.
so i'd be ok with keeping the name and just editing the comment that describes it.
your call.

src/networking.c Outdated
* and a new empty buffer will be allocated for the shared buffer. */
static void resetSharedQueryBuf(client *c) {
serverAssert(c->flags & CLIENT_SHARED_QUERYBUFFER);
if (c->querybuf != thread_shared_qb || sdslen(c->querybuf) > c->qb_pos) {

c->querybuf != thread_shared_qb how is this possible?
and if it is, do we really want to do thread_shared_qb = NULL?

Owner Author

c->querybuf != thread_shared_qb how is this possible?

c->querybuf may be expanded in processMultibulkBuffer().
https://github.com/redis/redis/blob/60f22ca830c59a630b4156b112f5e73ce75adc64/src/networking.c#L2401

and if it is, do we really want to do thread_shared_qb = NULL?

this means that c->querybuf has taken ownership of the buffer: the old pointer in thread_shared_qb is no longer valid, and we need to reset it to NULL so that a new reusable buffer can be created the next time one is needed.
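
To spell that out: the inequality can appear simply because growing an sds string may move it. A tiny illustrative sketch (the helper name growForBulkArg is hypothetical; the real expansion happens inside processMultibulkBuffer()):

```c
/* c->querybuf initially aliases thread_shared_qb. Growing it for a large
 * bulk argument can relocate the allocation, after which the two pointers
 * differ and the client has effectively taken ownership of the buffer. */
static void growForBulkArg(client *c, long long bulklen) {
    size_t needed = (size_t)bulklen + 2; /* payload + trailing CRLF */
    if (sdslen(c->querybuf) < needed) {
        /* sdsMakeRoomFor() may realloc and return a different pointer. */
        c->querybuf = sdsMakeRoomFor(c->querybuf, needed - sdslen(c->querybuf));
    }
}
```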

@@ -2674,6 +2701,7 @@ void readQueryFromClient(connection *conn) {
if (c->reqtype == PROTO_REQ_MULTIBULK && c->multibulklen && c->bulklen != -1
&& c->bulklen >= PROTO_MBULK_BIG_ARG)
{
if (!c->querybuf) c->querybuf = sdsempty();

maybe add a comment that we don't reuse the shared buffer here because we aim for the big arg optimization? or do you think it's clear from the context above?

Owner Author

done with 3ebb1b3 (#337)
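
For readers following along, the gist of that clarification (paraphrased here, not the literal comment from commit 3ebb1b3) is that this path deliberately uses a private buffer so the large bulk argument can be accumulated there:

```c
if (c->reqtype == PROTO_REQ_MULTIBULK && c->multibulklen && c->bulklen != -1
    && c->bulklen >= PROTO_MBULK_BIG_ARG)
{
    /* Big-arg optimization: read the large bulk into a private buffer (which
     * can later back the argument without extra copying), so the reusable
     * query buffer is intentionally not borrowed here. */
    if (!c->querybuf) c->querybuf = sdsempty();
    /* ... read sizing continues as before ... */
}
```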

@@ -63,7 +104,7 @@ start_server {tags {"querybuf slow"}} {
# Write something smaller, so query buf peak can shrink
$rd set x [string repeat A 100]
set new_test_client_qbuf [client_query_buffer test_client]
if {$new_test_client_qbuf < $orig_test_client_qbuf} { break }
if {$new_test_client_qbuf < $orig_test_client_qbuf && $new_test_client_qbuf > 0} { break }

was there a race condition here?

Owner Author

this is a cherry-pick mistake from Valkey's unstable branch.

@@ -78,6 +119,11 @@ start_server {tags {"querybuf slow"}} {
$rd write "*3\r\n\$3\r\nset\r\n\$1\r\na\r\n\$1000000\r\n"
$rd flush

after 200

we need to wait for redis to read that incomplete command? maybe we'd better use wait_for to avoid timing issues.

same in theory applies for the after 20 below, but i'm not sure we can detect it.
maybe by looking at the argv-mem and qbuf fields in CLIENT LIST?

Owner Author

we need to wait for redis to read that incomplete command? maybe we better use wait_for to avoid timing issues.

fixed in e5a4a67 (#337).

same in theory applies for the after 20 below, but i'm not sure we can detect it. maybe by looking at the argv-mem and qbuf fields in CLIENT LIST?

IIRC, the after 20 is used to verify that the next client cron doesn't shrink the client's query buffer.
in e5a4a67 (#337), we turn the cron back on before the after call, so after 120 should be more reasonable.

well, that can still have a race condition. if we want to wait for the cron, we could maybe use a wait_for on some new tick metric, but we don't do that elsewhere. i suppose 120 is good.

Owner Author

yes, but in theory it's hard for it to run for more than 2 seconds; I'll check it in the daily CI.

no need. if we ever see it fail we'll adjust. we have similar after 120 waits in other places.

test "Client executes small argv commands using shared query buffer" {
set rd [redis_deferring_client]
$rd client setname test_client
set res [r client list]

we have no guarantee that the previous command (setname) was run (missing rd read)

Owner Author

missing rd read, fixed it.

sundb and others added 2 commits August 26, 2024 13:05
Co-authored-by: oranagra <[email protected]>

sundb commented Aug 26, 2024

did you look for any followup commits fixing bugs introduced by the one you cherry picked? maybe comparing these areas in the code with the latest branch?

OK, i'll double check it.

We've been seeing some pretty consistent failures from
`test-valgrind-test` and `test-sanitizer-address` because of the
querybuf test periodically failing. I tracked it down to the test
periodically taking too long and the client cron getting triggered. A
simple solution is to just disable the cron during the key race
condition. I was able to run this locally for 100 iterations without
seeing a failure.

Example:
https://github.com/valkey-io/valkey/actions/runs/9474458354/job/26104103514
and
https://github.com/valkey-io/valkey/actions/runs/9474458354/job/26104106830.

Signed-off-by: Madelyn Olson <[email protected]>
@sundb sundb changed the title Introduce shared query buffer for client reads Introduce reusable query buffer for client reads Aug 26, 2024
src/networking.c Outdated
@@ -28,7 +28,7 @@ int postponeClientRead(client *c);
char *getClientSockname(client *c);
int ProcessingEventsWhileBlocked = 0; /* See processEventsWhileBlocked(). */
__thread sds thread_shared_qb = NULL;
__thread int thread_shared_qb_used = 0; /* Avoid multiple clients using shared query
__thread int thread_shared_qb_used = 0; /* Avoid multiple clients using reusable query

you renamed everything, but didn't rename the variable

Owner Author

@sundb sundb Aug 26, 2024

🙃 i only updated the name in the comments and in the resetReusableQueryBuf() method (which I added), so it looks like almost everything has changed.

ok, it looked like you changed a lot of lines for that (which could cause a lot of conflicts).
i thought i was suggesting just editing one comment.
but i didn't look at the original code. do what you think is best.

Owner Author

i actually reworked a pretty big chunk of code; it's now much more efficient and readable.

ok, so feel free to rename the variable too
