
[live-data-fetcher] Distribute read load across objects in S3 #817

Merged: 30 commits into reddit:develop on Nov 30, 2023

Conversation

SiddharthManoj
Contributor

@SiddharthManoj SiddharthManoj commented Nov 30, 2023

See Experiments - S3 Prefix Sharding Design Doc for more info.

See #incident-3788--slower-experiment-changes-due-to-s3-rate-limiting for full incident context.

Problem (as copied from doc)
During the week of November 27th, we started to observe that the rate of requests for experiment config data to the S3 bucket was breaching the S3 rate limit of 5500 read requests per second (per prefix). The main trigger condition is when experiments are changed, the manifest is uploaded (every X min), and a ZooKeeper Data Watch fires. This subsequently triggers all service pods to refetch a copy of the new experiment config data (with a retry policy of 5 attempts).

When all pods attempt to fetch the same S3 object at once, we breach the 5500 QPS limit set by AWS, sending a large number of pods into a retry state (up to 5 retries).

These retries use boto3's built-in exponential backoff strategy with a factor of 2 (with jitter). Ultimately we observed that while a retrying pod may not receive the experiment config on the 1st try, 99% of pods receive it by at most the 5th try. However, 1% of pods do not get it even on the 5th retry and stop retrying, so these pods do not receive the most up-to-date experiment data until another manifest upload and ZK data watch is triggered.
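
For reference, a minimal sketch of how those retries are configured on the client side (illustrative only; the real fetcher's client setup appears in the diff further down, and the bucket/key names here are placeholders):

import boto3
from botocore.config import Config

# boto3's default ("legacy") retry mode backs off exponentially with a growth
# factor of 2 and jitter between attempts. total_max_attempts caps the total
# number of attempts (the initial request plus retries).
s3_client = boto3.client(
    "s3",
    config=Config(retries={"mode": "legacy", "total_max_attempts": 5}),
)

# Placeholder bucket/key: a pod fetching the manifest issues a read like this,
# and botocore retries it automatically on throttled/5xx responses.
response = s3_client.get_object(Bucket="example-bucket", Key="example-manifest.json")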

Solution

The experiments service will upload multiple copies of the manifest, each prefixed numerically. See: https://github.snooguts.net/reddit/reddit-service-experiment-config/pull/3616

In the baseplate sidecar, we read the num_file_shards key from ZK and randomly generate a number from 1 to num_file_shards. We prepend that number and the delimiter ("/") to the file name and then attempt to fetch that object from S3.
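
A minimal sketch of that key selection (hypothetical, not the actual helper: the signature is an assumption, and the range handling follows the unit test's setUp() shown below, where shard 0 keeps the original, unprefixed key):

import random
from typing import Optional


def _generate_sharded_file_key(num_file_shards: Optional[int], original_file_key: str) -> str:
    # With num_file_shards of None, 0, or 1 there is nothing to shard, so the
    # original key is returned unchanged (matching the old-sidecar behavior).
    if num_file_shards is None or num_file_shards <= 1:
        return original_file_key
    # Assumption: shard 0 maps to the original key and shards 1..num_file_shards-1
    # map to numerically prefixed copies, mirroring the test's upload loop.
    file_shard_num = random.randrange(num_file_shards)
    if file_shard_num == 0:
        return original_file_key
    return str(file_shard_num) + "/" + original_file_key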

Benefits

By prefixing the files in this way, we make use of S3's built-in prefix partitioning for read performance. This means that pod reads can be distributed among the files (random load balancing).

See the AWS Design Practices doc for more info.

Is this backwards compatible with older versions of the sidecar?

Yes. Old versions of the sidecar won't know about the num_file_shards key in ZK and will just pull down the original file provided in the ZK node (as they do today).

Only new updated versions will attempt to prepend a prefix.

Old versions will always pull a file that looks like:

2023-10-31T16:13:18Z_3437442509_29cd4048e52252cb_experiment-config.json (original file)

New versions will pull files that look like:

1/2023-10-31T16:13:18Z_3437442509_29cd4048e52252cb_experiment-config.json

2/2023-10-31T16:13:18Z_3437442509_29cd4048e52252cb_experiment-config.json

3/2023-10-31T16:13:18Z_3437442509_29cd4048e52252cb_experiment-config.json

4/2023-10-31T16:13:18Z_3437442509_29cd4048e52252cb_experiment-config.json

@SiddharthManoj SiddharthManoj changed the title [live-data-fetcher] Distribute read load across objects in S3 if num_file_shards key is present [live-data-fetcher] Distribute read load across objects in S3 Nov 30, 2023
Comment on lines +31 to +45
default_file_key = "test_file_key"
for file_shard_num in range(NUM_FILE_SHARDS):
if file_shard_num == 0:
# The first copy should just be the original file.
sharded_file_key = default_file_key
else:
# All other copies should include the sharded prefix.
sharded_file_key = str(file_shard_num) + "/" + default_file_key
s3_client.put_object(
Bucket=bucket_name,
Key=sharded_file_key,
Body=json.dumps(s3_data).encode(),
SSECustomerKey="test_decryption_key",
SSECustomerAlgorithm="AES256",
)
Contributor Author

For the unit test, we can upload all the copies of the file in setUp().

And then in each unit test, we can optionally provide the num_file_shards key in the ZK content. If the key is missing, we should expect the unit test to still pass since it should fetch the original file with no prefix.

If the key is present, we should also expect it to pass by fetching one of the prefixed files.
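
A rough sketch of what those two cases could look like when exercising the helper directly (illustrative only; the class name and assertions are assumptions, and the helper signature matches the sketch above rather than the actual code):

import unittest

from baseplate.sidecars.live_data_watcher import _generate_sharded_file_key


class GenerateShardedFileKeyTests(unittest.TestCase):
    def test_missing_sharding_key_uses_original_file(self):
        # No usable num_file_shards in the ZK content: the original, unprefixed key is used.
        for no_sharding_value in (None, 0, 1):
            self.assertEqual(
                _generate_sharded_file_key(no_sharding_value, "test_file_key"),
                "test_file_key",
            )

    def test_sharding_key_uses_original_or_prefixed_file(self):
        # num_file_shards=5: the key should be the original file or one of the
        # numerically prefixed copies uploaded in setUp().
        expected_keys = {"test_file_key"} | {f"{i}/test_file_key" for i in range(1, 5)}
        for _ in range(20):
            self.assertIn(_generate_sharded_file_key(5, "test_file_key"), expected_keys)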

Comment on lines 73 to 77
# For safe measure, run this 20 times. It should succeed every time.
# We've uploaded 5 files to S3 in setUp() and num_file_shards=5 in the
# ZK node so we should be fetching one of these 5 files randomly (and successfully)
# and all should have the same content.
for i in range(20):
Contributor Author

@SiddharthManoj SiddharthManoj Nov 30, 2023


I couldn't think of a better way to do this than to run on_change() X number of times. This is because the only output of the function being unit tested is the contents of the manifest, so functionally that's all I can test.

on_change() has the random number generation logic internally.

By running on_change() that many times, I would expect different random numbers to be generated at least once, so we would pull from one of the 5 files uploaded in setUp() randomly, and I would expect it to succeed and have the same contents each time. (With 20 runs across 5 shards, the chance that any particular shard is never selected is (4/5)^20, roughly 1%, so the loop almost certainly exercises several different shards.)

@SiddharthManoj SiddharthManoj marked this pull request as ready for review November 30, 2023 15:22
@SiddharthManoj SiddharthManoj requested a review from a team as a code owner November 30, 2023 15:22
@SiddharthManoj
Contributor Author

@areitz Just wanted to get the PR out there. We can still wait for the doc to be reviewed first, and the Experiments Service PR has to go in first: https://github.snooguts.net/reddit/reddit-service-experiment-config/pull/3616

@@ -11,8 +11,11 @@

from moto import mock_s3

from baseplate.sidecars.live_data_watcher import _generate_sharded_file_key
Contributor Author

@SiddharthManoj SiddharthManoj Nov 30, 2023


I know _generate_sharded_file_key has an underscore prefix, but in reality we only use it internally.

I just need to call it here as part of the unit test.


@areitz areitz left a comment


This looks super awesome, thanks!

I had a few minor questions, nothing blocking really. I can approve without code changes once I understand the answers to my questions.

baseplate/sidecars/live_data_watcher.py (review thread, resolved)
tests/unit/sidecars/live_data_watcher_tests.py (review thread, resolved, outdated)
expected_sharded_file_key = "test_file_key"
self.assertEqual(actual_sharded_file_key, expected_sharded_file_key)
possible_no_sharding_values = [0, 1, None]


Nice!


@mrlevitas mrlevitas left a comment


One suggestion for how to make the random part of the tests deterministic.
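
For reference, one common way to do that (purely illustrative, and not necessarily the suggestion from the resolved thread; this assumes the helper uses random.randrange internally, as in the sketch earlier):

from unittest import mock

from baseplate.sidecars.live_data_watcher import _generate_sharded_file_key

# Pinning the random draw makes the selected shard predictable, so a test can
# assert on an exact S3 key instead of looping over many random runs.
with mock.patch("baseplate.sidecars.live_data_watcher.random.randrange", return_value=3):
    assert _generate_sharded_file_key(5, "test_file_key") == "3/test_file_key"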

baseplate/sidecars/live_data_watcher.py (review thread, resolved, outdated)
Comment on lines +170 to 175
# Default # of retries in legacy mode (current mode) is 5.
s3_client = boto3.client(
    "s3",
-    config=Config(signature_version=UNSIGNED, retries={"total_max_attempts": 10}),
+    config=Config(signature_version=UNSIGNED),
    region_name=region_name,
)
Contributor Author


Per https://reddit.slack.com/archives/C067Q6N14LA/p1701373678708859?thread_ts=1701370864.667409&cid=C067Q6N14LA we want to revert the change made in PR 813, where we increased the retry total_max_attempts from 5 to 10.

cc @areitz @jerroydmoore

@SiddharthManoj SiddharthManoj merged commit c132681 into reddit:develop Nov 30, 2023
3 checks passed
areitz pushed a commit that referenced this pull request Nov 30, 2023
* Add logic to distribute read load across different files in S3

* lint

* add test for live data watcher

* intentionally fail test to verify

* intentionally fail test to verify

* remove logs

* verify this fails

* and it does

* change range

* change range

* change range

* comments

* comments

* use function

* move more things into helper

* fix unit test

* fix unit test

* rename unit test

* rename unit test

* import order

* add logging since a test failed

* import order

* move logger up

* oops, the delimiter should have been / not _

* add 0 and 1 to no sharding unit test

* clean up python

* clean up python

* comments

* add negative values to test

* revert retries from 10 -> 5

---------

Co-authored-by: Siddharth Manoj <[email protected]>