
Lobsters


This wiki page describes the steps required to run our Lobsters experiments from the paper (Figures 8 and 9).

Both of these experiments use the provided Lobsters harness, which is open-loop and multi-threaded. Our setup runs the harness on a separate Google Cloud instance, and we recommend the same setup to replicate our results. Running the harness on the same machine as the database servers is possible, but doing so will change the results and how they should be interpreted.

The Lobsters Harness

The provided Lobsters harness is adapted from Noria. The harness is designed to reflect the load and data distribution of Lobsters in the real world. The harness first primes the database according to the data distribution of Lobsters (Zipfian across users, i.e. some users have a lot of data while many users have little). The harness then issues a load of requests, where each request is a sequence of SQL queries corresponding to a complete endpoint (e.g. the queries to populate the home page, or to post a new story). The type of endpoint is sampled from a distribution reflecting the real-world usage of Lobsters (i.e. reads, like viewing the homepage, are more common than writes, like posting a new story). The harness makes concurrent requests from different worker threads.

The harness is controlled by several important parameters:

  1. Data scale (--datascale): a multiplicative factor indicating how much data to prime the database with, relative to the size of the Lobsters application at the time the harness was created (circa 2018). A larger value primes the database with more users, and proportionally more stories, comments, etc. The distribution of data per user is Zipfian.
  2. Request scale (--reqscale): a multiplicative factor indicating how many endpoint invocations / operations to issue per second after priming, relative to the actual load experienced by Lobsters in the real world. A larger value means more operations per second. Both flags are combined in the sketch below.
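
As a quick illustration, here is how the two flags combine in a harness invocation. This is only a sketch: the complete invocations, including the remaining flags and the connection string, appear under "Choosing The Request Load Parameter" below, and <DB_IP> / <REQ_SCALE> are placeholders as used there.

# Minimal sketch: prime with 2.59x the 2018 Lobsters data, then generate load
# at <REQ_SCALE> times the reference request rate. See the full commands later
# on this page for the remaining flags (--queries, --backend, --runtime, etc.).
bazel run -c opt //:lobsters-harness -- \
  --datascale 2.59 --reqscale <REQ_SCALE> <other flags> \
  "mysql://k9db:password@<DB_IP>:3306/lobsters"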

Lobsters Endpoints (Figure 8)

This experiment runs our Lobsters harness and measures the 50th and 95th percentile latencies grouped by endpoint. We configure this experiment with --datascale 2.59, which corresponds to 15k users, the size of production Lobsters as of December 2022.

We chose the request scale to be the largest value that the baseline can support given our instance's hardware resources. For our setup, this is 1000 (roughly 766 endpoints/sec, or about 10k individual queries/sec). This configuration should work well if you use the same Google Cloud instance type (n1-standard-16) that we use for both the database server and the harness instances. If you are running on a different setup, you may need to adjust this factor accordingly; we describe how to do so at the end of this page.

To run this experiment, you need to run the server script on the server machine and the load generation script on the load generation machine. If you are using only one machine, you can run each in a different terminal.

First, connect to the server machine where the experiment will run the MariaDB and K9db servers, and run our experiment server script.

# On server machine
cd <K9DB_DIR>
./experiments/scripts/lobsters-endpoints-server.sh
# Wait until you see K9db running before proceeding

Then, connect to the harness / load generation machine, and run our load generation script.

# On load generation machine
cd <K9DB_DIR>
./experiments/scripts/lobsters-endpoints-load.sh <ip of server machine>
# If using Google cloud, use the internal IP
# If using a single machine, you can use 127.0.0.1
# Wait until experiment is done. On our setup, this takes roughly 15-20 minutes.

After the experiment finishes, you will find the outputs on the load generation machine under <K9DB_DIR>/experiments/scripts/outputs/lobsters-endpoints/. Specifically, you will find a PDF file with the latency plot corresponding to Figure 8 in the paper, and a text file with the memory usage numbers corresponding to the memory overhead discussion in Section 8.1.1. The internal logs containing the experiment's measurements can be found in <K9DB_DIR>/experiments/scripts/logs/lobsters/.
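
For reference, you can list these locations directly to inspect the results, e.g.:

# On load generation machine
ls <K9DB_DIR>/experiments/scripts/outputs/lobsters-endpoints/  # Figure 8 PDF and memory text file
ls <K9DB_DIR>/experiments/scripts/logs/lobsters/               # raw harness measurement logs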

Lobsters Scaling (Figure 9)

This experiment demonstrates the performance of K9db as a function of the number of users (and amount of data) in the database. It runs the Lobsters harness with parameters similar to those above, but with an increasing data scale in each run.

You can run this experiment using:

# On server machine
./experiments/scripts/lobsters-scale-server.sh
# Wait until you see K9db running

# On load generation machine
./experiments/scripts/lobsters-scale-load.sh <ip of server machine>
# If using Google cloud, use the internal IP
# If using a single machine, use 127.0.0.1
# Wait until experiment is done. On our setup, this takes roughly 90 minutes.

After the experiment runs, it produces a PDF plot corresponding to Figure 9 under experiments/scripts/outputs/lobsters-scale/. You can view the internal logs showing the harness output (and any errors) for each data scale under experiments/scripts/logs/lobsters-scale/.

If you are running on a different setup, you will need to change the request scale parameter as in the previous experiment. A good starting point is the same value you used in the previous experiment.

On alternative setups, the larger data scales may be too aggressive and result in a very long priming stage (or crashes). You can change the data scale parameters in the server and load generation scripts as appropriate for your setup; the scripts and plotting will automatically adjust to your new parameters.
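
As a rough illustration only, the edit could look like the following. The variable name and values below are hypothetical; check your copy of the scripts for the actual name, and keep the server and load generation scripts consistent.

# In experiments/scripts/lobsters-scale-load.sh and lobsters-scale-server.sh
# Hypothetical variable name and illustrative values; adjust to your setup.
DATASCALES="1 2.59 5"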

Infrequently, the runs corresponding to the largest data scales may crash during priming. If this occurs, you can change the script to re-run those data scales. You can look at the internal logs to identify whether such a crash occurred. If you want the plotting script to skip one of the log files, rename that file so that it starts with an underscore (_).
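
For example, assuming the crashed run's log lives in the logs directory mentioned above (exact file names depend on the data scale), identifying and skipping it could look like:

# On load generation machine
cd <K9DB_DIR>/experiments/scripts/logs/lobsters-scale/
grep -il error *                          # one way to spot logs that report errors
mv <crashed run log> _<crashed run log>   # prefix with _ so plotting skips it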

Choosing The Request Load Parameter

One way to find the best request scale for your setup is to increase or decrease the request scale while checking the experiment logs for the achieved operations per second, and to choose the largest request scale for which the generated ops/s matches the target ops/s for the MariaDB baseline.

The same request scale should be used for K9db and unencrypted K9db to ensure an apples-to-apples comparison. In our experience, K9db can usually support a considerably larger request scale than the baseline, since it uses caching to handle expensive queries, which reduces the resources it needs to process endpoints.

In our experience, the same request scale works for both the endpoints and scaling experiments.

To find the best request scale for your setup, follow these steps:

First, ensure that MariaDB is running on the server machine (or the same machine if you are using only one), e.g. using sudo service mariadb start.

Second, prime the MariaDB database with the same data scale parameter as our endpoints experiment.

# On load generation machine
cd <K9DB_DIR>/experiments/lobsters
bazel run -c opt //:lobsters-harness -- \
  --runtime 0 --datascale 2.59 --queries pelton \
  --backend rocks-mariadb --prime --scale_everything \
  "mysql://k9db:password@<DB_IP>:3306/lobsters"
# Set <DB_IP> to the internal IP of the server Google Cloud instance, or
# 127.0.0.1 if you are running the harness on the same machine

Priming takes around 5 minutes on our setup. This step only inserts data into the database; it does not run any load against it afterwards.

Third, back up the primed database so that you can run multiple loads against it without having to re-prime it.

mysqldump -u k9db -ppassword --host=<DB_IP> lobsters > dump.sql

Now, you can run loads with different request scales (without having to re-prime the database every time). After every run, the harness displays the target number of endpoints per second required to meet the request scale parameter, and the actual number of endpoints per second that the server was able to process. If the request scale is too aggressive for your setup, the generated ops/second will be significantly lower than the target. Thus, you can start with --reqscale 1000 and decrease it as needed until you find the maximum load at which the setup keeps up with the target.

# Run load with some request scale.
bazel run -c opt //:lobsters-harness -- \
  --runtime 30 --datascale 2.59 --reqscale <REQ_SCALE> --queries pelton \
  --backend rocks-mariadb --scale_everything \
  "mysql://k9db:password@<DB_IP>:3306/lobsters"

# Reload primed database backup
mariadb -u k9db -ppassword --host=<DB_IP> lobsters < dump.sql

# You can repeat these two steps to run loads with different request scales.
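
If you prefer to script this repetition, a sketch like the following (run from <K9DB_DIR>/experiments/lobsters as above) tries several request scales back to back, reusing the dump.sql backup and the <DB_IP> placeholder from above; adjust the list of scales to your bisection.

# Sketch: try several request scales in one go, restoring the primed data
# between runs.
for RS in 1000 500 250; do
  bazel run -c opt //:lobsters-harness -- \
    --runtime 30 --datascale 2.59 --reqscale $RS --queries pelton \
    --backend rocks-mariadb --scale_everything \
    "mysql://k9db:password@<DB_IP>:3306/lobsters"
  mariadb -u k9db -ppassword --host=<DB_IP> lobsters < dump.sql
done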

For example, on a local machine, I first run a load with request scale = 1000, which turns out to be too aggressive:

# target ops/s: 766.67
# generated ops/s: 304.24
# dropped requests: 0

Then, I run it with request scale = 100, which my local machine can keep up with:

# target ops/s: 76.67
# generated ops/s: 74.33
# dropped requests: 0

I make additional runs, bisecting the request scale, until I find that request scale = 500 is the largest value my local machine can keep up with. Thus, on my local machine, I need to modify the Lobsters endpoints and scale load generation scripts to use RS=500:

# target ops/s: 383.33
# generated ops/s: 382.94
# dropped requests: 0
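
Assuming the load generation scripts expose the request scale through the RS variable referred to above (check your copy of the scripts for the exact name), the change is along the lines of:

# In experiments/scripts/lobsters-endpoints-load.sh and lobsters-scale-load.sh
RS=500   # largest request scale this setup keeps up with, found by bisection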

Other Parameters

The harness creates many threads for priming and load generation, depending on how many cores are available on your system. This is not an issue during load generation, as the request load controls how many requests these threads issue per second. However, during priming, the harness continuously issues requests from every thread as fast as possible.

If your system has a slow disk relative to its number of cores or CPU power, this may result in errors during priming, especially with the MariaDB baseline, as the system may be overwhelmed with requests, causing various requests and locks to time out. In such a case, you can modify the INFLIGHT variable in experiments/scripts/lobsters-endpoints-load.sh or experiments/scripts/lobsters-scale-load.sh to limit how many parallel requests are made during priming.
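
For example, the change might look like the following; the value is only an illustration, so pick the largest number of in-flight priming requests your disk can sustain.

# In experiments/scripts/lobsters-endpoints-load.sh (or lobsters-scale-load.sh)
INFLIGHT=16   # illustrative value; caps concurrent requests during priming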