
Lobsters


This wiki page lists the steps required to run our Lobsters experiments from the paper (figures 8 and 9).

Both of these experiments make use of the provided Lobsters harness, which is open-loop and multi-threaded. Our setup runs the harness on a separate Google Cloud instance, and we recommend the same setup to replicate our results. Running the harness on the same machine as the database servers is possible, but it will change the results and how they should be interpreted.

The Lobsters Harness

The provided Lobsters harness is adapted from Noria. The harness is designed to reflect the load and data distribution of Lobsters in the real world. The harness first primes the database according to the data distribution of Lobsters (Zipfian over users, i.e., some users have a lot of data while many users have little). The harness then issues a load of requests, where each request is a sequence of SQL queries corresponding to a complete endpoint (e.g., the queries to populate the home page, or to post a new story). The type of endpoint is sampled from a distribution reflecting the real-world usage of Lobsters (i.e., reads, like viewing the homepage, are more common than writes, like posting a new story). The harness makes concurrent requests from different worker threads.

The harness is controlled by several important parameters:

  1. Data scale (--datascale): a multiplicative factor indicating how much data to prime the database with, relative to the size of the Lobsters application at the time the harness was created (circa 2018). A larger value means the database is primed with more users, and proportionally more stories, comments, etc. The distribution of data per user is Zipfian.
  2. Request scale (--reqscale): a multiplicative factor indicating how many endpoint invocations / operations to issue per second after priming, relative to the actual load experienced by Lobsters in the real world. A larger value means more operations per second (see the illustrative invocation after this list).
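
The two parameters combine into an invocation along the lines of the sketch below. This is illustrative only: the harness binary name and the remaining options are assumptions, and in practice the experiment scripts described below pass these flags for you.

# Illustrative sketch only: the binary name and <other harness options> are
# placeholders; the provided experiment scripts set --datascale and --reqscale.
./lobsters-harness --datascale 2.59 --reqscale 1000 <other harness options>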

Lobsters Endpoints (Figure 8)

This experiment runs our Lobsters harness and measures the 50th and 95th percentile latencies grouped by endpoint. We configure this experiment with --datascale = 2.59, which corresponds to 15k users, the size of production Lobsters as of December 2022.

We chose the request scale to be the largest value that the baseline can support given our instance hardware resources. For our setup, this is 1000 (roughly 766 endpoints/sec, or about 10k individual queries/sec). This configuration should work if you use the same Google Cloud instance type (n1-standard-16) that we use for both the database server and the harness instance. If you are running on a different setup, you may need to adjust this factor accordingly.

One way to find the best request scale for your setup is to increase or decrease the request scale, check the experiment logs to see the achieved operations per second, and choose the largest request scale such that the generated ops/s matches the target ops/s for the MariaDB baseline.

The same request scale should be used for K9db and unencrypted K9db to ensure an apples-to-apples comparison. In our experience, K9db can usually support a considerably larger request scale than the baseline, since it uses caching to handle expensive queries, which reduces the resources it needs to process endpoints.

To run this experiment, you first need to prepare your machines using our experiments instructions. In the following instructions, we assume that you are using the same Google Cloud setup we recommend. However, you can follow these instructions on a different setup provided you adjust the request scale.
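
If you are provisioning the Google Cloud instances yourself, the sketch below shows one way to create an n1-standard-16 machine. The instance name, zone, and image here are placeholders, not necessarily the exact configuration from our experiments instructions; repeat for both the server and the load generation machine.

# Hedged example: instance name, zone, and image are placeholders.
gcloud compute instances create <instance name> \
  --machine-type=n1-standard-16 \
  --zone=<zone> \
  --image-family=ubuntu-2004-lts \
  --image-project=ubuntu-os-cloud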

First, connect to the server machine where the experiment will run the MariaDB and K9db servers, and run our experiment server script.

# On server machine
cd <K9DB_DIR>
./experiments/scripts/lobsters-endpoints-server.sh
# Wait until you see K9db running before proceeding

Then, connect to the harness / load generation machine, and run our load generation script.

# On load generation machine
cd <K9DB_DIR>
./experiments/scripts/lobsters-endpoints-load.sh <ip of server machine>
# If using Google Cloud, use the internal IP
# Wait until experiment is done. On our setup, this takes roughly 15-20 minutes.
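
Because the run takes a while, you may want to keep it alive across SSH disconnects. One option (purely a convenience, not part of our scripts) is to wrap the load script in nohup:

# Optional: keep the run alive if your SSH session drops.
nohup ./experiments/scripts/lobsters-endpoints-load.sh <ip of server machine> > load.log 2>&1 &
tail -f load.log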

After the experiment finishes running, you will find the outputs on the load generation machine under <K9DB_DIR>/experiments/scripts/outputs/lobsters-endpoints/. Specifically, you will find a pdf file with the plot of the latencies corresponding to figure 8 in the paper, and a txt file with the memory usage numbers corresponding to the memory overhead discussion in section 8.1.1.
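
To view the plot locally, you can copy the outputs off the load generation instance, for example with gcloud. The instance name and zone are placeholders for your own setup.

# Copy the figure 8 plot and memory numbers to your local machine.
gcloud compute scp --zone=<zone> \
  "<load generation instance>:<K9DB_DIR>/experiments/scripts/outputs/lobsters-endpoints/*" .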

The internal logs of the experiment can be found at <K9DB_DIR>/experiments/scripts/logs/lobsters/. Specifically, you can find baseline.out there, which contains the harness measurements for the run against the MariaDB baseline, including the target and generated ops/sec rates that you can use to find the appropriate request scale for your setup.
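
For example, you can eyeball these rates with a quick grep. The exact wording of the harness log lines is an assumption here, so adjust the pattern if it does not match:

# Look for the target and generated ops/sec reported by the harness.
grep -i "ops" <K9DB_DIR>/experiments/scripts/logs/lobsters/baseline.out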

Lobsters Scaling (Figure 9)

This experiment demonstrates the performance of K9db as a function of the number of users (and data) in the database. It runs the Lobsters harness with parameters similar to those above, but with an increasing data scale on each run.

You can run this experiment using:

# On server machine
./experiments/scripts/lobsters-scale-server.sh
# Wait until you see K9db running
# On load generation machine
./experiments/scripts/lobsters-scale-load.sh <ip of server machine>
# If using Google Cloud, use the internal IP
# Wait until experiment is done. On our setup, this takes roughly 90 minutes.

After the experiment finishes, it will produce a pdf plot corresponding to figure 9 under experiments/scripts/outputs/lobsters-scale/. You can view the internal logs showing the harness output (and any errors) for each data scale under experiments/scripts/logs/lobsters-scale/.

If you are running on a different setup, you will need to adjust the request scale parameter as in the previous experiment. A good starting point is the same value you used in the previous experiment.

On alternative setups, the larger data scales may be too aggressive and result in a very long priming stage (or crashes). You can change the data scale parameters in the server and load generation scripts as appropriate for your setup. The scripts and the plotting will automatically adjust to your new parameters.

Infrequently, the runs corresponding to the largest data scales may crash during priming. If this occurs, you can change the script to re-run that data scale. You can look at the internal logs to determine whether such a crash occurred. If you want the plotting script to skip one of the log files, rename that file so that it starts with an underscore (_).
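
For example, assuming the crashed run's log is named <crashed run log> (a placeholder for the actual filename):

# Rename the crashed run's log so the plotting script ignores it.
cd <K9DB_DIR>/experiments/scripts/logs/lobsters-scale/
mv <crashed run log> _<crashed run log>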