Linkerd Benchmark Setup
We make use of Kinvolk's service-mesh-benchmark framework to perform our benchmarks, which compare Linkerd's latency and resource consumption against Istio and a baseline (no-service-mesh) case. An earlier version of that framework was used in 2019 to generate the results described in this article. The automation has changed since then, but the article is still worth reading, in particular for how it takes coordinated omission into account, which is critical to how the results are calculated.
The service-mesh-benchmark project's README contains most of the information required to set up a testing cluster.
Note that we used a fork of the lokomotive project, updated with the latest versions of Linkerd (2.10.2) and Istio (1.10.0). This also required using a Terraform version from the 0.13 branch; you can find it here.
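If you manage multiple Terraform versions, a version manager such as tfenv is one way to pin a 0.13 release; the patch version below is an assumed example:

```sh
# Pin Terraform to a 0.13 release (0.13.7 is an illustrative version)
tfenv install 0.13.7
tfenv use 0.13.7
terraform version   # should report Terraform v0.13.x
```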
The benchmark runs on servers provisioned by Equinix Metal, and also uses S3, DynamoDB and Route 53 to store state and identify the cluster. The corresponding account information needs to be entered into configs/lokocfg.vars, while the tokens/credentials have to be provided as environment variables, as described in the README.
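As a sketch, exporting the credentials typically looks like the following; the Equinix Metal variable name is the one read by the packet Terraform provider, but check the README for the exact set required:

```sh
# Equinix Metal (formerly Packet) API token used for provisioning
export PACKET_AUTH_TOKEN="..."
# AWS credentials for the S3/DynamoDB state backend and Route 53
export AWS_ACCESS_KEY_ID="..."
export AWS_SECRET_ACCESS_KEY="..."
```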
Once that is done, the desired datacenter and server types have to be specified in configs/equinix-metal-cluster.lokocfg. After many experiments, the most consistent results were obtained by selecting the dfw2 datacenter and using c2.medium.x86 for the controller and s3.xlarge.x86 for the load generator and the workers. These values are set in the facility, controller_type and node_type entries respectively.
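For illustration, the relevant entries look roughly like this; the block names and nesting follow lokomotive's HCL conventions and are abridged, so treat this as a sketch rather than a complete config:

```hcl
# Abridged sketch of configs/equinix-metal-cluster.lokocfg; only the
# entries discussed above are shown, and block names are approximate.
cluster "equinix-metal" {
  facility        = "dfw2"           # datacenter
  controller_type = "c2.medium.x86"  # controller node

  worker_pool "benchmark" {
    node_type = "s3.xlarge.x86"      # load generator and worker nodes
  }
}
```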
Running scripts/run_benchmarks.sh takes care of executing the benchmark runs. Just make sure to set, in the main loop of run_benchmarks(), the series of RPS values and the number of repetitions you want.
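For reference, the shape of that loop is roughly as follows; the RPS values, repetition count and inner helper name are placeholders, not the script's actual contents:

```sh
run_benchmarks() {
    # Hypothetical sketch: iterate over the desired RPS series and
    # repeat each load level a fixed number of times.
    for rps in 600 2000 4000; do      # RPS series (placeholder values)
        for run in 1 2 3 4 5; do      # repetitions per RPS level
            run_bench "$rps" "$run"   # hypothetical helper invocation
        done
    done
}
```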
We also made a small change to that function right after Linkerd's installation, adding a linkerd check invocation to make sure the control plane was ready before installing emojivoto.
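The change amounts to inserting linkerd check between the mesh install and the application deploy; a minimal sketch, with the emojivoto helper name assumed:

```sh
# Install Linkerd as usual...
linkerd install | kubectl apply -f -
# ...then block until the control plane reports healthy; linkerd check
# retries its probes until they pass or time out.
linkerd check
# Only now deploy the benchmark application (helper name is hypothetical).
install_emojivoto
```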
Read Upload Grafana dashboard for instructions on how to set up the Grafana charts.
Result data is taken from the wrk2 benchmark cockpit chart for each run as follows:

- Latency: taken from the chart "Latency percentile histogram (milliseconds)".
- Sidecar resource usage: taken from the charts "Sidecar Memory usage - applications (max. across all sidecars)" and "Sidecar CPU usage - applications (max. across all sidecars)", which show the maximum memory/CPU used across all the sidecar proxy containers in the emojivoto namespaces for the duration of the run. We report the maximum values attained over this duration.
- Control plane resource usage: taken from the charts "Memory usage - Service mesh control plane" and "CPU utilisation - Service mesh control plane", which show the sum of the memory/CPU used across all the non-sidecar containers in the control plane namespace for the duration of the run. We report the maximum values attained over this duration.