diff --git a/Documentation/op-guide/performance.md b/Documentation/op-guide/performance.md
index 4a6a70e6dbf..926caf327e0 100644
--- a/Documentation/op-guide/performance.md
+++ b/Documentation/op-guide/performance.md
@@ -17,58 +17,54 @@ For some baseline performance numbers, we consider a three member etcd cluster w
 - Google Cloud Compute Engine
 - 3 machines of 8 vCPUs + 16GB Memory + 50GB SSD
 - 1 machine(client) of 16 vCPUs + 30GB Memory + 50GB SSD
-- Ubuntu 15.10
-- etcd v3 master branch (commit SHA d8f325d), Go 1.6.2
+- Ubuntu 17.04
+- etcd 3.2.0, go 1.8.3
 
 With this configuration, etcd can approximately write:
 
-| Number of keys | Key size in bytes | Value size in bytes | Number of connections | Number of clients | Target etcd server | Average write QPS | Average latency per request | Memory |
-|----------------|-------------------|---------------------|-----------------------|-------------------|--------------------|-------------------|-----------------------------|--------|
-| 10,000 | 8 | 256 | 1 | 1 | leader only | 525 | 2ms | 35 MB |
-| 100,000 | 8 | 256 | 100 | 1000 | leader only | 25,000 | 30ms | 35 MB |
-| 100,000 | 8 | 256 | 100 | 1000 | all members | 33,000 | 25ms | 35 MB |
+| Number of keys | Key size in bytes | Value size in bytes | Number of connections | Number of clients | Target etcd server | Average write QPS | Average latency per request | Average server RSS |
+|---------------:|------------------:|--------------------:|----------------------:|------------------:|--------------------|------------------:|----------------------------:|-------------------:|
+| 10,000 | 8 | 256 | 1 | 1 | leader only | 583 | 1.6ms | 48 MB |
+| 100,000 | 8 | 256 | 100 | 1000 | leader only | 44,341 | 22ms |  124MB |
+| 100,000 | 8 | 256 | 100 | 1000 | all members |  50,104 | 20ms |  126MB |
 
 Sample commands are:
 
-```
-# assuming IP_1 is leader, write requests to the leader
-benchmark --endpoints={IP_1} --conns=1 --clients=1 \
+```sh
+# write to leader
+benchmark --endpoints=${HOST_1} --target-leader --conns=1 --clients=1 \
     put --key-size=8 --sequential-keys --total=10000 --val-size=256
-benchmark --endpoints={IP_1} --conns=100 --clients=1000 \
+benchmark --endpoints=${HOST_1} --target-leader  --conns=100 --clients=1000 \
     put --key-size=8 --sequential-keys --total=100000 --val-size=256
 
 # write to all members
-benchmark --endpoints={IP_1},{IP_2},{IP_3} --conns=100 --clients=1000 \
+benchmark --endpoints=${HOST_1},${HOST_2},${HOST_3} --conns=100 --clients=1000 \
     put --key-size=8 --sequential-keys --total=100000 --val-size=256
 ```
 
 Linearizable read requests go through a quorum of cluster members for consensus to fetch the most recent data. Serializable read requests are cheaper than linearizable reads since they are served by any single etcd member, instead of a quorum of members, in exchange for possibly serving stale data. etcd can read: 
 
-| Number of requests | Key size in bytes | Value size in bytes | Number of connections | Number of clients | Consistency | Average latency per request | Average read QPS |
-|--------------------|-------------------|---------------------|-----------------------|-------------------|-------------|-----------------------------|------------------|
-| 10,000 | 8 | 256 | 1 | 1 | Linearizable | 2ms | 560 |
-| 10,000 | 8 | 256 | 1 | 1 | Serializable | 0.4ms | 7,500 |
-| 100,000 | 8 | 256 | 100 | 1000 | Linearizable | 15ms | 43,000 |
-| 100,000 | 8 | 256 | 100 | 1000 | Serializable | 9ms | 93,000 |
+| Number of requests | Key size in bytes | Value size in bytes | Number of connections | Number of clients | Consistency | Average read QPS | Average latency per request |
+|-------------------:|------------------:|--------------------:|----------------------:|------------------:|-------------|-----------------:|----------------------------:|
+| 10,000 | 8 | 256 | 1 | 1 | Linearizable | 1,353 | 0.7ms |
+| 10,000 | 8 | 256 | 1 | 1 | Serializable | 2,909 | 0.3ms |
+| 100,000 | 8 | 256 | 100 | 1000 | Linearizable | 141,578 | 5.5ms |
+| 100,000 | 8 | 256 | 100 | 1000 | Serializable | 185,758 | 2.2ms |
 
 Sample commands are:
 
-```
-# Linearizable read requests
-benchmark --endpoints={IP_1},{IP_2},{IP_3} --conns=1 --clients=1 \
+```sh
+# Single connection read requests
+benchmark --endpoints=${HOST_1},${HOST_2},${HOST_3} --conns=1 --clients=1 \
     range YOUR_KEY --consistency=l --total=10000
-benchmark --endpoints={IP_1},{IP_2},{IP_3} --conns=100 --clients=1000 \
-    range YOUR_KEY --consistency=l --total=100000
+benchmark --endpoints=${HOST_1},${HOST_2},${HOST_3} --conns=1 --clients=1 \
+    range YOUR_KEY --consistency=s --total=10000
 
-# Serializable read requests for each member and sum up the numbers
-for endpoint in {IP_1} {IP_2} {IP_3}; do
-    benchmark --endpoints=$endpoint --conns=1 --clients=1 \
-        range YOUR_KEY --consistency=s --total=10000
-done
-for endpoint in {IP_1} {IP_2} {IP_3}; do
-    benchmark --endpoints=$endpoint --conns=100 --clients=1000 \
-        range YOUR_KEY --consistency=s --total=100000
-done
+# Many concurrent read requests
+benchmark --endpoints=${HOST_1},${HOST_2},${HOST_3} --conns=100 --clients=1000 \
+    range YOUR_KEY --consistency=l --total=100000
+benchmark --endpoints=${HOST_1},${HOST_2},${HOST_3} --conns=100 --clients=1000 \
+    range YOUR_KEY --consistency=s --total=100000
 ```
 
-We encourage running the benchmark test when setting up an etcd cluster for the first time in a new environment to ensure the cluster achieves adequate performance; cluster latency and throughput can be sensitive to minor environment differences.
\ No newline at end of file
+We encourage running the benchmark test when setting up an etcd cluster for the first time in a new environment to ensure the cluster achieves adequate performance; cluster latency and throughput can be sensitive to minor environment differences.