Merge pull request #422 from yetanalytics/bench-updates
[SQL-251] Bench updates
kelvinqian00 authored Sep 17, 2024
2 parents 21b5bc9 + d58cc58 commit bb93b69
Showing 9 changed files with 24,577 additions and 23,275 deletions.
10 changes: 5 additions & 5 deletions Makefile
@@ -83,15 +83,15 @@ postgres: resources/public/admin # Requires a running Postgres instance
bench:
clojure -M:bench -m lrsql.bench \
-e http://0.0.0.0:8080/xapi/statements \
-i dev-resources/default/insert_input.json \
-q dev-resources/default/query_input.json \
-i dev-resources/bench/insert_input.json \
-q dev-resources/bench/query_input.json \
-u username -p password

bench-async:
clojure -M:bench -m lrsql.bench \
-e http://0.0.0.0:8080/xapi/statements \
-i dev-resources/default/insert_input.json \
-q dev-resources/default/query_input.json \
-i dev-resources/bench/insert_input.json \
-q dev-resources/bench/query_input.json \
-a true \
-u username -p password

@@ -154,7 +154,7 @@ target/bundle/doc:

target/bundle/bench:
mkdir -p target/bundle/bench
cp -r dev-resources/default/. target/bundle/bench
cp -r dev-resources/bench/. target/bundle/bench

# Copy LICENSE and NOTICE

4 changes: 2 additions & 2 deletions deps.edn
@@ -91,7 +91,7 @@
org.clojure/math.numeric-tower {:mvn/version "0.0.5"}
babashka/babashka.curl {:mvn/version "0.0.3"}
com.yetanalytics/datasim
{:mvn/version "0.3.1"
{:mvn/version "0.4.4"
:exclusions [org.clojure/clojure
com.yetanalytics/xapi-schema]}}}
:test
@@ -119,7 +119,7 @@
:git/sha "8bd5be7816288e85f5c07fc11bf8cf53667e72da"
:exclusions [org.clojure/data.json]}
com.yetanalytics/datasim
{:mvn/version "0.3.0"
{:mvn/version "0.4.4"
:exclusions [org.clojure/clojure
com.yetanalytics/xapi-schema]}}}
;; Build alias invoked like clojure -Xbuild uber
24,490 changes: 24,490 additions & 0 deletions dev-resources/bench/insert_input.json

Large diffs are not rendered by default.

File renamed without changes.
23,183 changes: 0 additions & 23,183 deletions dev-resources/default/insert_input.json

This file was deleted.

14 changes: 7 additions & 7 deletions doc/dev.md
@@ -82,17 +82,17 @@ The following is the full list of arguments (which can also be accessed by passi
| Argument | Value | Default | Description |
| --- | --- | --- | --- |
| `-e`, `--lrs-endpoint` | URI | <details>`http://0.0.0.0:8080/xapi/statements`<summary>(URI)</summary></details> | The HTTP(S) endpoint of the (SQL) LRS webserver for Statement POSTs and GETs. |
| `-i`, `--insert-input` | Filepath | None | The location of a JSON file containing a DATASIM input spec. If given, this input is used to insert statements into the DB. |
| `-s`, `--input-size` | Integer | `1000` | The total number of statements to insert. Ignored if `-i` is not given. |
| `-b`, `--batch-size` | Integer | `10` | The batch size to use for inserting statements. Ignored if `-i` is not given. |
| `-a`, `--async?` | Boolean | `false` | Whether to insert asynchronously or not. |
| `-c`, `--concurrency` | Integer | `10` | The number of parallel threads to run during statement insertion and querying. Ignored if `-a` is `false`. |
| `-i`, `--insert-input` | Filepath | None | The location of a JSON file containing a DATASIM input spec. If present, this input is used to insert statements into the DB. |
| `-s`, `--input-size` | Integer | `1000` | The total number of statements to insert. Ignored if `-i` is not present. |
| `-b`, `--batch-size` | Integer | `10` | The batch size to use for inserting statements. Ignored if `-i` is not present. |
| `-a`, `--async` | No args | N/A | If provided, insert statements asynchronously. |
| `-c`, `--concurrency` | Integer | `10` | The number of parallel threads to run during statement insertion and querying. Ignored if `-a` is not present. |
| `-r`, `--statement-refs` | Keyword | `none` | How Statement References should be generated and inserted. Valid options are `none` (no Statement References), `half` (half of the Statements have StatementRef objects), and `all` (all Statements have StatementRef objects). |
| `-q`, `--query-input` | Filepath | None | The location of a JSON file containing an array of statement query params. If not given, the benchmark does a single query with no params. |
| `-q`, `--query-input` | Filepath | None | The location of a JSON file containing an array of statement query params. If not present, the benchmark does a single query with no params. |
| `-n`, `--query-number` | Integer | `30` | The number of times each query is performed. |
| `-u`, `--user` | String | None | HTTP Basic Auth user. |
| `-p`, `--pass` | String | None | HTTP Basic Auth password. |
| `-h`, `--help` | No args | None | Help menu. |
| `-h`, `--help` | No args | N/A | Help menu. |
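
Review note: `-a` changes here from an option that takes a `BOOLEAN` value to a bare flag. The snippet below is a minimal sketch of how `clojure.tools.cli` treats such a spec, using trimmed copies of two specs from this commit; the var names (`flag-options`, `with-flag`, and so on) are made up for illustration and are not part of the repository.

```clojure
(require '[clojure.tools.cli :as cli])

;; Trimmed copies of the `-a` and `-c` specs from this commit.
;; A long option with no argument placeholder ("--async") is a boolean flag.
(def flag-options
  [["-a" "--async" "Run asynchronously?"
    :id :async?
    :default false]
   ["-c" "--concurrency LONG" "Number of threads"
    :id :concurrency
    :parse-fn #(Long/parseLong %)
    :default 10]])

;; Flag present: :async? is true.
(def with-flag (cli/parse-opts ["-a" "-c" "20"] flag-options))

;; Flag absent: :async? falls back to its default.
(def without-flag (cli/parse-opts ["-c" "20"] flag-options))

;; Old-style invocation: the flag consumes no value, so "true" is
;; collected as a positional argument instead.
(def old-style (cli/parse-opts ["-a" "true" "-c" "20"] flag-options))

(println (get-in with-flag [:options :async?]))
(println (get-in without-flag [:options :async?]))
(println (:arguments old-style))
```

Under this reading, invocations that still pass `-a true` (the `bench-async` Makefile target and the REPL example in `bench.clj`) should keep working, with the stray `true` ending up in `:arguments` rather than being parsed as the flag's value.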

#### 5. Wait for results

135 changes: 64 additions & 71 deletions src/bench/lrsql/bench.clj
@@ -1,15 +1,14 @@
(ns lrsql.bench
(:require [clojure.core.async :as a]
[clojure.string :refer [join]]
(:require [clojure.core.async :as a]
[clojure.math.numeric-tower :as math]
[clojure.tools.cli :as cli]
[clojure.tools.logging :as log]
[clojure.pprint :as pprint]
[java-time :as jt]
[babashka.curl :as curl]
[com.yetanalytics.datasim.sim :as sim]
[com.yetanalytics.datasim.input :as sim-input]
[lrsql.util :as u])
[clojure.pprint :as pprint]
[clojure.string :as cstr]
[clojure.tools.cli :as cli]
[clojure.tools.logging :as log]
[babashka.curl :as curl]
[java-time.api :as jt]
[com.yetanalytics.datasim :as ds]
[lrsql.util :as u])
(:gen-class))

;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
@@ -27,27 +26,26 @@
:desc "The HTTP(S) endpoint of the (SQL) LRS webserver for Statement POSTs and GETs."]
["-i" "--insert-input URI" "DATASIM input source"
:id :insert-input
:desc "The location of a JSON file containing a DATASIM input spec. If given, this input is used to insert statements into the DB."]
:desc "The location of a JSON file containing a DATASIM input spec. If present, this input is used to insert statements into the DB."]
["-s" "--input-size LONG" "Size"
:id :insert-size
:parse-fn #(Long/parseLong %)
:default 1000
:desc "The total number of statements to insert. Ignored if `-i` is not given."]
:desc "The total number of statements to insert. Ignored if `-i` is not present."]
["-b" "--batch-size LONG" "Statements per batch"
:id :batch-size
:parse-fn #(Long/parseLong %)
:default 10
:desc "The batch size to use for inserting statements. Ignored if `-i` is not given."]
["-a" "--async? BOOLEAN" "Run asynchronously?"
:id :async?
:parse-fn #(Boolean/parseBoolean %)
:default false
:desc "Whether to insert asynchronously or not."]
:desc "The batch size to use for inserting statements. Ignored if `-i` is not present."]
["-a" "--async" "Run asynchronously?"
:id :async?
:default false
:desc "If provided, insert statements asynchronously."]
["-c" "--concurrency LONG" "Number of threads"
:id :concurrency
:parse-fn #(Long/parseLong %)
:default 10
:desc "The number of parallel threads to run during statement insertion. Ignored if `-a` is `false`."]
:desc "The number of parallel threads to run during statement insertion. Ignored if `-a` is not present."]
["-r" "--statement-refs STRING" "Statement Ref Insertion Type"
:id :statement-ref-type
:parse-fn keyword
@@ -56,7 +54,7 @@
:desc "How Statement References should be generated and inserted. Valid options are none (no Statement References), half (half of the Statements have StatementRef objects), and all (all Statements have StatementRef objects)."]
["-q" "--query-input URI" "Query input source"
:id :query-input
:desc "The location of a JSON file containing an array of statement query params. If not given, the benchmark does a single query with no params."]
:desc "The location of a JSON file containing an array of statement query params. If not present, the benchmark does a single query with no params."]
["-n" "--query-number LONG" "Query execution number"
:id :query-number
:parse-fn #(Long/parseLong %)
@@ -77,7 +75,7 @@

(defn read-insert-input
[input-path]
(-> (sim-input/from-location :input :json input-path)
(-> (ds/read-input input-path)
(assoc-in [:parameters :seed] (rand-int 1000000000))))

(defn read-query-input
@@ -145,18 +143,18 @@

(defmethod generate-statements :none
[inputs size _]
(take size (sim/sim-seq inputs)))
(take size (ds/generate-seq inputs)))

(defmethod generate-statements :half
[inputs size _]
(let [stmt-seq (take size (sim/sim-seq inputs))
(let [stmt-seq (take size (ds/generate-seq inputs))
[tgts refs] (split-at (quot size 2) stmt-seq)
refs' (assoc-stmt-refs tgts refs)]
(concat tgts refs')))

(defmethod generate-statements :all
[inputs size _]
(let [tgts (take size (sim/sim-seq inputs))
(let [tgts (take size (ds/generate-seq inputs))
refs (drop 1 tgts)
refs' (assoc-stmt-refs tgts refs)]
(cons (first tgts) refs')))
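
Review note: the `:half` and `:all` methods above differ only in how they pair targets with referring statements. The sketch below reproduces that pairing on plain maps. `assoc-stmt-refs` is not shown in this diff, so the version here is a hypothetical stand-in that points each referring statement's `"object"` at its target's `"id"`; `half-refs`, `all-refs`, and `sample` are also illustrative names.

```clojure
(defn assoc-stmt-refs
  "Hypothetical stand-in for the helper used in the diff: pairs each
   statement in `refs` with a statement in `targets` and replaces its
   object with a StatementRef to the target's id."
  [targets refs]
  (map (fn [tgt ref]
         (assoc ref "object" {"objectType" "StatementRef"
                              "id"         (get tgt "id")}))
       targets
       refs))

(defn half-refs
  "Mirrors the :half method: the first half are plain targets, the
   second half reference them."
  [stmts]
  (let [[tgts refs] (split-at (quot (count stmts) 2) stmts)]
    (concat tgts (assoc-stmt-refs tgts refs))))

(defn all-refs
  "Mirrors the :all method: every statement after the first references
   the statement before it."
  [stmts]
  (cons (first stmts)
        (assoc-stmt-refs stmts (drop 1 stmts))))

;; Four toy statements with ids s0..s3 (not valid xAPI, illustration only).
(def sample
  (mapv (fn [i] {"id" (str "s" i) "object" {"id" "http://example.org/act"}})
        (range 4)))

(println (map #(get-in % ["object" "id"]) (half-refs sample)))
(println (map #(get-in % ["object" "id"]) (all-refs sample)))
```

With this stand-in, `:half` leaves `s0` and `s1` untouched and makes `s2` and `s3` reference them, while `:all` forms a chain in which each statement references its predecessor.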
@@ -170,44 +168,36 @@
(jt/as dur :millis)))

(defn store-statements-sync!
[{endpoint :lrs-endpoint
input-uri :insert-input
size :insert-size
[statements
{endpoint :lrs-endpoint
batch-size :batch-size
user :user
pass :pass
sref-type :statement-ref-type}]
(let [inputs (read-insert-input input-uri)
stmts (generate-statements inputs size sref-type)]
(loop [batches (partition-all batch-size stmts)
timings (transient [])]
(if-some [batch (first batches)]
(recur (rest batches)
(conj! timings
(perform-insert
endpoint
{:headers headers
:body (u/write-json-str (vec batch))
:basic-auth [user pass]})))
(let [timings-ret (persistent! timings)]
(calc-statistics timings-ret (count timings-ret)))))))
pass :pass}]
(loop [batches (partition-all batch-size statements)
timings (transient [])]
(if-some [batch (first batches)]
(recur (rest batches)
(conj! timings
(perform-insert
endpoint
{:headers headers
:body (u/write-json-str (vec batch))
:basic-auth [user pass]})))
(let [timings-ret (persistent! timings)]
(calc-statistics timings-ret (count timings-ret))))))

(defn store-statements-async!
[{endpoint :lrs-endpoint
input-uri :insert-input
size :insert-size
[statements
{endpoint :lrs-endpoint
batch-size :batch-size
user :user
pass :pass
sref-type :statement-ref-type
concurrency :concurrency}]
(let [inputs (read-insert-input input-uri)
stmts (generate-statements inputs size sref-type)
requests (mapv (fn [batch]
(let [requests (mapv (fn [batch]
{:headers headers
:body (u/write-json-str (vec batch))
:basic-auth [user pass]})
(partition-all batch-size stmts))
(partition-all batch-size statements))
timings (perform-async-op! perform-insert
endpoint
requests
@@ -311,24 +301,24 @@
[& args]
(let [{:keys [summary errors]
:as _parsed-opts
{:keys [insert-input
async?
query-number
help
{:keys [help
insert-input
insert-size
statement-ref-type
async?
batch-size
query-number
;; Options that aren't used in `-main` but are later on
_lrs-endpoint
_query-input
_statement-ref-type
_concurrency
_user
_pass]
:as opts} :options}
(cli/parse-opts args cli-options)]
;; Check for errors
(when (not-empty errors)
(log/errorf "CLI Parse Errors:\n%s" (join "\n" errors))
(log/errorf "CLI Parse Errors:\n%s" (cstr/join "\n" errors))
(throw (ex-info "CLI Parse Errors!"
{:type ::cli-parse-error
:errors errors})))
@@ -338,12 +328,16 @@
(System/exit 0))
;; Store statements
(when insert-input
(log/info "Starting statement insertion...")
(let [store-statements! (if async?
store-statements-async!
store-statements-sync!)
results (store-statements! opts)
_ (log/info "Statement insertion finished.")]
(let [_ (log/info "Starting statement generation...")
inputs (read-insert-input insert-input)
stmts* (generate-statements inputs insert-size statement-ref-type)
stmts (into [] stmts*) ; realize statements
_ (log/info "Statement generation finished.")
_ (log/info "Starting statement insertion...")
results (if async?
(store-statements-async! stmts opts)
(store-statements-sync! stmts opts))
_ (log/info "Statement insertion finished.")]
(printf "\n%s Insert benchmark results for n = %d (in ms) %s\n"
"**********"
(quot insert-size batch-size)
@@ -352,12 +346,11 @@
:batch-size batch-size}
results)])))
;; Query statements
(log/info "Starting statement query benching...")
(let [query-statements (if async?
query-statements-async
query-statements-sync)
results (query-statements opts)]
(log/info "Statement query benching finished.")
(let [_ (log/info "Starting statement query benching...")
results (if async?
(query-statements-async opts)
(query-statements-sync opts))
_ (log/info "Statement query benching finished.")]
(printf "\n%s Query benchmark results for n = %d (in ms) %s\n"
"**********"
query-number
@@ -369,7 +362,7 @@
;; Perform benching from the repl
(-main
"-e" "http://localhost:8080/xapi/statements"
"-i" "dev-resources/default/insert_input.json"
"-q" "dev-resources/default/query_input.json"
"-i" "dev-resources/bench/insert_input.json"
"-q" "dev-resources/bench/query_input.json"
"-a" "true" "-c" "20"
"-u" "username" "-p" "password"))
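
Review note: the main structural change in `bench.clj` is that statements are now generated and realized in `-main` (`(into [] stmts*)`) and then handed to the store functions, which only batch and send. The following is a rough sketch of that shape with no HTTP or DATASIM involved. `fake-statements`, `timed-ms`, and `store-batches!` are hypothetical stand-ins, not functions from the repository.

```clojure
(defn timed-ms
  "Runs the zero-argument function `f` and returns the elapsed
   wall-clock time in milliseconds."
  [f]
  (let [start (System/nanoTime)]
    (f)
    (/ (- (System/nanoTime) start) 1e6)))

(defn fake-statements
  "Hypothetical stand-in for DATASIM generation: an infinite lazy seq."
  []
  (map (fn [i] {"id" (str "stmt-" i)}) (range)))

(defn store-batches!
  "Partitions already-realized `statements` into batches, passes each
   batch to `send!`, and returns one timing (in ms) per batch."
  [statements batch-size send!]
  (mapv (fn [batch] (timed-ms #(send! (vec batch))))
        (partition-all batch-size statements)))

;; Realize the statements up front, as -main now does with `into []`.
(def stmts (into [] (take 25 (fake-statements))))

;; Record batch sizes instead of POSTing to an LRS.
(def sent (atom []))
(def timings (store-batches! stmts 10 #(swap! sent conj (count %))))

(println @sent)
(println (count timings))
```

Because `partition-all` keeps a final partial batch, 25 statements at batch size 10 produce three timings. The results header in `-main` prints `(quot insert-size batch-size)`, which equals the number of batches only when the batch size divides the input size evenly, as it does with the defaults of 1000 and 10.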
6 changes: 4 additions & 2 deletions src/test/lrsql/concurrency_test.clj
@@ -22,16 +22,18 @@
;; Helpers
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;

;; We reuse bench inputs for tests here.

(defn test-statements
[num-stmts]
(->> "dev-resources/default/insert_input.json"
(->> "dev-resources/bench/insert_input.json"
(sim-input/from-location :input :json)
sim/sim-seq
(take num-stmts)
(into [])))

(def test-queries
(-> "dev-resources/default/query_input.json"
(-> "dev-resources/bench/query_input.json"
slurp
(u/parse-json :object? false)))

10 changes: 5 additions & 5 deletions src/test/lrsql/lrs_test.clj
@@ -2,8 +2,7 @@
(:require [clojure.test :refer [deftest testing is use-fixtures]]
[clojure.string :as cstr]
[com.stuartsierra.component :as component]
[com.yetanalytics.datasim.input :as sim-input]
[com.yetanalytics.datasim.sim :as sim]
[com.yetanalytics.datasim :as ds]
[com.yetanalytics.lrs.protocol :as lrsp]
[lrsql.admin.protocol :as adp]
[lrsql.test-support :as support]
@@ -905,11 +904,12 @@
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;

;; Taken from lrs and third lib tests
;; We reuse bench resources for tests here.

(def test-statements
(->> "dev-resources/default/insert_input.json"
(sim-input/from-location :input :json)
sim/sim-seq
(->> "dev-resources/bench/insert_input.json"
ds/read-input
ds/generate-seq
(take 50)
(into [])))
