
CharlesRendle/issue563 #622

Merged
merged 43 commits — Nov 9, 2022
Changes from 33 commits
cf9a20d
Dump first commit
CharlesRendle Sep 26, 2022
5cccdbc
Jozsef test script + change to grrovy file
nimshi89 Sep 30, 2022
55e10ec
progress commit
CharlesRendle Sep 30, 2022
69929ab
modified the test.py
nimshi89 Sep 30, 2022
ca89429
Progress commit monday morning!
CharlesRendle Oct 3, 2022
08e253f
progress commit monday afternoon.
CharlesRendle Oct 3, 2022
d01715e
Big structural change commit.
CharlesRendle Oct 4, 2022
2921df1
Merge branch 'CharlesRendle/issue563' of https://github.com/GSS-Cogs/…
nimshi89 Oct 5, 2022
a35cc87
updating the files and deletions
nimshi89 Oct 6, 2022
b0c4e56
deleted a load of not needed files
nimshi89 Oct 6, 2022
3078ba1
Progress commit 07.10.22 - Jozsef can fetch
CharlesRendle Oct 7, 2022
2d9f1bd
Chnaged the path file for the stress test
nimshi89 Oct 10, 2022
3ba5ad1
Merge branch 'CharlesRendle/issue563' of https://github.com/GSS-Cogs/…
nimshi89 Oct 10, 2022
d05ad4c
Split up tests, tdl: run bulid in inspect pretest
CharlesRendle Oct 10, 2022
9777b2a
Progress commit
CharlesRendle Oct 10, 2022
6a65ef3
Created the metrics converter script
nimshi89 Oct 14, 2022
592c605
Whole tests run from bash script! Cleanup imminent
CharlesRendle Oct 14, 2022
7dcc1cb
Removes old attempts
CharlesRendle Oct 14, 2022
6748ec7
Adds jmeter log + metrics to their own folders
CharlesRendle Oct 14, 2022
e8cb034
Tidy of imports
CharlesRendle Oct 14, 2022
d123ac9
Changes generated csv filename back to stress.csv
CharlesRendle Oct 17, 2022
db0a3d9
Unit testing files
nimshi89 Oct 18, 2022
77877e7
changes to the preprocess - can start functionise
CharlesRendle Oct 18, 2022
cf27d7d
removes duplication in inspectpreprocess
CharlesRendle Oct 19, 2022
62ecf9c
changes made to the scripts to reduce duplication
nimshi89 Oct 20, 2022
4156ab8
Progress commit
CharlesRendle Oct 24, 2022
17e01da
Progress commit
CharlesRendle Oct 25, 2022
13f0689
commit before breaking anything else...
CharlesRendle Nov 1, 2022
b3e4487
Stresstest working. tdl: diagnose time gaps
CharlesRendle Nov 1, 2022
4958de8
Fixes test by adding run_type to log title
CharlesRendle Nov 1, 2022
3aad1e1
Rounds printed values, edits readme, comment tests
CharlesRendle Nov 1, 2022
6e31384
Adds test .log file to test cases
CharlesRendle Nov 2, 2022
01a8c79
Removes an unused variable name (uses _ instead)
CharlesRendle Nov 2, 2022
de1f69a
Addresses PR commens #1
CharlesRendle Nov 7, 2022
b7accca
Changes import of test-case
CharlesRendle Nov 7, 2022
295591c
Added installation script to readme. Iffy mkdwn
CharlesRendle Nov 8, 2022
831bc90
triple backtick for codeblock
CharlesRendle Nov 8, 2022
7ae80ed
Adds OS X distinction
CharlesRendle Nov 8, 2022
1d18760
Final commit as per PR
CharlesRendle Nov 9, 2022
305b6c3
Merge remote-tracking branch 'origin/main' into CharlesRendle/issue563
CharlesRendle Nov 9, 2022
05c73a0
unsafe_hash=True for qbmeasure + unit
CharlesRendle Nov 9, 2022
68a0894
Removes colons from path and file names.
CharlesRendle Nov 9, 2022
259bd85
Changes the colon in expected test file name too
CharlesRendle Nov 9, 2022
94 changes: 94 additions & 0 deletions tests/stress/README.md
@@ -0,0 +1,94 @@
# Stress Testing

## Intro
To stress test csvcubed, we use Apache JMeter to run two separate test plans: one for the build command and one for the inspect command.
The test plans are run consecutively via a bash script and follow broadly the same structure:

- Start running a performance monitor in the background
- Generate a "maximally complex" CSV file with the number of rows the user has given as a parameter to the bash script.
1. This preprocess is a Groovy script which in turn runs a Python script, because JMeter will not execute `.py` files directly.
- The inspect command test then runs the build command on this CSV file so that the JSON-LD output generated by the build command can be used as the input to the inspect command.
> **Contributor:** Meant to say "The inspect command test then runs the build command on this CSV file so that the json-ld output generated by the build command can be used as the input to the inspect command"?
>
> **Contributor Author:** Changed wording. This is far more succinct - thank you

- Run the test's designated command 5 times in series (not in parallel), using the results of the preprocess step above as inputs.
> **Contributor:** Can you make sure to mention that they're run in series and not in parallel?

1. The performance monitor records CPU and Memory usage throughout the duration of the test's 5 runs.
- A postprocess removes any unwanted resultant files.
1. This is also a Groovy script which in turn runs a Python script.
- Steps 2 - 4 are repeated for the second command.
- The performance monitor is terminated.
- Finally, metrics files are cleaned up and placed into relevant folders.
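The steps above are driven by the bash script, but the essential JMeter invocations can be sketched in Python. This is a minimal sketch under stated assumptions: the plan filenames come from this directory, `-n` (non-GUI), `-t` (test plan) and `-J` (set property) are standard JMeter CLI options, and `rows` is the property each plan reads via `${__P(rows,)}`.

```python
# Builds (without running) the non-GUI JMeter commands for the two test
# plans, passing the row count through the "rows" JMeter property that
# each plan's preprocess reads.

def jmeter_commands(rows: int) -> list[list[str]]:
    plans = ["buildcommandtest.jmx", "inspectcommandtest.jmx"]
    return [["jmeter", "-n", "-t", plan, f"-Jrows={rows}"] for plan in plans]
```

Running each returned command one after the other (e.g. via `subprocess.run`) mirrors the consecutive execution the bash script performs.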

## Installation Guide
- Install JMeter: https://jmeter.apache.org/download_jmeter.cgi
1. Choose either of the binaries to download
2. Once extracted to a location of your choosing, navigate to the `/bin` folder
- Add the full path of this `/bin` folder to your system's PATH variable
- Once added, reload your terminal for the changes to be recognised

- Install a Java JDK: https://www.oracle.com/java/technologies/downloads/
1. JMeter requires any JDK version 8+, but the latest is recommended

- Install the JMeter Plugins Manager: https://jmeter-plugins.org/wiki/PluginsManager/
1. Place the downloaded file into JMeter's `lib/ext` directory

- Install the PerfMon Server Agent: https://github.com/undera/perfmon-agent
1. Unzip to a location of your choosing
2. Once extracted, add the path to the ServerAgent directory to your system's path variables
3. Once added, reload your terminal for the changes to be recognised

- Install the PerfMon plugin for JMeter
1. Open JMeter in GUI mode by opening a terminal and running > `jmeter.sh`
2. It may take a few seconds before the window opens
3. If nothing happens or there is an error, ensure you have correctly installed JMeter and added the location of JMeter's `/bin` folder to your path variables
4. Once in JMeter go to `Options > Plugins Manager`
5. In the new window, select the "Available" tab and look for `"PerfMon (Servers Performance Monitoring)"` in the list of plugins
6. Tick the box for this plugin and select `Apply All Changes and Restart`
7. Once restarted you can close the JMeter GUI window

## Running the Tests
- Open a terminal
- Navigate to `csvcubed/tests/stress`
- Run > `./stresstest.sh x`
1. Where `x` represents the number of rows you wish to use in stress testing. E.g. > `./stresstest.sh 10000`
- When run, you should see:
1. Some startup information from the PerfMon Server Agent
2. Information as the first test is running - this may take some time to complete depending on the number of rows used in the test
- This will repeat for the second test and take a very similar appearance directly underneath
3. Finally, upon completion, some key metrics from each test are printed to the terminal
- Your times may differ
- This will also repeat for the results of the inspect command test and prints directly below

**Note** - The tests can be run using the JMeter GUI; however, this is not recommended when trying to record accurate stress testing results.
If you wish to run the tests using the GUI (probably for debugging changes to the JMeter test plans):

- Open JMeter in GUI mode by opening a terminal and running > `jmeter.sh`
- Open the test plan you wish to run. E.g. `buildcommandtest.jmx` or `inspectcommandtest.jmx`
- Press the green play button in the top bar to start the test
- When debugging, it can be useful to:
1. Open the live logs by clicking on the exclamation mark in a yellow triangle in the upper right corner
- You can clear the logs by clicking the cog+broom button in the top bar
2. Add a `View Results Tree` listener to the test plan
- Right click on test plan in the top left of the sidebar
- Hover over `> Add > Listeners > View Results Tree`

By default, tests run in GUI mode will use CSVs containing 10 rows. If you need to change this, you must do so manually:
1. Open the Groovy file associated with the test's preprocess, e.g. `buildpreprocess.groovy` or `inspectpreprocess.groovy`
2. Change the default value defined on `line 26`

When you are finished using JMeter in GUI mode, please make sure you remove any Results Tree listeners which were added, as they may affect performance during non-GUI execution.

## Results
As well as printing times, maximum and average values to the terminal, each run of the stress tests will generate a new folder in the `csvcubed/tests/stress/metrics` directory,
named with the timestamp from when the test was initiated. Inside this folder there will be 3 files:

- `Buildmetrics-timestamp.csv`
- `Inspectmetrics-timestamp.csv`
- `jmeter.log`
> **Contributor:** Isn't this jmeter.Build.log and jmeter.Inspect.log now?

1. Where `timestamp` is replaced by the time at which the first metric was recorded for that particular test

The two `.csv` files contain metric values and the times at which they were recorded for each test, and should be the starting point for deeper analysis.

The `jmeter.log` file contains the logs from the last test plan to be executed. This should be the `inspectcommandtest.jmx` plan unless something went wrong during execution
of the `buildcommandtest.jmx` plan.
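As a starting point for that deeper analysis, the metrics CSVs can be summarised with a short script. A minimal sketch, assuming the PerfMon output keeps JMeter's default column names — notably `elapsed`, which PerfMon uses to store the sampled metric value; verify this against your own file's header:

```python
import csv
from pathlib import Path


def summarise_metrics(path: Path) -> dict[str, float]:
    """Max and mean of the 'elapsed' column of a metrics CSV.

    Assumes PerfMon stores the sampled metric value in 'elapsed'
    (check this against your CSV's header row).
    """
    with open(path, newline="") as f:
        values = [float(row["elapsed"]) for row in csv.DictReader(f)]
    return {"max": max(values), "mean": sum(values) / len(values)}
```

Pointing this at each of the two metrics files gives a quick per-test comparison before opening them in a spreadsheet or pandas.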



Empty file added tests/stress/__init__.py
Empty file.
157 changes: 157 additions & 0 deletions tests/stress/buildcommandtest.jmx
@@ -0,0 +1,157 @@
<?xml version="1.0" encoding="UTF-8"?>
<jmeterTestPlan version="1.2" properties="5.0" jmeter="5.5">
<hashTree>
<TestPlan guiclass="TestPlanGui" testclass="TestPlan" testname="Test Plan" enabled="true">
<stringProp name="TestPlan.comments"></stringProp>
<boolProp name="TestPlan.functional_mode">false</boolProp>
<boolProp name="TestPlan.tearDown_on_shutdown">true</boolProp>
<boolProp name="TestPlan.serialize_threadgroups">true</boolProp>
<elementProp name="TestPlan.user_defined_variables" elementType="Arguments" guiclass="ArgumentsPanel" testclass="Arguments" testname="User Defined Variables" enabled="true">
<collectionProp name="Arguments.arguments"/>
</elementProp>
<stringProp name="TestPlan.user_define_classpath"></stringProp>
</TestPlan>
<hashTree>
<Arguments guiclass="ArgumentsPanel" testclass="Arguments" testname="User Defined Variables" enabled="true">
<collectionProp name="Arguments.arguments">
<elementProp name="SCRIPT_PATH" elementType="Argument">
<stringProp name="Argument.name">SCRIPT_PATH</stringProp>
<stringProp name="Argument.value">${__BeanShell(import org.apache.jmeter.services.FileServer; FileServer.getFileServer().getBaseDir();)}</stringProp>
<stringProp name="Argument.metadata">=</stringProp>
</elementProp>
<elementProp name="ROWS" elementType="Argument">
<stringProp name="Argument.name">ROWS</stringProp>
<stringProp name="Argument.value">${__P(rows,)}</stringProp>
<stringProp name="Argument.metadata">=</stringProp>
</elementProp>
</collectionProp>
</Arguments>
<hashTree/>
<JSR223PreProcessor guiclass="TestBeanGUI" testclass="JSR223PreProcessor" testname="JSR223 PreProcessor" enabled="true">
<stringProp name="scriptLanguage">groovy</stringProp>
<stringProp name="parameters"> ${ROWS}</stringProp>
<stringProp name="filename">${SCRIPT_PATH}/buildpreprocess.groovy</stringProp>
<stringProp name="cacheKey">true</stringProp>
<stringProp name="script"></stringProp>
</JSR223PreProcessor>
<hashTree/>
<ThreadGroup guiclass="ThreadGroupGui" testclass="ThreadGroup" testname="csvcubed build command Thread Group" enabled="true">
<stringProp name="ThreadGroup.on_sample_error">continue</stringProp>
<elementProp name="ThreadGroup.main_controller" elementType="LoopController" guiclass="LoopControlPanel" testclass="LoopController" testname="Loop Controller" enabled="true">
<boolProp name="LoopController.continue_forever">false</boolProp>
<stringProp name="LoopController.loops">5</stringProp>
</elementProp>
<stringProp name="ThreadGroup.num_threads">1</stringProp>
<stringProp name="ThreadGroup.ramp_time">1</stringProp>
<boolProp name="ThreadGroup.scheduler">false</boolProp>
<stringProp name="ThreadGroup.duration"></stringProp>
<stringProp name="ThreadGroup.delay"></stringProp>
<boolProp name="ThreadGroup.same_user_on_next_iteration">true</boolProp>
<boolProp name="ThreadGroup.delayedStart">true</boolProp>
</ThreadGroup>
<hashTree>
<SystemSampler guiclass="SystemSamplerGui" testclass="SystemSampler" testname="csvcubed build sampler" enabled="true">
<boolProp name="SystemSampler.checkReturnCode">true</boolProp>
<stringProp name="SystemSampler.expectedReturnCode">0</stringProp>
<stringProp name="SystemSampler.command">csvcubed</stringProp>
<elementProp name="SystemSampler.arguments" elementType="Arguments" guiclass="ArgumentsPanel" testclass="Arguments" testname="User Defined Variables" enabled="true">
<collectionProp name="Arguments.arguments">
<elementProp name="" elementType="Argument">
<stringProp name="Argument.name"></stringProp>
<stringProp name="Argument.value">build</stringProp>
<stringProp name="Argument.metadata">=</stringProp>
</elementProp>
<elementProp name="" elementType="Argument">
<stringProp name="Argument.name"></stringProp>
<stringProp name="Argument.value">stress.csv</stringProp>
<stringProp name="Argument.metadata">=</stringProp>
</elementProp>
<elementProp name="" elementType="Argument">
<stringProp name="Argument.name"></stringProp>
<stringProp name="Argument.value">-c</stringProp>
<stringProp name="Argument.metadata">=</stringProp>
</elementProp>
<elementProp name="" elementType="Argument">
<stringProp name="Argument.name"></stringProp>
<stringProp name="Argument.value">${SCRIPT_PATH}/test-qube-config.json</stringProp>
<stringProp name="Argument.metadata">=</stringProp>
</elementProp>
</collectionProp>
</elementProp>
<elementProp name="SystemSampler.environment" elementType="Arguments" guiclass="ArgumentsPanel" testclass="Arguments" testname="User Defined Variables" enabled="true">
<collectionProp name="Arguments.arguments"/>
</elementProp>
<stringProp name="SystemSampler.directory">${SCRIPT_PATH}/temp_dir</stringProp>
</SystemSampler>
<hashTree/>
</hashTree>
<kg.apc.jmeter.perfmon.PerfMonCollector guiclass="kg.apc.jmeter.vizualizers.PerfMonGui" testclass="kg.apc.jmeter.perfmon.PerfMonCollector" testname="Build Command Metrics" enabled="true">
<boolProp name="ResultCollector.error_logging">false</boolProp>
<objProp>
<name>saveConfig</name>
<value class="SampleSaveConfiguration">
<time>true</time>
<latency>true</latency>
<timestamp>true</timestamp>
<success>true</success>
<label>true</label>
<code>true</code>
<message>true</message>
<threadName>true</threadName>
<dataType>true</dataType>
<encoding>false</encoding>
<assertions>true</assertions>
<subresults>true</subresults>
<responseData>false</responseData>
<samplerData>false</samplerData>
<xml>false</xml>
<fieldNames>true</fieldNames>
<responseHeaders>false</responseHeaders>
<requestHeaders>false</requestHeaders>
<responseDataOnError>false</responseDataOnError>
<saveAssertionResultsFailureMessage>true</saveAssertionResultsFailureMessage>
<assertionsResultsToSave>0</assertionsResultsToSave>
<bytes>true</bytes>
<sentBytes>true</sentBytes>
<url>true</url>
<threadCounts>true</threadCounts>
<idleTime>true</idleTime>
<connectTime>true</connectTime>
</value>
</objProp>
<stringProp name="filename">${SCRIPT_PATH}/buildmetrics.csv</stringProp>
<longProp name="interval_grouping">1000</longProp>
<boolProp name="graph_aggregated">false</boolProp>
<stringProp name="include_sample_labels"></stringProp>
<stringProp name="exclude_sample_labels"></stringProp>
<stringProp name="start_offset"></stringProp>
<stringProp name="end_offset"></stringProp>
<boolProp name="include_checkbox_state">false</boolProp>
<boolProp name="exclude_checkbox_state">false</boolProp>
<collectionProp name="metricConnections">
<collectionProp name="917712290">
<stringProp name="-1204607085">localhost</stringProp>
<stringProp name="1600768">4444</stringProp>
<stringProp name="66952">CPU</stringProp>
<stringProp name="0"></stringProp>
</collectionProp>
<collectionProp name="-1383002031">
<stringProp name="-1204607085">localhost</stringProp>
<stringProp name="1600768">4444</stringProp>
<stringProp name="-1993889503">Memory</stringProp>
<stringProp name="0"></stringProp>
</collectionProp>
</collectionProp>
</kg.apc.jmeter.perfmon.PerfMonCollector>
<hashTree/>
<JSR223PostProcessor guiclass="TestBeanGUI" testclass="JSR223PostProcessor" testname="JSR223 PostProcessor" enabled="true">
<stringProp name="cacheKey">true</stringProp>
<stringProp name="filename">${SCRIPT_PATH}/postprocess.groovy</stringProp>
<stringProp name="parameters"></stringProp>
<stringProp name="script"></stringProp>
<stringProp name="scriptLanguage">groovy</stringProp>
</JSR223PostProcessor>
<hashTree/>
</hashTree>
</hashTree>
</jmeterTestPlan>
31 changes: 31 additions & 0 deletions tests/stress/buildpreprocess.groovy
@@ -0,0 +1,31 @@
import org.apache.jmeter.services.FileServer

void runProcessJMeter(String command) {
def baseDir = FileServer.getFileServer().getBaseDir()
log.info("This is the baseDir: ${baseDir}")

def proc = command.execute(null, new File(baseDir))

def b = new StringBuffer()
proc.consumeProcessErrorStream(b)

def statusCode = proc.waitFor()
if (statusCode != 0) {
log.error("Error occurred: ${b}")
throw new Exception("Script failed.")
}
def textOut = proc.text
log.info("Status code: ${statusCode}")

log.info("Found the following: ${textOut}");
}

// Assign the number of rows to be tested as either 10
// when running the .jmx files in GUI mode or as
// whatever value is supplied to the bash script.
def rows = 10
if (args) {
rows = args[0]
}

runProcessJMeter("python3 buildpreprocess.py ${rows}")
79 changes: 79 additions & 0 deletions tests/stress/buildpreprocess.py
@@ -0,0 +1,79 @@
import csv
import sys
from pathlib import Path

# This script generates a CSV file with a predefined number of columns and rows (preferably with each value unique)


def generate_maximally_complex_csv(
numb_rows: int, temp_dir: Path = Path("temp_dir"), max_num_measures: int = 20
):

    # The CSV generated here is paired with test-qube-config.json so that the
    # type of each column is correctly identified when building the CSV-W.
    temp_dir.mkdir(exist_ok=True)
> **Contributor:** Could you add a comment to make it clear that the output of this function is to be paired with test-qube-config.json to ensure we correctly identify the type of each column when we build the CSV-W?


# filling up the csv file with random unique data for testing
with open(temp_dir / "stress.csv", "w+", newline="") as f:
the_writer = csv.writer(f)

        # counters used while generating unique cell values
        unique_number = 0
        measure_number = 0

column_array = [
"Dim1",
"Dim2",
"Dim3",
"Dim4",
"Dim5",
"Dim6",
"Dim7",
"Dim8",
"Dim9",
"Dim10",
"Attribute1",
"Attribute2",
"Attribute3",
"Attribute4",
"obs",
"Measure",
"Unit",
]

the_writer.writerow(column_array)

        # this loop appends rows, each matching the number of columns
for _ in range(1, numb_rows + 1):
rows_array = []

for i in range(0, len(column_array)):
unique_number += 1
if i < 10:
row_value = "A Dimension" + str(unique_number)
rows_array.append(row_value)
elif i == 14:
row_value = (
unique_number * 2
) # This extra step is only to make the value more unique
rows_array.append(row_value)
elif i == 15:
measure_number += 1
row_value = "A measure" + str(measure_number)
rows_array.append(row_value)
if measure_number > (max_num_measures - 1):
measure_number = 0
elif i == 16:
row_value = "some Unit" + str(unique_number)
rows_array.append(row_value)
else:
row_value = "value" + str(unique_number)
rows_array.append(row_value)

the_writer.writerow(rows_array)


if __name__ == "__main__":
# taking in a commandline argument to determine the number of rows
numb_rows = int(sys.argv[1])

generate_maximally_complex_csv(numb_rows)
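The `elif i == 15` branch above cycles the measure column through at most `max_num_measures` distinct names. That wrap-around logic can be isolated and checked on its own; this helper is illustrative only, not part of the script:

```python
def measure_names(numb_rows: int, max_num_measures: int = 20) -> list[str]:
    # Mirrors the i == 15 branch: increment the counter, emit a name,
    # then wrap back to 0 once max_num_measures names have been issued.
    names = []
    measure_number = 0
    for _ in range(numb_rows):
        measure_number += 1
        names.append("A measure" + str(measure_number))
        if measure_number > (max_num_measures - 1):
            measure_number = 0
    return names
```

So 40 rows with the default of 20 measures yields exactly 20 distinct measure names, each appearing twice.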