
Charles rendle/issue563 #622

Merged: 43 commits merged into main from CharlesRendle/issue563 on Nov 9, 2022

Conversation

CharlesRendle (Contributor):

This PR adds two scripts to be used when stress testing csvcubed, along with setup and cleanup scripts, to address ticket #563.

The user must have some essential tooling installed; an installation guide is available on Confluence.

Copied from the Confluence page, the stress test follows this structure:

  1. Start running a performance monitor in the background
  2. Generate a "maximally complex" CSV file with the number of rows the user has given as a parameter to the bash script.
  3. The inspect command test then runs 1 build command on this CSV file so that a .csv-metadata.json file can be used.
  4. Run the test's designated command 5 times using the results of the preprocess step above as inputs.
  5. The performance monitor records CPU and Memory usage throughout the duration of the test's 5 runs.
  6. A postprocess removes any unwanted resultant files.
  7. Steps 2–4 are repeated for the second command.
  8. The metrics files are cleaned up and placed into relevant folders.
  9. Finally, the performance monitor is terminated.
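Assuming each run is a plain subprocess invocation, steps 4–5 can be sketched in Python as follows. This is a minimal illustrative sketch, not the PR's actual scripts: the placeholder command stands in for the real csvcubed `build`/`inspect` calls, and all names are hypothetical.

```python
# Minimal sketch of "run the command N times in series and time it".
# The placeholder workload stands in for the real csvcubed build/inspect calls.
import subprocess
import sys
import time
from datetime import timedelta

def run_command_n_times(command, n_runs=5):
    """Run `command` to completion n_runs times in series, returning per-run durations."""
    durations = []
    for _ in range(n_runs):
        start = time.perf_counter()
        subprocess.run(command, check=True)  # each run finishes before the next starts
        durations.append(time.perf_counter() - start)
    return durations

# Placeholder workload: a no-op Python process.
durations = run_command_n_times([sys.executable, "-c", "pass"])
average_time = sum(durations) / len(durations)
total_test_time = timedelta(seconds=sum(durations))
```

The runs are deliberately sequential so that each run's CPU/memory readings are attributable to a single invocation.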

The stress test has currently only been run locally; there are plans to run it via a VM in the future, but for now that aspect has been removed from the ticket.

If there is anything which needs further explanation (no doubt there will be), please get in contact with me or @nimshi89.

CharlesRendle and others added 30 commits September 26, 2022 10:16
@GDonRanasinghe (Contributor) left a comment:

@CharlesRendle I've left a few comments.

- Start running a performance monitor in the background
- Generate a "maximally complex" CSV file with the number of rows the user has given as a parameter to the bash script.
1. This is a preprocess which is a groovy script which in turn runs a python script because JMeter will not execute `.py` files.
- The inspect command test then runs 1 build command on this CSV file so that a `.csv-metadata.json` file can be used.
Contributor:

Meant to say "The inspect command test then runs the build command on this CSV file so that the json-ld output generated by the build command can be used as the input to the inspect command"?

CharlesRendle (Author):

Changed wording. This is far more succinct, thank you.

assert (tmp_dir / "stress.csv").exists()


def test_generated_csv_shape_and_num_unique_values():
Contributor:

Why not use assert_frame_equal provided by pandas test utils to assert the entire data frame?

CharlesRendle (Author):

Reworked the test so that it uses assert_frame_equal. I have kept some of the original assertions because it will be useful to see whether these specific checks fail due to future changes.
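For reference, the pandas helper mentioned here can be combined with targeted assertions like this (a minimal sketch; the DataFrames are illustrative stand-ins, not the actual stress-test data):

```python
# Compare a whole DataFrame in one call, then keep a targeted assertion for
# a specific property. The frames here are illustrative stand-ins.
import pandas as pd
from pandas.testing import assert_frame_equal

generated = pd.DataFrame({"Dim0": ["a", "b", "c"], "Value": [1, 2, 3]})
expected = pd.DataFrame({"Dim0": ["a", "b", "c"], "Value": [1, 2, 3]})

# Checks shape, column names, dtypes and values together.
assert_frame_equal(generated, expected)

# A specific check still gives a clearer failure message if, say,
# value uniqueness regresses in a future change.
assert generated["Dim0"].nunique() == 3
```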

average_time = temp_array[len(temp_array) - 1] / n_runs
total_test_time = timedelta(seconds=temp_array[len(temp_array) - 1])

max_value_cpu = round(df2["localhost CPU"].max(), 2)
Contributor:

Will we have the same values when these tests run on a different env (e.g. on a pipeline or on a user's machine, Linux/Windows)?

CharlesRendle (Author):

I would expect the stress test results to differ when run in a different environment and on different hardware. This portion of the ticket was pushed out of scope for generating preliminary results. Perhaps this will form part of ticket #563.

For the unit tests' purposes, we use a prewritten metrics file from which we have calculated the expected results, and we use this as our comparison.
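As a sketch of how those expected values fall out of a prewritten metrics file (column names follow the quoted code; the numbers are illustrative, and it is assumed, as the quoted code implies, that the elapsed-time column is cumulative):

```python
# Derive the reported metrics from a fixed, prewritten metrics frame so the
# expected values are deterministic. The numbers are illustrative only.
import pandas as pd
from datetime import timedelta

n_runs = 5
df2 = pd.DataFrame({
    "elapsed seconds": [10.0, 20.0, 30.0, 40.0, 50.0],  # assumed cumulative elapsed time
    "localhost CPU": [12.5, 55.129, 43.0, 20.0, 5.0],
})

temp_array = df2["elapsed seconds"].tolist()
average_time = temp_array[-1] / n_runs            # idiomatic form of temp_array[len(temp_array) - 1]
total_test_time = timedelta(seconds=temp_array[-1])
max_value_cpu = round(df2["localhost CPU"].max(), 2)
```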


robons commented Nov 8, 2022

Here's a script you can use to set everything up on OS X:

```bash
#!/bin/bash

# Script to install jmeter and associated tools necessary for running the stress test on OSX.

brew install jmeter wget

# Install perfmon ServerAgent
SERVER_AGENT_VERSION="2.2.3"
SERVER_AGENT_ID="ServerAgent-$SERVER_AGENT_VERSION"
wget "https://github.com/undera/perfmon-agent/releases/download/$SERVER_AGENT_VERSION/$SERVER_AGENT_ID.zip"
unzip "$SERVER_AGENT_ID.zip"
rm "$SERVER_AGENT_ID.zip"
chmod +x "$SERVER_AGENT_ID/startAgent.sh"
mv "$SERVER_AGENT_ID" /usr/local/share
# Use printf so the newline and the literal "$@" are written correctly into the wrapper script.
printf '#!/bin/bash\n/usr/local/share/%s/startAgent.sh "$@"\n' "$SERVER_AGENT_ID" > /usr/local/bin/startAgent.sh
chmod +x /usr/local/bin/startAgent.sh

# Install the perfmon plugin for jmeter
PLUGIN_VERSION="2.1"
PLUGIN_IDENTIFIER="jpgc-perfmon-$PLUGIN_VERSION"

wget "https://jmeter-plugins.org/files/packages/$PLUGIN_IDENTIFIER.zip"
# Add the plugin to the jmeter installation directory
JMETER_INSTALLATION_DIR="$(brew --prefix jmeter)"
unzip "$PLUGIN_IDENTIFIER.zip" -d "$JMETER_INSTALLATION_DIR/libexec"
rm "$PLUGIN_IDENTIFIER.zip"
```

Could you include this somewhere in your documentation?

@robons (Contributor) left a review:

Really good work. I know this ticket was a tough one, but you've got something that's useful and will help us validate whether our future performance improvements are effective or not.

I imagine you've learnt quite a bit in this ticket too.

Take a look at my comments before merging. They're mostly about clarifying things in code comments.


- Buildmetrics-timestamp.csv
- Inspectmetrics-timestamp.csv
- jmeter.log
Contributor:

Isn't this jmeter.Build.log and jmeter.Inspect.log now?

- Generate a "maximally complex" CSV file with the number of rows the user has given as a parameter to the bash script.
1. This is a preprocess which is a groovy script which in turn runs a python script because JMeter will not execute `.py` files.
- The inspect command test then runs the build command on this CSV file so that the json-ld output generated by the build command can be used as the input to the inspect command.
- Run the test's designated command 5 times using the results of the preprocess step above as inputs.
Contributor:

Can you make sure to mention that they're run in series and not in parallel?

## Installation Guide

### From Bash Script
#!/bin/bash
Contributor:

You should put this in backticks (`) to ensure it's easy to copy and paste and isn't subject to formatting. Also, can you make it clear to the user that this script will only work on OSX? It won't work in any other environment, which might confuse someone.

numb_rows: int, temp_dir: Path = Path("temp_dir"), max_num_measures: int = 20
):

temp_dir.mkdir(exist_ok=True)
Contributor:

Could you add a comment to make it clear that the output of this function is to be paired with test-qube-config.json to ensure we correctly identify the type of each column when we build the CSV-W?
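A hypothetical sketch of that pairing is below. The function name, columns, and file layout here are placeholders, not the PR's actual generator; only the intent, that the generated CSV's columns are identified via test-qube-config.json at build time, comes from the thread.

```python
# Hypothetical generator sketch: writes a CSV into temp_dir whose columns are
# meant to be typed by an accompanying test-qube-config.json when the CSV-W
# is built. All names and columns here are illustrative placeholders.
import csv
from pathlib import Path

def generate_stress_csv(numb_rows: int, temp_dir: Path = Path("temp_dir"),
                        max_num_measures: int = 20) -> Path:
    # NOTE: the output is intended to be paired with test-qube-config.json so
    # that each column's role (dimension, measure, observation) is identified
    # correctly at build time.
    temp_dir.mkdir(exist_ok=True)
    out_path = temp_dir / "stress.csv"
    with out_path.open("w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["Dim0", "Measure", "Value"])
        for i in range(numb_rows):
            writer.writerow([f"code-{i}", f"measure-{i % max_num_measures}", i])
    return out_path
```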

data = pd.read_csv(csv_metrics_in)

if data.shape[0] == 1:
    raise IndexError(
Contributor:

Nice choice of exception type.


robons commented Nov 8, 2022

Oh, and add a follow-up ticket to make sure we cover the limited RAM/CPU VM testing we wanted to look into.


github-actions bot commented Nov 9, 2022

ubuntu-latest-python3.9 test results

360 tests  +5   360 ✔️ +5   3m 44s ⏱️ +45s
    8 suites ±0       0 💤 ±0 
    8 files   ±0       0 ±0 

Results for commit 259bd85. ± Comparison against base commit 77501e1.

This pull request removes 70 and adds 68 tests. Note that renamed tests count towards both.

Removed tests (truncated):
- CatalogMetadata.Testing CatalogMetadata ‑ py3.9.15 (main, Oct 18 2022, 07:15:17) [GCC 9.4.0]
- ConfigSchema.Testing cube from config json ‑ py3.9.15 (main, Oct 18 2022, 07:15:17) [GCC 9.4.0]
- A QbCube configured by convention should contain appropriate datatypes
- A QbCube configured with Standard URI style should include file endings in URIs
- A QbCube configured with WithoutFileExtensions URI style should exclude file endings in URIs
- A QbCube should fail to validate where foreign key constraints are not met.
- A QbCube should generate appropriate DCAT Metadata
- A QbCube should generate csvcubed version specific rdf
- A QbCube should validate successfully where foreign key constraints are met.
- A QbCube which references a legacy composite code list should pass all tests
- …

Added tests (truncated):
- CatalogMetadata.Testing CatalogMetadata ‑ py3.9.15 (main, Oct 18 2022, 07:15:17) [GCC 9.4.0] This should succeed when multiple landing pages are supported
- ConfigSchema.Testing cube from config json ‑ py3.9.15 (main, Oct 18 2022, 07:15:17) [GCC 9.4.0] This should succeed when a a valid csv and config.json file are provided
- cli.Test the csvcubed Command Line Interface. ‑ py3.9.15 (main, Oct 18 2022, 07:15:17) [GCC 9.4.0] The csvcubed build command should output validation errors file
- cube.Cube! ‑ py3.9.15 (main, Oct 18 2022, 07:15:17) [GCC 9.4.0] Output a cube and errors when created data only
- cube.Cube! ‑ py3.9.15 (main, Oct 18 2022, 07:15:17) [GCC 9.4.0] Output a cube and errors when created from both config and data
- cube.Cube! ‑ py3.9.15 (main, Oct 18 2022, 07:15:17) [GCC 9.4.0] Output a cube combining config and convention
- cube.Cube! ‑ py3.9.15 (main, Oct 18 2022, 07:15:17) [GCC 9.4.0] Output a cube when an inline code list is defined using code list config schema v1.0 and when there are references to concepts defined elsewhere.
- cube.Cube! ‑ py3.9.15 (main, Oct 18 2022, 07:15:17) [GCC 9.4.0] Output a cube when an inline code list is defined using code list config schema v1.0, and the sort order is defined with sort object
- cube.Cube! ‑ py3.9.15 (main, Oct 18 2022, 07:15:17) [GCC 9.4.0] Output a cube when the code list is defined using code list config schema v1.0 and when the concepts are hierarchical, and the sort order is defined with sort object
- cube.Cube! ‑ py3.9.15 (main, Oct 18 2022, 07:15:17) [GCC 9.4.0] Output a cube when the code list is defined using code list config schema v1.0 and when the concepts are not hierarchical, and the sort order is defined with sort object
- …

♻️ This comment has been updated with latest results.


github-actions bot commented Nov 9, 2022

ubuntu-latest-python3.10 test results

360 tests  +5   360 ✔️ +5   2m 42s ⏱️ - 1m 21s
    8 suites ±0       0 💤 ±0 
    8 files   ±0       0 ±0 

Results for commit 259bd85. ± Comparison against base commit 77501e1.

♻️ This comment has been updated with latest results.


github-actions bot commented Nov 9, 2022

windows-latest-python3.10 test results

    9 files  ±0    10 suites  ±0   6m 58s ⏱️ -12s
374 tests +5  374 ✔️ +5  0 💤 ±0  0 ±0 
387 runs  +5  387 ✔️ +5  0 💤 ±0  0 ±0 

Results for commit 259bd85. ± Comparison against base commit 77501e1.

♻️ This comment has been updated with latest results.


sonarcloud bot commented Nov 9, 2022

Kudos, SonarCloud Quality Gate passed!

Bugs: 0 (rating A)
Vulnerabilities: 0 (rating A)
Security Hotspots: 0 (rating A)
Code Smells: 0 (rating A)

Coverage: 0.0%
Duplication: 0.0%

@CharlesRendle CharlesRendle merged commit 0d777fc into main Nov 9, 2022
@CharlesRendle CharlesRendle deleted the CharlesRendle/issue563 branch November 9, 2022 17:14
4 participants