DontReview Garnitin/add gke load testing/v1 #2225

Draft: wants to merge 37 commits into master

gargnitingoogle
Collaborator

Description

Link to the issue in case of a bug fix.

NA

Testing details

  1. Manual - NA
  2. Unit tests - NA
  3. Integration tests - NA

codecov bot commented Jul 26, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 78.36%. Comparing base (3b6804a) to head (01c1a06).

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #2225      +/-   ##
==========================================
- Coverage   78.38%   78.36%   -0.03%     
==========================================
  Files         107      107              
  Lines       11793    11793              
==========================================
- Hits         9244     9241       -3     
- Misses       2058     2060       +2     
- Partials      491      492       +1     
Flag Coverage Δ
unittests 78.36% <ø> (-0.03%) ⬇️

fi

function validateMachineConfig() {
echo "Validiting input parameters ..."
Collaborator Author

Typo: "Validiting" should be "Validating".

local cluster_name=${1}
local zone=${2}
local node_pool=${3}
if [ $(gcloud container node-pools list --cluster=${cluster_name} --zone=${zone} | grep -ow ${node_pool} | wc -l) -gt 0 ] ; then
Collaborator Author

simplify this with grep -q
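
A rough sketch of this suggestion: grep -q prints nothing and exits with status 0 as soon as it finds a match, so the wc -l count and the [ ... -gt 0 ] test are no longer needed. The helper name listContainsWord is hypothetical, and samplePools only stands in for the real gcloud container node-pools list output so that the snippet runs without a cluster.

```shell
# Hypothetical helper (not part of the PR): succeeds if ${1} occurs as a whole
# word anywhere on stdin. -q suppresses output, -w matches whole words only,
# -F treats the name as a fixed string rather than a regex.
listContainsWord() {
  local word=${1}
  grep -qwF -- "${word}"
}

# Stand-in for: gcloud container node-pools list --cluster=... --zone=...
# The pool name and machine type below are made-up sample data.
samplePools() {
  printf 'NAME          MACHINE_TYPE\ndefault-pool  n2-standard-32\n'
}

if samplePools | listContainsWord "default-pool"; then
  echo "Node pool default-pool already exists."
fi
```

The exit status of the pipeline is the exit status of grep, so the condition can be used directly in the if statement.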

if ClusterExists ${cluster_name} ; then
  gcloud container clusters update ${cluster_name} --location=${zone} --workload-pool=${project_id}.svc.id.goog
else
  # gcloud container --project "${project_id}" clusters create ${cluster_name} --zone "${zone}" --cluster-version "${cluster_version}" --workload-pool=${project_id}.svc.id.goog
Collaborator Author

remove commented code

echo "Enabling/disabling csi add-on ..."
# By default, disable the managed csi driver.
if ${useCustomCsiDriver}; then
  # gcloud -q container clusters update ${cluster_name} \
Collaborator Author

re-enable this

gcloud container clusters get-credentials ${cluster_name} --location=${zone}
kubectl create namespace ${appnamespace}
kubectl create serviceaccount ${ksa} --namespace ${appnamespace}
for workload_bucket in ${buckets} ; do
Collaborator Author

add code to get buckets from somewhere.

}

# validateMachineConfig ${machine_type} ${num_nodes} ${num_ssd}
# installDependencies
Collaborator Author

re-enable all these disabled steps.

def get_cpu(pod_name: str, start: str, end: str) -> Tuple[float, float]:
  # for some reason, the mash filter does not always work, so we fetch all the metrics for all the pods and filter later.
  result = subprocess.run(["mash", "--namespace=cloud_prod", "--output=csv",
      f"Query(Fetch(Raw('cloud.kubernetes.K8sContainer', 'kubernetes.io/container/cpu/core_usage_time'), {{'project': '927584127901'}})| Window(Rate('10m'))| GroupBy(['pod_name', 'container_name'], Max()), TimeInterval('{start}', '{end}'), '5s')"],
Collaborator Author

Put the original project number back here. Change it in code at runtime instead, and revert it once the run is done.

function updateMachineTypeInPodConfigs() {
  for file in ${gke_testing_dir}/examples/fio/loading-test/values.yaml ${gke_testing_dir}/examples/dlio/unet3d-loading-test/values.yaml ; do
    test -f ${file}
    sed -i -E "s/nodeType: [0-9a-z_-]+$/nodeType: ${machine_type}/g" ${file}
Collaborator Author

add code to revert this back
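
One possible way to make the in-place edit reversible, sketched under assumptions: save a copy before running sed, and restore it from an EXIT trap so the revert happens even if a later step fails. The helper names backupFile and restoreFile, the .orig suffix, and the temporary demo file are all hypothetical; the sed expression is the one from the hunk above.

```shell
# Hypothetical helpers (not part of the PR) for a reversible in-place edit.

# Saves a copy of ${1} next to it so that it can be restored later.
backupFile() {
  cp -p "${1}" "${1}.orig"
}

# Restores ${1} from its saved copy, if one exists.
restoreFile() {
  if [ -f "${1}.orig" ]; then
    mv -f "${1}.orig" "${1}"
  fi
}

# Demo on a throwaway file standing in for values.yaml; the machine types
# below are sample values only.
demo_file=$(mktemp)
echo "nodeType: n2-standard-96" > "${demo_file}"
machine_type="n2-standard-32"

backupFile "${demo_file}"
# Revert on exit, whether the script finishes normally or fails midway.
trap 'restoreFile "${demo_file}"' EXIT
sed -i -E "s/nodeType: [0-9a-z_-]+$/nodeType: ${machine_type}/g" "${demo_file}"
cat "${demo_file}"   # the edited copy that the test run would use
```

The same pair of helpers could wrap the mountOptions edit as well, since restoreFile is a no-op when no saved copy exists.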

for file in ${gke_testing_dir}/examples/fio/loading-test/values.yaml ${gke_testing_dir}/examples/dlio/unet3d-loading-test/values.yaml ; do
  test -f ${file}
  # sed -i -E "s/mountOptions: [0-9a-zA-Z,\:\"-_]+$/mountOptions: \"${gcsfuse_mount_options}\"/g" ${file}
  sed -i -E "s/mountOptions:[ \t]*\"?[a-zA-Z0-9,:_-]+\"? *$/mountOptions: \"${gcsfuse_mount_options}\"/g" "${file}"
Collaborator Author

add code to revert this back when done

@gargnitingoogle gargnitingoogle force-pushed the garnitin/add-gke-load-testing/v1 branch from 2958a3c to ad97f82 Compare July 29, 2024 10:47
  return utc_timestamp_string

def standard_timestamp(timestamp: str) -> str:
  return timestamp.split('.')[0].replace('T', ' ') + " UTC"
Collaborator Author

insert newline after this line

@@ -0,0 +1,425 @@
#!/bin/bash

Collaborator Author

Add a header and a help option at the top.
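
A minimal sketch of what a help option at the top of the script might look like. Everything here is an assumption for illustration: the function names printHelp and helpRequested, the -h/--help spelling, and the help text itself are not taken from the PR.

```shell
# Hypothetical usage text; the description and option list are illustrative.
printHelp() {
  echo "Usage: $0 [-h|--help]"
  echo ""
  echo "Sets up a GKE cluster and runs the fio and dlio load tests on it."
  echo ""
  echo "Options:"
  echo "  -h, --help   Print this message and exit."
}

# Returns 0 if any of the given arguments asks for help, 1 otherwise.
helpRequested() {
  local arg
  for arg in "$@"; do
    case "${arg}" in
      -h|--help) return 0 ;;
    esac
  done
  return 1
}

# Handle the help option before doing any real work.
if helpRequested "$@"; then
  printHelp
  exit 0
fi
```

Checking for help before any other step means that asking for usage never touches the cluster or the local checkout.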

@gargnitingoogle gargnitingoogle force-pushed the garnitin/add-gke-load-testing/v1 branch 7 times, most recently from ff7ee55 to 478921c Compare August 6, 2024 09:53
@gargnitingoogle gargnitingoogle force-pushed the garnitin/add-gke-load-testing/v1 branch 9 times, most recently from 1e4353a to b6a0e76 Compare August 16, 2024 14:45
@gargnitingoogle gargnitingoogle force-pushed the garnitin/add-gke-load-testing/v1 branch 9 times, most recently from b62b517 to d008a84 Compare August 22, 2024 11:53
Option: force_update_gcsfuse_code
Default: false

This applies when $gcsfuse_src_dir has been passed,
 or when $src_dir/gcsfuse already exists.
Adds a log for missing sudoless-docker,
with instructions to install it
and then re-run the script.
This reverts commit 6fa2ad2.

Reverting this because the above commit exposed PII
in the form of the gcsfuse-internal project
IDs and numbers.
The run-gke-tests script should fail
if exactly one of project_id
and project_number has been set.
It will error out in such a case.
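
A sketch of the check this commit message describes, written under assumptions: the function name validateProjectSettings and the message wording are hypothetical, and the sample values in the last line are placeholders. The function succeeds when both values are set or both are empty, and fails when exactly one is set.

```shell
# Hypothetical sketch: ${1} is the project ID, ${2} is the project number.
# Returns 1 (with a message on stderr) when exactly one of the two is set.
validateProjectSettings() {
  local project_id=${1}
  local project_number=${2}
  if [ -n "${project_id}" ] && [ -z "${project_number}" ]; then
    echo "Error: project_id has been set but project_number has not." >&2
    return 1
  fi
  if [ -z "${project_id}" ] && [ -n "${project_number}" ]; then
    echo "Error: project_number has been set but project_id has not." >&2
    return 1
  fi
  return 0
}

# Placeholder values; a script would typically append `|| exit 1` instead.
validateProjectSettings "my-project" "123456789012" && echo "Project settings are consistent."
```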
Add 'Error: ' prefix to all
the error logs for easy spotting
in log files.
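
To illustrate why a fixed prefix helps, here is a small sketch. The helper name logError is hypothetical and the log lines are invented for the demo; only the 'Error: ' prefix itself comes from the commit message.

```shell
# Hypothetical helper: prints its arguments to stderr with an 'Error: ' prefix.
logError() {
  echo "Error: $*" >&2
}

# Demo: write a mixed log to a temporary file, then pull out only the errors.
log_file=$(mktemp)
{
  echo "Creating cluster ..."
  logError "zone has not been set."
  echo "Done."
} > "${log_file}" 2>&1
grep '^Error: ' "${log_file}"
rm -f "${log_file}"
```

With every error line starting the same way, a single anchored grep is enough to find failures in a long run log.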
Utilities:
1. Append given tabular data to the given gsheet
   id and worksheet name.
2. Return url for a gsheet given its ID.
3. Adds unit tests for the above utilities.
@gargnitingoogle gargnitingoogle force-pushed the garnitin/add-gke-load-testing/v1 branch 2 times, most recently from 01c1a06 to d48baa9 Compare October 7, 2024 10:26
add generic utility to append to a gsheet

Add utility for gsheet

improve gsheet utility

export fio output to gsheet

encapsulate cpu/memory calculation in fio

disable repeat operations for quick testing

add dlio output export to gsheet

fix a bug in dlio output parsing

fix a column-name in fio csv output

Revert "disable repeat operations for quick testing"

This reverts commit 04bf834.

add log of successful addition to gsheet

clean-up code changes

added some error-checking

wip

wip

fix calls to download_gcs_objects

support key-file on gcs in gsheet

put back cpu/memory metrics

fix couple of logs

fix couple of help messages

put back accidentally deleted command
Purposes.
* Consistent behavior across all machines
* Monitoring API has faster runtime than mash.
* Monitoring API is supported on GCE VM too.
Adds a row with "ERROR" for all values
rather than crashing while printing to
the CSV file.