More robust throughput grid profiling #74

Merged
merged 4 commits into from
Jan 17, 2022

@parasj parasj commented Jan 17, 2022

Adds a more robust throughput profiling tool. It can resume from an earlier profile and is more error tolerant: errors are logged as failed trials instead of aborting the run.

Sample file output: https://gist.github.com/parasj/efc883ef7180fa2e92b008eb26c6aa89#file-example-csv
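
To illustrate the resume and error-tolerance behavior described above, here is a minimal Python sketch. It is not the code from this PR: the column names (`src`, `dst`, `throughput_gbps`, `success`, `error`) and the `measure` callback are assumptions made for the example, and the real CSV schema may differ. Results are appended after each group, so an interrupted run can skip pairs that already succeeded.

```python
import csv
from pathlib import Path

# Hypothetical column names; the real throughput.csv schema may differ.
FIELDS = ["src", "dst", "throughput_gbps", "success", "error"]


def load_completed(csv_path):
    """Return the (src, dst) pairs that already have a successful row."""
    path = Path(csv_path)
    if not path.exists():
        return set()
    with path.open(newline="") as f:
        return {(r["src"], r["dst"]) for r in csv.DictReader(f) if r["success"] == "True"}


def run_grid(groups, measure, csv_path):
    """Run each group, skipping finished pairs and recording failures as rows."""
    path = Path(csv_path)
    done = load_completed(path)
    write_header = not path.exists() or path.stat().st_size == 0
    for group in groups:
        rows = []
        for src, dst in group:
            if (src, dst) in done:
                continue  # resumed from an earlier profile
            try:
                gbps = measure(src, dst)
                rows.append({"src": src, "dst": dst, "throughput_gbps": gbps,
                             "success": True, "error": ""})
            except Exception as exc:  # log the failed trial and keep going
                rows.append({"src": src, "dst": dst, "throughput_gbps": "",
                             "success": False, "error": repr(exc)})
        # Write after each group so partial results survive a crash.
        with path.open("a", newline="") as f:
            writer = csv.DictWriter(f, fieldnames=FIELDS)
            if write_header:
                writer.writeheader()
                write_header = False
            writer.writerows(rows)
```

Failed pairs are retried on the next run because only rows marked successful count as done.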

$ skylark experiments throughput-grid -aws us-east-1 -aws us-west-1 -gcp us-east1-b --no-enable-azure
           provision:61  | Provisioning AWS instances in ('us-east-1', 'us-west-1')
add IP to aws security groups (us-west-1): 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 2/2 [00:01<00:00,  1.30it/s]
           provision:97  | Provisioning GCP instances in ('us-east1-b',)
Provisioning init:   0%|                                                                                                                                                                                                | 0/3 [00:00<?, ?it/s]      wait_for_ready:147 | Waiting for aws:us-east-1:i-04616e5abdee019e9 to be ready
      wait_for_ready:147 | Waiting for aws:us-west-1:i-0cf6c85dcdacaf106 to be ready
      wait_for_ready:147 | Waiting for skylark-333700:gcp:us-east1-b:skylark-gcp-d2b892ee683941279bae3a292a4ddd93 to be ready
      wait_for_ready:149 | skylark-333700:gcp:us-east1-b:skylark-gcp-d2b892ee683941279bae3a292a4ddd93 is ready
Provisioning init (gcp:us-east1-b):   0%|                                                                                                                                                                               | 0/3 [00:00<?, ?it/s]      wait_for_ready:149 | aws:us-east-1:i-04616e5abdee019e9 is ready
Provisioning init (aws:us-east-1):  67%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████                                                        | 2/3 [00:00<00:00,  5.59it/s]      wait_for_ready:149 | aws:us-west-1:i-0cf6c85dcdacaf106 is ready
Provisioning init (aws:us-west-1): 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 3/3 [00:00<00:00,  4.63it/s]
Setup (aws:us-west-1): 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 3/3 [00:02<00:00,  1.15it/s]

Experiment configuration: (total pairs = 6)
	Group 0: (1 items)
	gcp:us-east1-b -> aws:us-east-1
	Group 1: (1 items)
	aws:us-east-1 -> aws:us-west-1
	Group 2: (1 items)
	aws:us-west-1 -> aws:us-east-1
	Group 3: (1 items)
	aws:us-west-1 -> gcp:us-east1-b
	Group 4: (1 items)
	gcp:us-east1-b -> aws:us-west-1
	Group 5: (1 items)
	aws:us-east-1 -> gcp:us-east1-b

iperf_runtime=4, iperf3_connections=64
Approximate runtime: 84s (assuming 10s startup time)
Approximate data to send: 15.00GB (assuming 5Gbps)
Approximate cost: $1.50 (assuming $0.10/GB)
? Launch experiment 2022.01.17_05.06_spiny-dock-rogue-sari_4s_64c? Yes
Experiment tag: 2022.01.17_05.06_spiny-dock-rogue-sari_4s_64c
Log directory: /home/ubuntu/skylark/data/logs/throughput_grid/2022.01.17_05.06_spiny-dock-rogue-sari_4s_64c
Raw iperf3 log directory: /home/ubuntu/skylark/data/logs/throughput_grid/2022.01.17_05.06_spiny-dock-rogue-sari_4s_64c/raw_iperf3_logs
gcp:us-east1-b:PREMIUM_aws:us-east-1:PREMIUM: 5.87Gbps
Parallel eval group 0 (gcp:us-east1-b:PREMIUM to aws:us-east-1:PREMIUM): 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1/1 [00:06<00:00,  6.09s/it]
Saving intermediate results to /home/ubuntu/skylark/data/logs/throughput_grid/2022.01.17_05.06_spiny-dock-rogue-sari_4s_64c/throughput.csv
aws:us-east-1:PREMIUM_aws:us-west-1:PREMIUM: 4.38Gbps
Parallel eval group 1 (aws:us-east-1:PREMIUM to aws:us-west-1:PREMIUM): 100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1/1 [00:08<00:00,  8.54s/it]
Saving intermediate results to /home/ubuntu/skylark/data/logs/throughput_grid/2022.01.17_05.06_spiny-dock-rogue-sari_4s_64c/throughput.csv
aws:us-west-1:PREMIUM_aws:us-east-1:PREMIUM: 4.38Gbps
Parallel eval group 2 (aws:us-west-1:PREMIUM to aws:us-east-1:PREMIUM): 100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1/1 [00:08<00:00,  8.89s/it]
Saving intermediate results to /home/ubuntu/skylark/data/logs/throughput_grid/2022.01.17_05.06_spiny-dock-rogue-sari_4s_64c/throughput.csv
aws:us-west-1:PREMIUM_gcp:us-east1-b:PREMIUM: 4.30Gbps
Parallel eval group 3 (aws:us-west-1:PREMIUM to gcp:us-east1-b:PREMIUM): 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1/1 [00:09<00:00,  9.61s/it]
Saving intermediate results to /home/ubuntu/skylark/data/logs/throughput_grid/2022.01.17_05.06_spiny-dock-rogue-sari_4s_64c/throughput.csv
gcp:us-east1-b:PREMIUM_aws:us-west-1:PREMIUM: 5.33Gbps
Parallel eval group 4 (gcp:us-east1-b:PREMIUM to aws:us-west-1:PREMIUM): 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1/1 [00:09<00:00,  9.34s/it]
Saving intermediate results to /home/ubuntu/skylark/data/logs/throughput_grid/2022.01.17_05.06_spiny-dock-rogue-sari_4s_64c/throughput.csv
aws:us-east-1:PREMIUM_gcp:us-east1-b:PREMIUM: 4.43Gbps
Parallel eval group 5 (aws:us-east-1:PREMIUM to gcp:us-east1-b:PREMIUM): 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1/1 [00:05<00:00,  5.99s/it]
Saving intermediate results to /home/ubuntu/skylark/data/logs/throughput_grid/2022.01.17_05.06_spiny-dock-rogue-sari_4s_64c/throughput.csv
Total throughput evaluation: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 6/6 [00:48<00:00,  8.08s/it]
Experiment complete: 2022.01.17_05.06_spiny-dock-rogue-sari_4s_64c
Results saved to /home/ubuntu/skylark/data/logs/throughput_grid/2022.01.17_05.06_spiny-dock-rogue-sari_4s_64c/throughput.csv
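
The approximate runtime, data, and cost figures printed before launch can be reproduced with simple arithmetic. The sketch below is a guess at the formula, not the tool's actual code: it assumes every ordered region pair is measured one after another, each paying the stated startup time, and that data is sent at the assumed rate for the whole iperf runtime. The function name `estimate` and its parameters are hypothetical.

```python
from itertools import permutations


def estimate(regions, iperf_runtime_s, startup_s=10, gbps=5.0, usd_per_gb=0.10):
    """Return (pair count, runtime in s, data in GB, cost in USD) under the stated assumptions."""
    pairs = list(permutations(regions, 2))  # ordered (src, dst) pairs
    runtime_s = len(pairs) * (iperf_runtime_s + startup_s)
    data_gb = len(pairs) * iperf_runtime_s * gbps / 8  # gigabits to gigabytes
    cost_usd = data_gb * usd_per_gb
    return len(pairs), runtime_s, data_gb, cost_usd


n_pairs, runtime_s, data_gb, cost_usd = estimate(
    ["aws:us-east-1", "aws:us-west-1", "gcp:us-east1-b"], iperf_runtime_s=4
)
print(f"{n_pairs} pairs, {runtime_s}s, {data_gb:.2f}GB, ${cost_usd:.2f}")
```

For the three regions and `iperf_runtime=4` in the log above, this gives 6 pairs, 6 × (4 + 10) = 84 s, 6 × 4 s × 5 Gbps / 8 = 15 GB, and 15 × $0.10 = $1.50, matching the printed estimates.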

@parasj parasj self-assigned this Jan 17, 2022
@parasj parasj requested a review from samkumar January 17, 2022 05:10
commit a6e705b
Author: Paras Jain <[email protected]>
Date:   Mon Jan 17 05:18:17 2022 +0000

    write after each group, not at the end

commit 75624c7
Author: Paras Jain <[email protected]>
Date:   Mon Jan 17 05:08:55 2022 +0000

    Azure cleanup script to quickly deprovision resource groups

commit eefc266
Author: Paras Jain <[email protected]>
Date:   Mon Jan 17 04:56:58 2022 +0000

    Resume throughput from intermediate result

commit a5d6a4d
Author: Paras Jain <[email protected]>
Date:   Mon Jan 17 02:03:14 2022 +0000

    New throughput tool
@parasj parasj force-pushed the dev/paras/mini_throughput_grid branch from a6e705b to 5a23507 Compare January 17, 2022 05:35
@samkumar samkumar left a comment

Looks good overall.

@@ -0,0 +1,5 @@
#!/bin/bash
set -xe
az group list | jq -c ".[].name" | xargs -L 1 echo | parallel --progress -j0 az group delete --yes --no-wait --name {}

@samkumar commented:

It seems like this deletes all resource groups, not just the skylark ones, right? You may want to add a warning about this or something...

@parasj (author) replied:

Done. Added a prefix argument along with confirmation.
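
The updated script is not shown in this thread. As a rough illustration of a prefix filter combined with a confirmation prompt, here is a Python sketch. The function names, the `skylark` prefix in the usage note, and the prompt text are assumptions for the example; the `az group delete` flags are the ones from the original one-liner.

```python
import json
import subprocess


def select_groups(names, prefix):
    """Keep only resource groups whose name starts with the given prefix."""
    if not prefix:
        raise ValueError("refusing to match every resource group: prefix is empty")
    return [name for name in names if name.startswith(prefix)]


def cleanup(prefix, confirm=input, run=subprocess.run):
    """List matching resource groups, ask for confirmation, then delete them."""
    listing = run(
        ["az", "group", "list", "--query", "[].name", "-o", "json"],
        check=True, capture_output=True, text=True,
    ).stdout
    targets = select_groups(json.loads(listing), prefix)
    if not targets:
        print("No matching resource groups.")
        return []
    print("\n".join(targets))
    answer = confirm(f"Delete {len(targets)} resource group(s)? [y/N] ")
    if answer.strip().lower() != "y":
        return []
    for name in targets:
        run(["az", "group", "delete", "--yes", "--no-wait", "--name", name], check=True)
    return targets
```

Calling `cleanup("skylark")` would then touch only groups whose names begin with `skylark`, and only after the user answers `y`.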

skylark/cli/experiments/throughput.py (conversation resolved)
@parasj parasj merged commit caac38a into main Jan 17, 2022
@parasj parasj deleted the dev/paras/mini_throughput_grid branch January 17, 2022 21:15