[DNM] Try two worker processes per host #915

ntabris · 2023-07-25T18:35:43Z

I'd like to see how this impacts perf of various benchmarks.

fjetter · 2023-07-26T10:18:45Z

cluster_kwargs.yaml

@@ -15,6 +15,7 @@ default:
  package_sync: true
  wait_for_workers: true
  scheduler_vm_types: [m6i.large]
+  _n_worker_specs_per_host: 2


Is this on prod now?

IIUC, a fair assessment would require us to double the machine sizes and halve the cluster sizes. Otherwise the same-host workers have much less memory to work with and are much more likely to run into spilling and OOM.
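To make the memory argument concrete, here is a rough sketch of the per-worker arithmetic. It assumes host RAM is split evenly between co-located worker processes, and the 8 GiB and 16 GiB host sizes are illustrative values, not taken from the benchmark config. The function name is hypothetical.

```python
def memory_per_worker_gib(host_memory_gib: float, workers_per_host: int) -> float:
    """Memory available to each worker, assuming an even split of host RAM.

    This is a simplification: it ignores OS overhead and any memory
    limits configured on the workers themselves.
    """
    if workers_per_host < 1:
        raise ValueError("workers_per_host must be at least 1")
    return host_memory_gib / workers_per_host


# Illustrative host sizes (assumed, not from the benchmark setup).
one_worker_small_host = memory_per_worker_gib(8, 1)    # one worker per host
two_workers_small_host = memory_per_worker_gib(8, 2)   # same host, two workers
two_workers_double_host = memory_per_worker_gib(16, 2) # host twice as large

print(one_worker_small_host, two_workers_small_host, two_workers_double_host)
```

Under these assumptions, adding a second worker to an unchanged host halves the memory each worker can use before spilling, while doubling the host size restores the original per-worker budget.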

I kicked off an A/B test with this https://github.com/coiled/benchmarks/actions/runs/5667683049

Results of this A/B test are interesting but likely require a bit of further analysis.

We can see a couple of tests that regress significantly while others are getting much worse.

My best read, without digging deeper, is that almost all of our tests require a bit of spilling and that every test case that spills behaves really poorly, likely because the disk is just busy?

ntabris · 2023-07-26T14:57:59Z

@fjetter do you want to try a run with half the number of m6i.xlarge machines? Want me to push that change to this branch?

fjetter · 2023-07-26T16:32:53Z

@ntabris #917 is what I ran for the A/B tests. Clusters used only half the number of VMs, and each VM was twice as large and ran two workers, i.e. the same total number of CPUs and the same total RAM.
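The equivalence described above can be checked with a small sketch. The `ClusterLayout` class and the cluster size of 10 VMs are hypothetical, and the baseline worker instance type is assumed to be m6i.large (2 vCPUs, 8 GiB) with m6i.xlarge (4 vCPUs, 16 GiB) as the doubled size; the thread does not state the actual baseline worker type.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ClusterLayout:
    """Hypothetical description of a cluster, for comparison only."""
    n_vms: int
    vcpus_per_vm: int
    memory_gib_per_vm: int
    workers_per_vm: int

    @property
    def total_vcpus(self) -> int:
        return self.n_vms * self.vcpus_per_vm

    @property
    def total_memory_gib(self) -> int:
        return self.n_vms * self.memory_gib_per_vm

    @property
    def total_workers(self) -> int:
        return self.n_vms * self.workers_per_vm


# Assumed baseline: 10 small VMs with one worker each.
baseline = ClusterLayout(n_vms=10, vcpus_per_vm=2, memory_gib_per_vm=8, workers_per_vm=1)
# A/B variant: half as many VMs, each twice as large, two workers each.
variant = ClusterLayout(n_vms=5, vcpus_per_vm=4, memory_gib_per_vm=16, workers_per_vm=2)

for name, layout in [("baseline", baseline), ("variant", variant)]:
    print(name, layout.total_vcpus, layout.total_memory_gib, layout.total_workers)
```

With these numbers both layouts have the same total vCPUs, RAM, and worker count, so the main difference left between them is that pairs of workers share a host and its disk.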

Try two worker processes per host

bd266c2

fjetter reviewed Jul 26, 2023
