[PERF] Update number of cores on every iteration #1480

Merged
merged 4 commits into from
Oct 10, 2023

@jaychia jaychia commented Oct 9, 2023

Updates the number of available cores before/after every batch dispatch.

This should let us take better advantage of Ray cluster autoscaling: as the cluster scales up, we will schedule larger batches of tasks and allow more total in-flight tasks.

@xcharleslin xcharleslin left a comment

LGTM, thank you!

# This call takes about 0.3ms and hits a locally in-memory cached record of cluster resources
cores: int = int(ray.cluster_resources()["CPU"]) - self.reserved_cores
max_inflight_tasks = cores + self.max_task_backlog

while True:  # Loop: Dispatch (get tasks -> batch dispatch).
    tasks_to_dispatch: list[PartitionTask] = []

You might even want to do it here; this is where batches are dispatched up to the limit.

The GitHub UI is being unclear, but I mean the last line of that block, 458/456.

Contributor Author

Moved it into the inner loop and guarded it with a TTL.
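For readers unfamiliar with the pattern, here is a minimal sketch of a TTL-guarded refresh. It is not the code merged in this PR: the class name `TTLCachedCores`, the `fetch_cores` callable, the injectable `clock`, and the 1-second default TTL are all assumptions made for illustration.

```python
import time
from typing import Callable, Optional


class TTLCachedCores:
    """Caches a core count and re-fetches it only once the TTL has elapsed.

    Hypothetical helper for illustration; not taken from the Daft codebase.
    """

    def __init__(
        self,
        fetch_cores: Callable[[], int],
        ttl_seconds: float = 1.0,  # assumed value; the PR does not state its TTL
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch_cores = fetch_cores
        self._ttl_seconds = ttl_seconds
        self._clock = clock
        self._cached: Optional[int] = None
        self._last_refresh = 0.0

    def get(self) -> int:
        """Return the cached core count, refreshing it if it is stale."""
        now = self._clock()
        if self._cached is None or now - self._last_refresh >= self._ttl_seconds:
            self._cached = self._fetch_cores()
            self._last_refresh = now
        return self._cached


# With Ray, the fetch callable could wrap the call quoted in the diff, e.g.:
#   lambda: int(ray.cluster_resources()["CPU"]) - reserved_cores
# where `reserved_cores` stands in for the scheduler's own setting.
```

Calling `get()` inside the inner dispatch loop then stays cheap on most iterations, while the inflight-task limit still picks up newly autoscaled cores within roughly one TTL period.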

codecov bot commented Oct 10, 2023

Codecov Report

Merging #1480 (0b63fb7) into main (553a911) will increase coverage by 0.15%.
Report is 2 commits behind head on main.
The diff coverage is 100.00%.

@@            Coverage Diff             @@
##             main    #1480      +/-   ##
==========================================
+ Coverage   74.70%   74.86%   +0.15%     
==========================================
  Files          60       60              
  Lines        6061     6102      +41     
==========================================
+ Hits         4528     4568      +40     
- Misses       1533     1534       +1     
| Files | Coverage Δ | |
|---|---|---|
| daft/runners/ray_runner.py | 91.48% <100.00%> (+0.28%) | ⬆️ |

... and 2 files with indirect coverage changes

@samster25 samster25 merged commit 439f2bd into main Oct 10, 2023
24 checks passed
@samster25 samster25 deleted the jay/update-cores-scheduler branch October 10, 2023 01:01
3 participants