Round robin during spread scheduling #19968
Nice change!
pipelined_ingestion_1500_gb_15_windows succeeded without prefix hack: https://buildkite.com/ray-project/periodic-ci/builds/1902#_
How's the before/after time? And time without spread hint?
…On Thu, Dec 9, 2021, 7:18 PM Jiajun Yao ***@***.***> wrote:
pipelined_ingestion_1500_gb_15_windows succeeded without prefix hack:
https://buildkite.com/ray-project/periodic-ci/builds/1902#_
Without prefix, without spread scheduling strategy: Forever (hang)
With prefix: ~33m
With spread scheduling strategy: ~47m
@scv119 can confirm this.
Any idea why it's still 50% slower? Btw, .stats() can show you the node
distribution though it might not be complete for shuffle.
…On Thu, Dec 9, 2021, 7:46 PM Jiajun Yao ***@***.***> wrote:
Without prefix, without spread scheduling strategy: Forever (hang)
With prefix: ~33m
With spread scheduling strategy: ~47m
@scv119 <https://github.com/scv119> can confirm this.
I'm trying to figure out the difference now. This is a good validation use case for dataset stats and scheduler observability projects.
LGTM for cpp part. I will wait for Eric's approval since I didn't review the dataset part.
Only one nit comment
Datasets side looks good, a few comments about the Ray side.
Not sure if anything changed or the last run was just noise: I ran
It's ready to be merged. Perf number looks good: https://buildkite.com/ray-project/periodic-ci/builds/2086#_
Oh, I think for the latest runs, I ran the tests against master, which still had the prefix code in
This reverts commit 60388b2.
Why are these changes needed?
This PR does round robin over all the nodes instead of always starting from the beginning of the node list for spread scheduling.
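To illustrate the difference described above, here is a minimal sketch in plain Python. It is not the code from this PR (the actual change lives in Ray's C++ scheduler); the class `RoundRobinSpreadPicker`, the function `pick_from_start`, and the CPU-only resource model are all hypothetical names and simplifications chosen for illustration.

```python
from typing import Dict, List, Optional


class RoundRobinSpreadPicker:
    """Hypothetical sketch of round-robin node selection (not Ray's scheduler)."""

    def __init__(self, node_ids: List[str]) -> None:
        self._node_ids = list(node_ids)
        self._next_index = 0  # position where the next search starts

    def pick(self, available_cpus: Dict[str, float],
             required_cpus: float = 1.0) -> Optional[str]:
        """Return the next feasible node, resuming after the previous pick."""
        n = len(self._node_ids)
        for offset in range(n):
            index = (self._next_index + offset) % n
            node_id = self._node_ids[index]
            if available_cpus.get(node_id, 0.0) >= required_cpus:
                # Remember where to resume so the next call moves on
                # to the following node instead of restarting at index 0.
                self._next_index = (index + 1) % n
                return node_id
        return None  # no node currently has enough CPUs


def pick_from_start(node_ids: List[str], available_cpus: Dict[str, float],
                    required_cpus: float = 1.0) -> Optional[str]:
    """Simplified contrast case: always scan from the beginning of the list."""
    for node_id in node_ids:
        if available_cpus.get(node_id, 0.0) >= required_cpus:
            return node_id
    return None


if __name__ == "__main__":
    nodes = ["node-a", "node-b", "node-c"]
    cpus = {"node-a": 4.0, "node-b": 4.0, "node-c": 4.0}
    picker = RoundRobinSpreadPicker(nodes)
    print([picker.pick(cpus) for _ in range(4)])
    print([pick_from_start(nodes, cpus) for _ in range(4)])
```

In this simplified model, where the resource view is not updated between picks, the scan-from-start version keeps returning the first node, while the round-robin version cycles through all of them.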
Related issue number
Checks
scripts/format.sh to lint the changes in this PR.