Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add Benchmarking CI #420

Merged
merged 8 commits into from
Sep 26, 2024
Merged

Add Benchmarking CI #420

merged 8 commits into from
Sep 26, 2024

Conversation

christiangnrd
Copy link
Contributor

@christiangnrd christiangnrd commented Sep 18, 2024

Using a JuliaGPU branch instead of from my fork. See #419 for the start of this branch.

Original text:
Copying over the benchmarks from CUDA.jl.

I'm not sure if I converted them properly. Some function seem to be missing Metal implementations (like reverse).

The final (and biggest) problem is how inconsistent these results have been. Simply rerunning the benchmarks on the same code gives some huge performance differences. There is always at least one benchmark that is >20% slower or faster. The benchmarks seem to be way more consistent on the runners. Only a few seem to have big variance.

See #418 (comment)

Todo:

  • Implement actual CI benchmark solution from GemmKernels.jl
  • Ensure Benchmarks are correct

@christiangnrd
Copy link
Contributor Author

christiangnrd commented Sep 18, 2024

The culprit was whitespace... This is (one of the many reasons) why Julia > Python.

@christiangnrd christiangnrd force-pushed the cg/benchmark branch 2 times, most recently from f8c7b21 to 8b19a91 Compare September 18, 2024 16:38
@christiangnrd
Copy link
Contributor Author

@maleadt Do you know how I can set up the buildkite token on the Github Actions side?

@maleadt
Copy link
Member

maleadt commented Sep 18, 2024

I can add a token. Which permissions do you need?

@maleadt
Copy link
Member

maleadt commented Sep 18, 2024

Found https://github.com/actions-marketplace-validations/EnricoMi_download-buildkite-artifact-action?tab=readme-ov-file#configuration; I've added a token

@christiangnrd christiangnrd force-pushed the cg/benchmark branch 3 times, most recently from 721ef62 to b40d7f1 Compare September 19, 2024 12:41
@christiangnrd
Copy link
Contributor Author

christiangnrd commented Sep 19, 2024

Do we want to save the minimum, the median, or the mean?

For testing now I'll do median since that's what https://github.com/LuxDL/LuxLib.jl/pull/128/files is doing.

@christiangnrd
Copy link
Contributor Author

christiangnrd commented Sep 19, 2024

This is mostly ready. I assume benchmarks will be posted to new PRs after this is merged to master?

A few uncertainties:

  • Which summary statistic do we want to report?
  • Most benchmarks are working, but I'd like someone with a better understanding of lower-level gpu programming to make sure that I converted them from CUDA to Metal properly.
  • volumerhs isn't running, can/should that be fixed or is it not relevant for Metal?
  • Do we want to run the benchmarks even if CI fails like in CUDA.jl or should I get rid of those lines?

@christiangnrd christiangnrd marked this pull request as ready for review September 19, 2024 13:36
@christiangnrd christiangnrd marked this pull request as draft September 19, 2024 13:40
@christiangnrd christiangnrd marked this pull request as ready for review September 19, 2024 14:59
@christiangnrd christiangnrd mentioned this pull request Sep 21, 2024
4 tasks
Copy link
Member

@maleadt maleadt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess we'll only see this in action on subsequent PRs?

queue: "juliaecosystem"
os: "macos"
arch: "aarch64"
macos_version: "15.0"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do you need macOS 15 for the benchmarks?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I initially wanted to see the performance impact of logging. Seems like now it's only enabled when actually used so might not be worth it.

@christiangnrd
Copy link
Contributor Author

I guess we'll only see this in action on subsequent PRs?

Looking at the Lux PR I based this off of it seems like it.

@maleadt maleadt merged commit 8652754 into main Sep 26, 2024
2 checks passed
@maleadt maleadt deleted the cg/benchmark branch September 26, 2024 12:07
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants