
[refactor] Generalization of dual_gemm_silu_identity_mul #1141

Merged
merged 1 commit into facebookresearch:main on Nov 5, 2024

Conversation

warpuv (Contributor) commented on Nov 2, 2024

Generalization of dual_gemm_silu_identity_mul to use a custom activation function.

  • renamed dual_gemm_silu_identity_mul* to dual_gemm_lhs_activation_and_mul*
  • added a template parameter for the activation function to dual_gemm_lhs_activation_and_mul*
  • added a customizable epilogue class (EpilogueLHSActivationAndMul) with the same template parameter (a minimal sketch of the idea follows this list)
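
Roughly, the shape of the change (a minimal C++ sketch of the concept, not the actual xformers/CUTLASS code; all functor and member names other than EpilogueLHSActivationAndMul are illustrative): the dual GEMM computes D0 = X·B0 and D1 = X·B1 in one pass, and the epilogue fuses out = Activation(D0) * D1 element-wise, with Activation now a template parameter instead of a hard-coded SiLU.

```cpp
#include <cmath>

// Example activation functors; SiLU reproduces the previous hard-coded behavior.
struct SiLU {
  float operator()(float x) const { return x / (1.0f + std::exp(-x)); }
};

struct Identity {
  float operator()(float x) const { return x; }
};

// Sketch of the customizable epilogue: apply Activation to the left GEMM
// result d0, then multiply element-wise with the right GEMM result d1.
template <class Activation>
struct EpilogueLHSActivationAndMul {
  Activation act;
  float operator()(float d0, float d1) const { return act(d0) * d1; }
};
```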

What does this PR do?

Fixes #1140

Before submitting

  • Did you have fun?
    • Make sure you had fun coding 🙃
  • Did you read the contributor guideline?
  • Was this discussed/approved via a Github issue? (no need for typos, doc improvements)
    • N/A
  • Did you make sure to update the docs?
    • N/A
  • Did you write any new necessary tests?
    • N/A
  • Did you update the changelog? (if needed)
    • N/A

PR review

Anyone in the community is free to review the PR once the tests have passed.
If we didn't discuss your PR in Github issues there's a high chance it will not be merged.

… any activation function.

renamed dual_gemm_silu_identity_mul* to dual_gemm_lhs_activation_and_mul*,
added a template parameter for the activation function to it.
added a customizable epilogue class (EpilogueLHSActivationAndMul)
with the same template parameter.
@facebook-github-bot added the CLA Signed label on Nov 2, 2024
pansershrek commented on Nov 5, 2024

@danthe3rd, @lw, @zyan0, can you review this PR please? This update opens the way to implementing a fused GELU (tanh) activation for Gemma models.
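
For reference, the GELU tanh approximation used by Gemma is gelu(x) = 0.5·x·(1 + tanh(√(2/π)·(x + 0.044715·x³))). A sketch of a functor that could be plugged into the new activation template parameter (the GELUTanh name is illustrative, not part of this PR):

```cpp
#include <cmath>

// GELU, tanh approximation:
//   gelu(x) = 0.5 * x * (1 + tanh(sqrt(2/pi) * (x + 0.044715 * x^3)))
struct GELUTanh {
  float operator()(float x) const {
    const float k = 0.7978845608f;  // sqrt(2 / pi)
    return 0.5f * x * (1.0f + std::tanh(k * (x + 0.044715f * x * x * x)));
  }
};
```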

danthe3rd (Contributor) commented

Hi @pansershrek and @warpuv
Thanks for opening this PR. In principle I'm happy to accept this line of contributions; however, you should be aware that:

  • this implementation is efficient only on Ampere GPUs (it will be super slow on Hopper)
  • moving forward, the easiest way to do this would be to interleave the B0 and B1 columns, so that the output of the GEMM is also interleaved and the activation can still be fused in the epilogue in the same way, while the rest of the GEMM is untouched (see the sketch below)
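
To illustrate the interleaving idea (a host-side sketch under the assumption that B0 and B1 are row-major k×n matrices; this is not xformers code, just the layout trick): if the columns of B0 and B1 alternate in a single k×2n weight matrix, one ordinary GEMM D = X·B produces an output where each logical output column's pair (d0, d1) sits in adjacent columns, so the epilogue can compute act(d0) * d1 locally without a second GEMM.

```cpp
#include <vector>

// Interleave the columns of B0 and B1 (both k x n, row-major) into a
// single k x 2n matrix whose even columns come from B0 and odd columns
// from B1. After D = X @ B, a fused epilogue would emit
// act(D[:, 2c]) * D[:, 2c+1] for each logical output column c.
std::vector<float> interleave_columns(const std::vector<float>& B0,
                                      const std::vector<float>& B1,
                                      int k, int n) {
  std::vector<float> B(static_cast<size_t>(k) * 2 * n);
  for (int r = 0; r < k; ++r) {
    for (int c = 0; c < n; ++c) {
      B[static_cast<size_t>(r) * 2 * n + 2 * c]     = B0[static_cast<size_t>(r) * n + c];
      B[static_cast<size_t>(r) * 2 * n + 2 * c + 1] = B1[static_cast<size_t>(r) * n + c];
    }
  }
  return B;
}
```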

cc @tridao

@danthe3rd merged commit 3296c2b into facebookresearch:main on Nov 5, 2024
8 of 9 checks passed
pansershrek commented

Hi @danthe3rd!
Thank you for your reply. Could you explain your advice about the B0 and B1 columns in more detail? We don't understand the architectural differences as well as you do :)
