[SYCL] Implement Flash attention. #7141

Closed
qnixsynapse opened this issue May 8, 2024 · 5 comments
Labels: enhancement (New feature or request), stale

Comments

@qnixsynapse
Contributor

Flash attention is currently available in the CUDA and Metal backends (see #5021).

From the paper: Flash attention is an IO-aware exact attention algorithm that uses tiling to reduce the number of memory reads/writes between GPU high bandwidth memory (HBM) and GPU on-chip SRAM. [...] it requires fewer HBM accesses than standard attention, and is optimal for a range of SRAM sizes. [...]
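To make the tiling idea concrete, here is a minimal sketch in plain Python of attention for a single query vector, computed one key/value block at a time with a running (online) softmax. It is only an illustration of the technique and not the CUDA, Metal, or any SYCL kernel: the function names `naive_attention` and `tiled_attention` and the `block_size` parameter are made up for this example, and a real kernel would keep the block in on-chip memory rather than in Python lists.

```python
import math


def naive_attention(q, K, V):
    """Reference: materialise every score, then softmax, then weighted sum of V."""
    scale = 1.0 / math.sqrt(len(q))
    scores = [scale * sum(a * b for a, b in zip(q, k)) for k in K]
    m = max(scores)
    weights = [math.exp(s - m) for s in scores]
    total = sum(weights)
    return [sum(w * v[j] for w, v in zip(weights, V)) / total
            for j in range(len(V[0]))]


def tiled_attention(q, K, V, block_size=2):
    """Same result, but K/V are visited block by block.

    Only a running max, a running softmax denominator, and an output
    accumulator are kept, so the full score vector is never stored.
    """
    scale = 1.0 / math.sqrt(len(q))
    d_v = len(V[0])
    running_max = float("-inf")
    running_sum = 0.0
    acc = [0.0] * d_v

    for start in range(0, len(K), block_size):
        k_blk = K[start:start + block_size]
        v_blk = V[start:start + block_size]

        scores = [scale * sum(a * b for a, b in zip(q, k)) for k in k_blk]
        new_max = max(running_max, max(scores))

        # Rescale what was accumulated under the old max (0.0 on the first block).
        correction = math.exp(running_max - new_max)
        weights = [math.exp(s - new_max) for s in scores]

        running_sum = running_sum * correction + sum(weights)
        acc = [a * correction + sum(w * v[j] for w, v in zip(weights, v_blk))
               for j, a in enumerate(acc)]
        running_max = new_max

    return [a / running_sum for a in acc]


if __name__ == "__main__":
    q = [0.5, -1.0, 2.0]
    K = [[1.0, 0.0, 0.5], [0.2, -0.3, 1.5], [-1.0, 2.0, 0.0], [0.7, 0.7, 0.7]]
    V = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0]]
    ref = naive_attention(q, K, V)
    out = tiled_attention(q, K, V, block_size=2)
    print("max abs difference:", max(abs(a - b) for a, b in zip(ref, out)))
```

The result is the same as standard attention up to floating-point rounding, which is what "exact" means in the quote; the saving comes from never writing the full score matrix out to slow memory.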

The open question is whether dedicated Intel GPUs can benefit from it; it would be interesting to see how much performance improves.

@qnixsynapse qnixsynapse added the enhancement New feature or request label May 8, 2024
@NeoZhangJianyu
Collaborator

@qnixsynapse
Yes, Intel GPUs will benefit from it.
This has been verified in other AI frameworks.
We could consider implementing it in the SYCL backend next.

@qnixsynapse
Contributor Author

@NeoZhangJianyu Nice. Thank you!

@github-actions github-actions bot added the stale label Jun 9, 2024
@qnixsynapse
Contributor Author

This issue has been tagged with the "stale" label. I am currently studying SYCL and C++ and waiting for the major SYCL refactoring, which should make the code more readable and make it easier for me to (eventually) implement a flash attention kernel if needed.

Commenting here to make this issue active again.

@slaren slaren removed the stale label Jun 16, 2024
@github-actions github-actions bot added the stale label Jul 18, 2024
Contributor

github-actions bot commented Aug 1, 2024

This issue was closed because it has been inactive for 14 days since being marked as stale.

@github-actions github-actions bot closed this as completed Aug 1, 2024
@piDack
Contributor

piDack commented Sep 16, 2024

Any progress?
