Add XeTLA FA backward implementation to benchmark #2367
Conversation
Can we run it on CI and check whether the result can be compared with torch's?
Co-authored-by: Anatoly Myachev <[email protected]>
The CI tutorial doesn't seem ready for backward yet (including torch's implementation).
-set(XETLA_KERNEL_FLAGS ${XETLA_KERNEL_FLAGS} -fsycl)
+set(XETLA_KERNEL_FLAGS ${XETLA_KERNEL_FLAGS}
+    -fsycl
+    -fsycl-device-code-split=per_kernel)
What does it change?
This change adds an additional flag that splits the SYCL device code into one device image per kernel, which resolves the RuntimeError below.
No perf regression for XeTLA in my local environment.
RuntimeError: The program was built for 1 devices
Build program log for 'Intel(R) Data Center GPU Max 1100':
-11 (PI_ERROR_BUILD_PROGRAM_FAILURE)
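For context, here is a minimal CMake sketch of how the split flag might be wired into a kernel target. The target name xetla_kernel and the source file are assumptions for illustration only, not the benchmark's actual build setup.

# Sketch only: hypothetical target and source; the real configuration lives in
# the benchmark's CMake files.
set(XETLA_KERNEL_FLAGS ${XETLA_KERNEL_FLAGS}
    -fsycl
    -fsycl-device-code-split=per_kernel)   # emit one device image per SYCL kernel

add_library(xetla_kernel SHARED flash_attention_bwd.cpp)   # hypothetical source file
target_compile_options(xetla_kernel PRIVATE ${XETLA_KERNEL_FLAGS})
target_link_options(xetla_kernel PRIVATE ${XETLA_KERNEL_FLAGS})

With per_kernel splitting, a build problem in one kernel no longer fails the single monolithic device image for the whole program, which is consistent with the build-failure log above.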
The implementation comes from IPEX.
https://github.com/intel/intel-xpu-backend-for-triton/actions/runs/11089057652