Commit
* add support for attn mask
* add mask operation
* add mask operation
* add mask operation
* add interface
* add mask support
* add mask support
* fix up
* add bias
* add template
* add test
* clean
* clean code
* add mask load
* add mask test
* fix forward bugs
* add test
* add mask in backward
* add test case
* add bias
* add mask
* add bias test
* fix test case
* add without mask test
* add kernel test
* add ds save
* fix interface
* add test
* fix dbias
* add bias support
* add mask shape
* add test
* add support
* fix bf16 and mask shape
* fix mask head=1 shape
* add dump
* to fix len 512
* add test
* fix seqlen greater than 256
* fix bias seqlen
* add constexpr
* add const expr for bwd
* add benchmark
* add test tools
* add script
* add cross attention
* add cross attn
* fix bugs
* remove test tools
* clean fmha_api.cpp
* clean fmha_dgrad_fp16_kernel_loop.sm80.cu
* clean fmha_dgrad_kernel_1xN_loop.h
* clean fmha_fprop_fp16_kernel.sm80.cu
* clean fmha_fprop_kernel_1xN.h
* clean gmem_tile.h
* clean softmax.h
* restore test_flash_attn.py
* clean gmem_tile.h
* fix fmha_fprop_kernel_1xN.h
* fix fmha_dgrad_kernel_1xN_loop.h
* rename has_attn to has_attn_mask, has_bias to has_attn_bias
* fix fmha_fprop_kernel_1xN.h
* rename has_attn to has_attn_mask, has_bias to has_attn_bias
* remove useless benchmark code
* add declaration
* remove useless comments
* remove useless comments
* add timeout
* add default timeout for build wheel
* remove timeout
* reduce build worker for workflow oom