[onert] Optimize BatchMatMul kernel in cpu backend #12140
Comments
I'm interested in this task.
The immediate background is supporting an NMT model. Beyond that, I think this task will also be useful in the future.
There is a long story behind the detailed context. In short, this is part of the work to support a transformer-based model (in this case, Machine Translation). For reference, please see link.
@tomdol continuation from #14305 (comment)
We need optimized batch matmul for arm32.
For transposed batch matmul, I confirmed it works and is enough using
For normal batch matmul like torch.bmm, which has
You may bring from other open source kernels (
We have an alternative option (e.g. insert
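To illustrate the distinction the comment above draws between the two layouts, here is a minimal sketch (not the actual onert cpu-backend kernel; the function names and shapes are hypothetical). With torch.bmm semantics the inner loop strides down a column of the RHS, while a transposed RHS lets both operands be read contiguously, which is one likely reason the transposed case is already fast enough:

```cpp
#include <cstddef>

// Sketch of torch.bmm-style semantics: C[b] = A[b] * B[b],
// with A: [batch, M, K], B: [batch, K, N], all row-major.
// Note the stride-N access to B in the inner loop (cache-unfriendly).
void bmm_naive(const float* A, const float* B, float* C,
               std::size_t batch, std::size_t M, std::size_t K, std::size_t N) {
  for (std::size_t b = 0; b < batch; ++b)
    for (std::size_t i = 0; i < M; ++i)
      for (std::size_t j = 0; j < N; ++j) {
        float acc = 0.f;
        for (std::size_t k = 0; k < K; ++k)
          acc += A[(b * M + i) * K + k] * B[(b * K + k) * N + j];
        C[(b * M + i) * N + j] = acc;
      }
}

// Variant where the RHS is stored transposed as Bt: [batch, N, K].
// Both operands are now walked contiguously in the inner loop.
void bmm_rhs_transposed(const float* A, const float* Bt, float* C,
                        std::size_t batch, std::size_t M, std::size_t K,
                        std::size_t N) {
  for (std::size_t b = 0; b < batch; ++b)
    for (std::size_t i = 0; i < M; ++i)
      for (std::size_t j = 0; j < N; ++j) {
        float acc = 0.f;
        for (std::size_t k = 0; k < K; ++k)
          acc += A[(b * M + i) * K + k] * Bt[(b * N + j) * K + k];
        C[(b * M + i) * N + j] = acc;
      }
}
```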
Let's optimize the BatchMatMul kernel in the cpu backend.
Currently, the BatchMatMul kernel in the cpu backend is not optimized.
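As a starting point, one common scalar-level optimization for the normal (non-transposed) layout is to reorder the loops to i-k-j so the inner loop streams through contiguous rows of B and C instead of striding down a column of B. This is only a sketch under assumed row-major [batch, M, K] x [batch, K, N] shapes, not the onert kernel itself:

```cpp
#include <cstddef>

// Loop-reordered (i-k-j) batch matmul sketch: the inner j-loop touches
// B and C contiguously, which is cache-friendly and easy for the
// compiler to auto-vectorize. A real optimized kernel would add
// blocking/tiling and NEON intrinsics on top of this.
void bmm_ikj(const float* A, const float* B, float* C,
             std::size_t batch, std::size_t M, std::size_t K, std::size_t N) {
  for (std::size_t b = 0; b < batch; ++b)
    for (std::size_t i = 0; i < M; ++i) {
      float* Crow = C + (b * M + i) * N;
      for (std::size_t j = 0; j < N; ++j) Crow[j] = 0.f;
      for (std::size_t k = 0; k < K; ++k) {
        const float a = A[(b * M + i) * K + k];
        const float* Brow = B + (b * K + k) * N;
        for (std::size_t j = 0; j < N; ++j)
          Crow[j] += a * Brow[j];
      }
    }
}
```

The same access pattern also parallelizes naturally over the batch dimension, which matters for the transformer workloads mentioned above.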