
Toolchain support for multiple memories #45

Open
titzer opened this issue May 3, 2023 · 31 comments

@titzer
Contributor

titzer commented May 3, 2023

Does anyone know the current status of multi-memory support in toolchains, e.g. LLVM? After a cursory search of LLVM commits, I didn't turn up anything.

@dschuff
Member

dschuff commented May 3, 2023

Support was recently added to Binaryen, but hasn't been added to LLVM yet.

@chenzhuofu

chenzhuofu commented May 7, 2023

I'm curious if there is any example of "high level language being compiled to WebAssembly with multi-memory support"?
I searched for a long time but still haven't found any. :(

@penzn
Contributor

penzn commented May 8, 2023

(sorry, misread the above comment)

Support was recently added to Binaryen, but hasn't been added to LLVM yet.

Is anybody working on LLVM support? LLVM IR has address spaces, which can be used to represent multiple memories.

@tlively
Member

tlively commented May 8, 2023

No, nobody is currently working on multimemory in LLVM, although Igalia's work adding support for tables is very similar to what would need to happen to support multiple memories as well.

@dschuff
Member

dschuff commented May 8, 2023

I'm curious if there is any example of "high level language being compiled to WebAssembly with multi-memory support"? I searched for a long time but still haven't found any. :(

I don't know of any either, probably because of the current lack of implementations. Hopefully we will soon break the chicken-and-egg problem. Did you have a particular thing you wanted to learn from an example?

@chenzhuofu

I'm curious if there is any example of "high level language being compiled to WebAssembly with multi-memory support"? I searched for a long time but still haven't found any. :(

I don't know of any either, probably because of the current lack of implementations. Hopefully we will soon break the chicken-and-egg problem. Did you have a particular thing you wanted to learn from an example?

I have come up with a technique leveraging WebAssembly's multi-memory support and now need some test cases.
So I have a high-level application that I want to rewrite to work with multiple memories, then compile to Wasm.

That high-level coding approach is what I wanted to learn.

@penzn
Contributor

penzn commented May 8, 2023

Igalia's work adding support for tables is very similar to what would need to happen to support multiple memories as well.

What does this work look like? LLVM has the addrspace attribute, which in some cases has exactly the same meaning as multiple memories (think OpenCL before version 2), though I've also read about supporting GC objects using it.

@titzer
Contributor Author

titzer commented May 8, 2023

One of the more compelling use cases I've stumbled on is virtualizing interfaces that use memory. E.g. implementing a Wasm module that has an imported memory from the "user", which it may read and/or write, and then a private memory that is used to store additional internal state and possibly communicate with other modules.

AFAICT it would be possible to write such a module in C with address space annotations.
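A minimal C sketch of that shape, assuming Clang's address_space extension were wired up to multiple memories. The mapping of address space 1 to the imported "user" memory is hypothetical, and the Wasm backend does not lower this today; the `__wasm__` guard just keeps the file buildable for other targets, where the qualifier expands to nothing.

```c
#include <stddef.h>

/* Hypothetical mapping: address space 1 = the imported "user" memory. */
#if defined(__wasm__)
#define USER_MEM __attribute__((address_space(1)))
#else
#define USER_MEM
#endif

static long call_count;  /* internal state, lives in the private memory */

/* Read from the user's memory; keep bookkeeping in private memory. */
long checksum_user(USER_MEM const unsigned char *buf, size_t len) {
    long sum = 0;
    for (size_t i = 0; i < len; i++)
        sum += buf[i];
    call_count++;
    return sum;
}
```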

@chenzhuofu

chenzhuofu commented May 8, 2023

Yes, that's what I want to learn.
How do these address space annotations work?

One of the more compelling use cases I've stumbled on is virtualizing interfaces that use memory. E.g. implementing a Wasm module that has an imported memory from the "user", which it may read and/or write, and then a private memory that is used to store additional internal state and possibly communicate with other modules.

AFAICT It would be possible to write such a module in C with address space annotations.

@penzn
Contributor

penzn commented May 8, 2023

How do these address space annotations work?

With Clang and C/C++ it is __attribute__((address_space(N))) before the type, though the N for the purposes of multiple memories needs to be a constant.

Example:

int incr_from_mem3(__attribute__((address_space(3))) int * ptr) {
  return (*ptr) + 1;
}

(Edit) Even though this would lead to addrspace in the LLVM IR, the Wasm backend would quietly ignore it at the moment, though it should not be too hard to enable.

@chenzhuofu

How do these address space annotations work?

With Clang and C/C++ it is __attribute__((address_space(N))) before the type, though the N for the purposes of multiple memories needs to be a constant.

Example:

int incr_from_mem3(__attribute__((address_space(3))) int * ptr) {
  return (*ptr) + 1;
}

(Edit) Even though this would lead to addrspace in the LLVM IR, the Wasm backend would quietly ignore it at the moment, though it should not be too hard to enable.

I see, thanks for explanation.

@tlively
Member

tlively commented May 8, 2023

Since address spaces need to be statically allocated by the LLVM backend for WebAssembly, it would not be scalable to try to use them to support multiple memories directly. Tables are modeled in LLVM IR as global arrays in a special address space so that an arbitrary number of them may be created. The Wasm object file format used with LLVM was also extended with additional relocation types for tables. The same patterns would work well for modeling multi-memory as well.

@dschuff
Member

dschuff commented May 8, 2023

I actually find that take somewhat surprising; given that address spaces also need to be statically allocated in the wasm module, requiring the same static allocation at the LLVM IR level seems like it should scale exactly as well in LLVM as it would in wasm itself? Tables are different in the sense that there's not really any obvious analog in the IR already (not just for tables, but also for the references they contain).

@penzn
Contributor

penzn commented May 9, 2023

I am going to second what @dschuff said: aren't memories statically declared? Why would they need the same dynamic treatment tables get?

@tlively
Member

tlively commented May 9, 2023

By "statically allocated in the backend," I mean statically allocated when LLVM is compiled, not when the user program is compiled. So if you had a 1:1 mapping between address spaces and memories, then when you compile LLVM, you would have to determine what the maximum number of memories an LLVM IR module could reference at that point. In contrast, the scheme used for tables allows user programs to use an arbitrary number of tables.

@titzer
Contributor Author

titzer commented May 9, 2023

Is this discussion just about the LLVM internal representation? At the C or C++ level, would these still be address space annotations on pointer types?

@penzn
Contributor

penzn commented May 9, 2023

So if you had a 1:1 mapping between address spaces and memories, then when you compile LLVM, you would have to determine what the maximum number of memories an LLVM IR module could reference at that point.

There is a hard limit on the number of memories; the memory index is one byte, I think.

@tlively
Member

tlively commented May 9, 2023

Is this discussion just about the LLVM internal representation? At the C or C++ level, would these still be address space annotations on pointer types?

At the C or C++ level these would most likely be new annotations like __attribute__((wasm_memory)), since clang would also have to check a bunch of semantic restrictions (such as ensuring that the arrays are not address-taken) just like it does for tables.

There is a hard limit on the number of memories; the memory index is one byte, I think.

No, just like all other indices in Wasm, memory indices are LEB128 values.
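For reference, unsigned LEB128 is a simple base-128 varint: seven payload bits per byte, with the high bit marking continuation. A sketch of an encoder (the function name is mine):

```c
#include <stdint.h>
#include <stddef.h>

/* Encode value as unsigned LEB128 into out; returns the byte count.
   Wasm uses this variable-length format for indices, which is why
   memory indices are not limited to a single byte. */
size_t encode_uleb128(uint64_t value, uint8_t *out) {
    size_t n = 0;
    do {
        uint8_t byte = value & 0x7f;
        value >>= 7;
        if (value != 0)
            byte |= 0x80;  /* set the continuation bit */
        out[n++] = byte;
    } while (value != 0);
    return n;
}
```

An index below 128 still encodes in a single byte (3 → 0x03), while larger indices simply grow: 624485 encodes as 0xE5 0x8E 0x26.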

@titzer
Contributor Author

titzer commented May 9, 2023

At the C or C++ level these would most likely be new annotations like __attribute__((wasm_memory)), since clang would also have to check a bunch of semantic restrictions (such as ensuring that the arrays are not address-taken) just like it does for tables.

Oh, so you mean they would be globally-declared (non-address taken) arrays into which the program would index with integers?

@tlively
Member

tlively commented May 9, 2023

Yes, exactly.
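A sketch of what such source code might look like. The WASM_MEMORY macro stands in for the hypothetical __attribute__((wasm_memory)) discussed above, which does not exist in Clang today; here it expands to nothing so the sketch stays compilable.

```c
/* Stand-in for the hypothetical attribute; no such attribute exists. */
#define WASM_MEMORY /* __attribute__((wasm_memory)) */

/* Would lower to its own linear memory. */
WASM_MEMORY unsigned char scratch[65536];

/* All access goes through integer indices into the array... */
int peek(int idx) { return scratch[idx]; }
void poke(int idx, int v) { scratch[idx] = (unsigned char)v; }
/* ...and taking an address such as &scratch[idx] would be rejected. */
```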

@yamt
Contributor

yamt commented Jun 20, 2024

Is this discussion is just about the LLVM internal representation? At the C or C++ level these would still be address space annotations on pointer types?

At the C or C++ level these would most likely be new annotations like __attribute__((wasm_memory)), since clang would also have to check a bunch of semantic restrictions (such as ensuring that the arrays are not address-taken) just like it does for tables.

"Not address-taken" sounds like a very severe restriction for memory,
since C/C++ applications usually access memory via pointers.
I suspect it's worse than having a static limit on the number of memories.
Am I missing something?

@tlively
Member

tlively commented Jun 20, 2024

It's definitely a severe restriction compared to what you can do with other constructs in C/C++, but that's ok because a program would only need to use this feature to do something very specific to WebAssembly, and in that case having the source language construct match the underlying construct as closely as possible is a good thing.

@yamt
Contributor

yamt commented Sep 4, 2024

It's definitely a severe restriction compared to what you can do with other constructs in C/C++, but that's ok because a program would only need to use this feature to do something very specific to WebAssembly, and in that case having the source language construct match the underlying construct as closely as possible is a good thing.

Where did your assumption that "a program would only need to use this feature to do something very specific to WebAssembly" come from?
I feel it's false in general, as I've heard people wanting to be able to "just" annotate and re-compile their existing libraries to make them operate on non-default memory addresses.

@tlively
Member

tlively commented Sep 4, 2024

I've heard people wanting to be able to "just" annotate and re-compile their existing libraries to make them operate on non-default memory addresses.

But that’s not something you can do in portable C/C++, so it only makes sense to expect to be able to do that if you’re targeting WebAssembly (or some other specific platform that could provide similar functionality).

@yamt
Contributor

yamt commented Sep 12, 2024

I've heard people wanting to be able to "just" annotate and re-compile their existing libraries to make them operate on non-default memory addresses.

But that’s not something you can do in portable C/C++, so it only makes sense to expect to be able to do that if you’re targeting WebAssembly (or some other specific platform that could provide similar functionality).

Right.
My point is that there could be a middle ground more convenient to users than extremes like "portable C" and table-like accessors.

@tlively
Member

tlively commented Sep 12, 2024

To turn this around, what would it even mean to take the address of something that lowers to a WebAssembly memory? How do you envision the address would be represented and how do you envision it could be used?

I've heard people wanting to be able to "just" annotate and re-compile their existing libraries to make them operate on non-default memory addresses.

In this case you can just compile the library normally to use a single memory, then use something like wasm-merge to merge it into the rest of the application, which would have a different memory. If you need to copy data from one memory to the other on the boundary, you could use the __attribute__((wasm_memory)) feature described above to write glue code that does the copy.
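The boundary glue could look roughly like this, again using a stand-in macro for the hypothetical wasm_memory attribute (no such attribute exists today; the macro erases it so the sketch compiles). A plain indexed loop is the whole trick: one index walks each memory.

```c
#include <stddef.h>

#define LIB_MEMORY /* hypothetically __attribute__((wasm_memory)) */

/* The merged library's linear memory, as a non-address-taken array. */
LIB_MEMORY unsigned char lib_heap[65536];

/* Copy n bytes out of the library's memory into a caller buffer. */
void copy_from_lib(unsigned char *dst, size_t lib_off, size_t n) {
    for (size_t i = 0; i < n; i++)
        dst[i] = lib_heap[lib_off + i];
}
```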

@yamt
Contributor

yamt commented Sep 12, 2024

To turn this around, what would it even mean to take the address of something that lowers to a WebAssembly memory? How do you envision the address would be represented and how do you envision it could be used?

I don't pretend to have a clear vision!

Although I'm not an LLVM expert, address spaces seem like the closest construct LLVM currently has.
As you said, that has its drawbacks though.

I've heard people wanting to be able to "just" annotate and re-compile their existing libraries to make them operate on non-default memory addresses.

In this case you can just compile the library normally to use a single memory, then use something like wasm-merge to merge it into the rest of the application, which would have a different memory. If you need to copy data from one memory to the other on the boundary, you could use the __attribute__((wasm_memory)) feature described above to write glue code that does the copy.

Maybe.
I suspect users likely want the library to place some of its data (e.g. the C stack) in the default memory, though.

@lum1n0us

👋 Is the toolchain support work still in progress? It appears to be non-functional at the moment. https://github.com/search?q=repo%3Allvm%2Fllvm-project+multimemory&type=code

@dschuff
Member

dschuff commented Sep 25, 2024

I don't think anyone is working on it right now.
As you can see from the discussion above, it's not yet really clear/agreed about exactly how this should work, either in LLVM IR or in C/C++. I think if someone wanted to push it forward, the right way would be to collect the relevant use cases (from the user and source language perspective), and then we can talk about what language primitives would support those, and then the LLVM design required to implement that would be more clear.

@lum1n0us

lum1n0us commented Oct 8, 2024

We have a requirement for an embedded product that needs support for multiple memories. This product has two types of memory: a general memory and an on-chip memory. The code, which will be compiled into WASM from C, should be able to access variables in both types of memories.

__attribute__((address_space(4))) int blank_array[1024];
void fill_array(__attribute__((address_space(4))) int* x, int n, int v) {
  for(int i = 0; i < n; i++) {
    x[i] = v;
  }
}

blank_array is a buffer in the on-chip memory. fill_array() uses the value v from general memory to fill the buffer.


Going deeper, variables in different memories

__attribute__((address_space(4))) f32_t gvInA[LENGTH];
__attribute__((address_space(4))) f32_t gvInB[LENGTH];
f32_t gvInC[LENGTH];
f32_t gvInD[LENGTH];

gvInA[0] = 0.3;
gvInB[0] = 0.4;
gvInC[1] = 0.1;
gvInD[2] = 0.2;

are marked with addrspace (or plain local_unnamed_addr) in the IR:

@gvInA = addrspace(4) global [2048 x float] zeroinitializer, align 4
@gvInB = addrspace(4) global [2048 x float] zeroinitializer, align 4
@gvInC = local_unnamed_addr global [2048 x float] zeroinitializer, align 4
@gvInD = local_unnamed_addr global [2048 x float] zeroinitializer, align 4

store float 0x3FD3333340000000, ptr addrspace(4) @gvInA, align 4, !tbaa !3
store float 0x3FD99999A0000000, ptr addrspace(4) @gvInB, align 4, !tbaa !3
store float 0x3FB99999A0000000, ptr getelementptr inbounds ([2048 x float], ptr @gvInC, i32 0, i32 1), align 4, !tbaa !3
store float 0x3FC99999A0000000, ptr getelementptr inbounds ([2048 x float], ptr @gvInD, i32 0, i32 2), align 4, !tbaa !3

Eventually, they end up in different sections:

.globl gvInA
.size gvInA, 8192
.type gvInA,@object
.globl gvInB
.size gvInB, 8192
.type gvInB,@object
.size .Lstr, 11
.type .Lstr,@object
.size .L.str, 19
.type .L.str,@object
.size .L.str.1, 21
.type .L.str.1,@object
.globl gvInC
.size gvInC, 8192
.type gvInC,@object
.globl gvInD
.size gvInD, 8192
.type gvInD,@object

.section .vecmem_data,"aw",@progbits
.align 4
gvInA:                                  ; @0x0
.skip 8192
.align 4
gvInB:                                  ; @0x2000
.skip 8192
.section .rodata_in_data,"aw",@progbits
.align 4
.Lstr:                                  ; @0x0
.asciz "Test ended"
.align 4
.L.str:                                 ; @0xc
.asciz "Sample Length: %d\n"
.align 4
.L.str.1:                               ; @0x20
.asciz "Measured SNR: %f dB\n"
.section .bss,"aw",@nobits
.align 4
gvInC:                                  ; @0x0
.skip 8192
.align 4
gvInD:                                  ; @0x2000
.skip 8192

mov_s %r19,gvInA              ; @0x1c6
asl %r16,%r16,10            ; @0x1cc
mov_s %r15,%r19               ; @0x1d0
mov_s %r0,gvInD+8             ; @0x1d2
mov_s %r1,gvInC+4             ; @0x1d8
mov %r13,1024               ; @0x1de
mov_s %r17,0x3e100000@u32     ; @0x1e2
st 0x3e99999a@u32,[%r19,0] ; @0x1e8
st 0x3e4ccccd@u32,[%r0,0]  ; @0x1f0
st 0x3dcccccd@u32,[%r1,0]  ; @0x1f8

@dschuff
Member

dschuff commented Oct 11, 2024

I thought about this a little more.
The mockup that @lum1n0us posted looks basically like what I would expect if we were going to line up the existing C and LLVM address space support to wasm in the most straightforward way, and it sounds like this is more or less the same use cases that @titzer and @penzn mentioned.
@tlively pointed out that it has a limitation; namely that the number of possible memories that a module can use is limited by how we allocate the "space" of LLVM address spaces (i.e. how many address space numbers in LLVM IR we set aside for this). And he suggested that we take an approach more like what @pmatos did for wasm tables; i.e. instead of using regular pointers with load and store instructions, table operations use builtins and intrinsics for table.get and table.set.
The more I think about this idea, the less it makes sense to me.

  1. Unlike tables and references (which are very weird from the C and LLVM perspective), pointers into non-default address spaces behave very similarly to regular pointers (in terms of arithmetic, dereferencing, etc) and are well-modeled by LLVM, including attributes and variants such as volatile and atomic. If we reinvent pointer operations via intrinsics we will have to reinvent all of that (a much larger space of operations than we have for tables and references). IIRC the original attempt at tables and references hoped to model them as loads and stores and then figure out via analysis which ones needed to turn into table.get/set in the backend, but had to abandon that due to the difficulty of figuring that out because of all the optimizations that did things to the pointers. In this case, all that logic already in LLVM works for us rather than against us.
  2. The suggested use case of using wasm-merge to smash modules together doesn't work for all use cases. e.g. you might want to have a module's pointers default to memory 2, but still put its allocas in memory 1 (this is probably what you'd want for something like @lum1n0us's use case of a secondary device memory, if you wanted to compile an entire source file to operate on such memory). Binaryen may not be able to tell after link time which pointers are stack pointers, but LLVM seems to have builtin support for this separation. (Of course for other use cases wasm-merge will work fine either way)
  3. I think we can mitigate the limitation on the number of memories in use by making the memory index field relocatable. Then we can have the address spaces be symbolic rather than numeric, and the linker can resolve an unlimited number of them at link time. Then even though there will still be a limitation on the number of spaces, it will only apply to an object file rather than a linked binary. gcc even has named address spaces in addition to numbered, so we could copy that and reflect this model directly into the source.
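GCC's named address spaces (e.g. __flash on AVR) suggest one possible surface syntax for point 3. A purely hypothetical spelling for Wasm follows; the __wasm_mem qualifier does not exist anywhere, and the macro below erases it so the sketch compiles. The idea is that the symbolic name would be resolved by the linker to a relocatable memory index.

```c
/* Hypothetical named-address-space qualifier; expands to nothing
   today. A real implementation would turn the symbolic name into a
   relocatable memory index resolved at link time. */
#define __wasm_mem(name)

__wasm_mem(device) int frame_buffer[4096];

int read_pixel(int i) { return frame_buffer[i]; }
```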
