Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

What should the memory layout look like for wasm modules? #81

Closed
alexcrichton opened this issue Mar 8, 2018 · 29 comments
Closed

What should the memory layout look like for wasm modules? #81

alexcrichton opened this issue Mar 8, 2018 · 29 comments

Comments

@alexcrichton
Copy link
Contributor

alexcrichton commented Mar 8, 2018

Currently wasm's linker, LLD, largely decides the memory layout for wasm modules and how items interact in the module's memory address space. We should either (a) decide that LLD's layout is ok or (b) figure out a different layout for ourselves and work with that!

Currently with LLD it looks like:

  • Data by default starts at address 1024
  • The stack starts one page (64k) above the end of data and grows down

That's quite a small stack! It looks like LLD has options to change the start of data --global-base but I don't see an option to change the stack. That probably needs to change! - turns out -z stack-size=1024

@Pauan
Copy link

Pauan commented Mar 8, 2018

Does LLD dynamically grow the stack at runtime? That might be why it's so small.

@alexcrichton
Copy link
Contributor Author

AFAIK, no, it'll just run into global data and then underflow (causing weird corruption errors most likely!)

cc @CryZe (from our discussion today)

@CryZe
Copy link

CryZe commented Mar 8, 2018

The stack-size flag is supposedly just -z stack-size according to the source.

And no, it doesn't grow the stack from what I've seen. That wouldn't really work anyway, as even if it would somehow move the stack over, it would run into dlmalloc's memory, so that would be very bad.

So yeah on stack overflow it just corrupts all of the data sections.

@Diggsey
Copy link

Diggsey commented Mar 8, 2018

Would it make more sense to put the stack at the bottom? That way if it overflows it will actually error (throw an exception in javascript's case) rather than corrupt stuff?

@CryZe
Copy link

CryZe commented Mar 8, 2018

Other things I've noticed while working on the wasm-to-rust compiler:

  • All of the addresses from 0 to 1024 are completely unused, as Alex mentioned in the OP. That seems odd, as there's no protection against accesses to a null ptr (and short offsets) in the first place, so why waste 1 KiB of data.
  • I wonder if we could make the stack size automatically fill up the page it is on. With the 1 KiB wasted + the 64 KiB default stack, we automatically need at least 2 WASM pages. However dlmalloc automatically assumes that it can't use any memory of the second page that the stack barely uses. Therefore the 63 KiB of the second wasm page here is entirely wasted. So I feel like the stack base should automatically round up to the end of that page possibly.

@CryZe
Copy link

CryZe commented Mar 8, 2018

@Diggsey yeah honestly that seems like a much better idea. I'm not sure how much of all of this is in our hands though.

@pepyakin
Copy link
Member

pepyakin commented Mar 8, 2018

I think, we don't have to have a 1024 bytes of memory just wasted for something. I think it has Emscripten roots.

Placing the stack at the bottom seems like a good idea. But when the last time I thought about it I realized that it would be a tradeoff.

I think the rationale of placing the data section is to make accesses to it more cheaper code-size wise, since I think the data section could have a quite a few places in which it accessed with a literal. And using numerically larger numbers make this values take more space, since literals use LEB128.

@alexcrichton
Copy link
Contributor Author

Ah nice catch @CryZe! I updated the OP with how to configure the stack size.

cc @sunfishcode and @sbc100, y'all may be interested in this as well. The current question is whether for LLD + Rust a memory layout like this may make more sense:

0                          global-base            page boundary
|                               |                      |
|        call stack space       |	global data    |
|		  		|		       |

Or, in other words, we'd have something like:

  • page boundary = align_up_to(sizeof(global data) + sizeof(call stack), page size)

And then the start of the stack would be something like:

  • global-base = align_down_to(page boundary - sizeof(global_data), 16)

Do y'all have thoughts on a memory layout like that for wasm modules?

@BigBigos
Copy link

Just a (hopefully useful) note from a passing-by inexperienced Rust developer.

Isn't a null pointer already reserved for things like Option<Box> None value? Also, there exists Unique::empty() used by things like Vec or Box, which is basically a null pointer + alignof T, meaning that a non-null pointer that is small (like literal 4) might still have a special meaning.

Unless these invariants are changed, I don't think you can allow any valid pointer to go too much down the address space. If you put the stack there, you cannot really automatically detect the cases where the pointer ends up being null (or Unique::empty()), as these will happen before an underflow - each stack allocation would need to check for this explicitly.

@alexcrichton
Copy link
Contributor Author

@BigBigos ah an excellent point! In that sense you'd expect that Some(&foo) was always Some, but if the address of foo, if on the stack, was then 0 it would cause problems (as it would print out None). While certainly quite unlikely, still plausible!

I think for other ones though with Unique::empty we'd be safe. It means that &Foo may have an address like 16, but all manufactured pointers at those locations have zero size so shouldn't be invalid (and valid pointers should otherwise continue to be valid I think?)

@BigBigos
Copy link

@alexcrichton not sure what you really mean about Unique::empty. It is defined to return a pointer that is never valid (i.e. never used by things that are dereferencable) but distinct from the null pointer.

It is implemented as null + alignof, so that this invalid pointer is at the very least aligned (not sure, but maybe it is needed for some sort of automatic pointer tagging, similar to how None automatically uses the null value). If we use i64 as an example, it has alignment of 8, so Unique<i64>::empty() returns pointer with an address of 8. It means that any valid i64 pointer that happens to have a value of 8 will be interpreted in a special way.

The users of Unique use the empty value to mark their pointers with some special semantics, e.g. Vec uses it to signify no allocation took place. In this case it should be safe, as Vec won't ever point to a stack-allocated memory, but I am not sure what the other users of Unique do with empty.

We could change the definition of empty for wasm (I assume it is possible to change its definition for a specific target) to null - alignof, i.e. to a very large number (in the case of Unique<i64> it would be 0xFFFFFFF8). I guess such pointers could still be encoded cheaply, as their signed magnitude is still small. We would then just have to make sure the page allocator never returns the highest page.

@alexcrichton
Copy link
Contributor Author

@BigBigos hm do we actually rely on the value after Unique was constructed? I figured we'd always statically dispatch on sizeof(T) in contexts like Vec<T> rather than looking at the value of the pointer. In any case switching it all to use the high bits of the address space may also be reasonable!

@BigBigos
Copy link

@alexcrichton I think (if not - I apologize) you are confusing what empty means. I think the name is misleading - the function doesn't return an empty Unique value (whatever it would mean, as Unique is just a wrapper for a pointer) but a special value that is distinct from any other valid value and from a null value and is documented as a dangling value.

Unique<T>::empty is implemented for any T, even ones (or especially for ones) that have sizeof(T) != 0. The example before used i64 which has sizeof(i64) = 8.

Vec<T> uses Unique<T>::empty as a special marker that means no allocation took place. It is, among others, used by Vec::new() to initialize its internal Unique value. It means that if the Unique that is internal to Vec has the empty value, Vec will assume it has capacity of zero (so the pointer will never be dereferenced) and it should not be passed to the heap deallocate function.

Most such uses in other languages (like C) use a null pointer in this case (null pointer == the value is unitialized). Rust uses Option<T> for such cases which is then de-sugared into the same null pointer in case of None, but it is desirable not to use the null value so that Option<Vec<T>> can use it instead. In the end, it is just that there are 2 distinct special values - null and Unique::empty - that have a special meaning depending on context and should never represent a valid dereferencable pointer. In the case of Option<Vec<T>>, null means None, and empty means Some(Vec::new()).

As said before, it is not really a problem for Vec as the heap allocations it uses won't ever point to a stack address, i.e. the internal Unique object cannot be empty when initialized from heap allocation. But Unique definition is general enough that it could possibly be used by some other mechanism that could potentially use pointers originating from the stack (disclaimer - I don't know of any).

@alexcrichton
Copy link
Contributor Author

@BigBigos oh certainly yeah makes sense!

. It is, among others, used by Vec::new() to initialize its internal Unique value. It means that if the Unique that is internal to Vec has the empty value, Vec will assume it has capacity of zero (so the pointer will never be dereferenced) and it should not be passed to the heap deallocate function.

Does Vec look at the pointer to determine its capacity though? I was under the impression that it initialized with Unique::empty() but then never looked at it again (relying on the capacity to indicate the validity of the pointer)

@BigBigos
Copy link

@alexcrichton You are right, Unique::empty is never compared - it is just a placeholder. I should have read the code instead of assuming.

If all the other users of Unique::empty behave similarly, then using its address as a valid pointer is not a problem. Though I am not sure if we want to depend on undocumented behavior of these, especially when Unique::empty is defined to return a dangling pointer - if we now create a valid pointer with the same representation as a dangling pointer, bad things might happen in the future if someone chooses to use it in a different way.

@alexcrichton
Copy link
Contributor Author

Heh indeed! Regardless the null pointer is also an issue

@CryZe
Copy link

CryZe commented Mar 13, 2018

For safety purposes I'd just keep the lower 16 bytes reserved (stack is 16 byte aligned, so it would have to be a multiple of 16). With that new memory layout we'd round up the stack size to fill up the remaining lower parts of the page anyway, so having the lower 16 bytes reserved isn't that problematic.

@fitzgen
Copy link
Member

fitzgen commented Apr 19, 2018

I think we should move the stack to the first part of memory, so that stack overflows are loud, and don't silently corrupt data. I'm imagining changing the layout to this, as mentioned up-thread:

|--- Stack ---|--- Data ---|--- Heap ...

Aside: it is unfortunate we can't ask wasm memory to effectively start at 64KiB, to simulate null guard pages...

@pepyakin

I think the rationale of placing the data section is to make accesses to it more cheaper code-size wise, since I think the data section could have a quite a few places in which it accessed with a literal. And using numerically larger numbers make this values take more space, since literals use LEB128.

Very good point! I wonder what kind of impact this will have...

Unsigned LEB128s greater than or equal to 1024 are already at least two bytes in size.

Unsigned LEB128s greater than or equal to a megabyte are at least three bytes in size.

Unsigned LEB128s greater than or equal to a mebibyte are four bytes in size.

So assuming we create a megabyte stack and place data after that, we have 48576 bytes before we jump from three to four bytes per LEB128 global reference.

Let's use twiggy-compiled-to-wasm as a test case. (I have no idea how typical it is or not).

It is a 603,705 bytes large wasm file:

$ wc -c wasm-api/twiggy_wasm_api_bg.wasm 
603705 wasm-api/twiggy_wasm_api_bg.wasm

And it has 21139 bytes of data:

$ wasm-objdump -x -j Data wasm-api/twiggy_wasm_api_bg.wasm | grep segment
 - segment[0] size=5756 - init i32=1024
 - segment[1] size=14806 - init i32=6784
 - segment[2] size=577 - init i32=21592

That is small enough that all the globals' addresses would only be three bytes when LEB128 encoded.

I don't have an easy was to count memory instructions that operate on this range without resorting to writing more sophisticated scripts, so I don't have an idea of how much of an impact it will ultimately. It is reassuring that a reasonably-sized program's data fits in the three byte range, though.

Note: I tried to use cargo rustc to provide arguments to the linker and, you know, actually test this memory layout and get real numbers, but I haven't been able to get that to work yet.

@fitzgen
Copy link
Member

fitzgen commented Apr 19, 2018

@alexcrichton what do we need to do to change the memory layouts for default builds? It seems to me like this "should" be a quick PR that benefits anyone trying to debug stack overflows, or potentially needing to debug stack overflows (ie everyone, and even more so because the stack is so small right now...).

@fitzgen
Copy link
Member

fitzgen commented Apr 19, 2018

It appears that lld has a fixed memory layout for wasm right now: https://github.com/llvm-mirror/lld/blob/9ca9c36542ba3466b42a52cae5166b22c0124bf0/wasm/Writer.cpp#L578-L649

@alexcrichton
Copy link
Contributor Author

I've filed an LLD bug about tweaking the layout here.

@sbc100
Copy link

sbc100 commented Apr 19, 2018

Happy to at least an option to lld. Maybe we should consider how much flexibility is needed thought. i.e. are there other layout switch we might want? Presumably we don't want to go as far a linker scripts and such?

@alexcrichton
Copy link
Contributor Author

@sbc100 ah this comment reminds me of one other possible tweak to the memory layout.

Currently our default dynamic memory allocator assumes that all existing memory in the wasm instance is unusable and only uses grow_memory to acquire more memory. It turns out though that most of the time the number of wasm pages is rounded up so there's some space after the stack/data which is basically sitting unused.

In that sense it might be most optimal to round up the placement of data to end as close to a page boundary as possible, maximizing what can be used for normal operation. I see that on newer version of LLD from what we're using there's a __data_end global so we may be able to tweak the default allocator to leverage that instead and use any extra space.

In any case this isn't necessarily a layout switch and moreso just optimization tweaks!

@sbc100
Copy link

sbc100 commented Apr 23, 2018

Yes, the linker should be telling the runtime exactly where to start the heap so there shouldn't be any wasted space. We export a global called __heap_base for this purpose. It points to the start of the heap.

@alexcrichton
Copy link
Contributor Author

Since opening the defualt stack size has increased and --stack-first has been added to LLD which I believe adequately covers this issue, so I'm going to close

@fitzgen
Copy link
Member

fitzgen commented May 8, 2018

We don't have the new lld in rust nightly yet, though. Is there any issue to track that?

@alexcrichton
Copy link
Contributor Author

@fitzgen I've opened rust-lang/rust#50543 to track the work to upgrading LLVM, and I've listed this on there (and implemented it in my branch)

@sbc100
Copy link

sbc100 commented May 8, 2018

I'm also planning on landing this patch to compress the reloction padding: https://reviews.llvm.org/D46416. Its unrelated, and currently requires passing -O2 to the linker, but I thought you guys might like it might help you remove the need for the wasm-gc tool?

@alexcrichton
Copy link
Contributor Author

Nice! I'll add that to our list of "need to make sure we update when we upgrade". We already pass -O2 and such to linkers automatically in release mode and it's just stubbed out for wasm-ld, so it'll be easy for us to add support!

but I thought you guys might like it might help you remove the need for the wasm-gc tool?

Oh no worries, the support added long ago with --gc-sections has long since obsoleted wasm-gc. It just didn't make the cutoff for LLVM 6 so we didn't manage to get it

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

8 participants