Should runtimes have access to thread's stack boundaries? #12

loganek · 2022-12-04T21:21:47Z

At the moment runtimes can access stack boundaries reading by reading __heap_base/__data_end (or __stack_high/__stack_low) globals (if they're exported). That works for the main thread, but won't work for the threads we spawn with the new wasi_thread_spawn API.

Having access to stack boundaries and the stack pointer (which is also not straightforward to retrieve, but possible) would allow runtimes to detect stack overflow or underflow. However, that would slightly complicate the API (we'd probably have to add another parameter to wasi_thread_spawn).

I personally see a lot of value in having stack overrun (mainly overflow, but underflow can happen too) protection in VMs, but at the same time I'm in favor of having the API simple, at least for now.

I'd like to know others' opinion and see if I didn't miss anything obvious here.

Just a note here - POSIX's pthread_create allows to have a guard memory which prevents from stack overflow, but I don't think this is going to work on WASM as it's not possible to specify read-only memory segments (or is it?).

The text was updated successfully, but these errors were encountered:

sbc100 · 2022-12-04T22:07:51Z

In emscripten we have found it useful to be able to do bounds checking in userspace. For that, we use binaryen pass which we run over the compiled module: https://github.com/WebAssembly/binaryen/blob/main/src/passes/StackCheck.cpp. This adds instrumentation to any global.set __stack_pointer instructions, and should work with any module build by llvm (or that uses the __stack_pointer global the way llvm does).

Doing it this way means we don't depend on any features of the runtime (for example, v8 doesn't need to be aware of the shadow stack at all).

To make this work with threads does require the setting of the stack start/end globals during the startup of each thread, but this is something that perhaps wasi-libc can do in debug/checked mode?

sunfishcode · 2022-12-06T21:42:46Z

Wasm indeed doesn't currently have read-only memory segments. There's the beginning of a proposal, but it's early.

yamt · 2022-12-06T23:41:06Z

Wasm indeed doesn't currently have read-only memory segments. There's the beginning of a proposal, but it's early.

i don't think it's necessary for this particular case.

if enough info about the stack and associated "guard page" is provided to wasi_thread_spawn, an implementation of wasi_thread_spawn can just mprotect the guard page (host page, not wasm page) during the execution of wasi_thread_start. a tricky part is that it would need to make the stack allocation somehow aware of the host page size.

sbc100 · 2022-12-07T00:12:31Z

IIUC that would no longer be conforming to the wasm spec. For example, its perfectly valid in wasm to write a function like search_entire_memory_for_string(const char* needle). If you had a guard page inside the linear memory that function would trap. Even address zero is defined to accessible in wasm (unfortunately for the C compiler) so we can't even protect the zero page.

yamt · 2022-12-07T02:39:40Z

IIUC that would no longer be conforming to the wasm spec. For example, its perfectly valid in wasm to write a function like search_entire_memory_for_string(const char* needle).

sure. but it's what exactly we want to detect, isn't it?
after all, it's perfectly valid in wasm to overflow shadow stack.

in that case, i guess the only way "conforming" to the current spec is to insert explicit checks at C compiler level.
(iirc golang does something like this.)
C compiler knows which access is meant to stack. the compiled bytecode doesn't have the info anymore.

Even address zero is defined to accessible in wasm (unfortunately for the C compiler) so we can't even protect the zero page.

i agree it's unfortunate. but wasm is not the only environment having something at address 0.

yamt · 2022-12-07T04:14:44Z

In emscripten we have found it useful to be able to do bounds checking in userspace. For that, we use binaryen pass which we run over the compiled module: https://github.com/WebAssembly/binaryen/blob/main/src/passes/StackCheck.cpp. This adds instrumentation to any global.set __stack_pointer instructions, and should work with any module build by llvm (or that uses the __stack_pointer global the way llvm does).

WAMR performs boundary checks by intercepting global.set __stack_pointer.
(thus i guess that the current implementation of wasi-libc wasi_thread_start would cause a trap there.)

loganek · 2022-12-07T09:42:46Z

WAMR performs boundary checks by intercepting global.set __stack_pointer.

Yes, and I think this check will have to be disabled for spawned threads.

I think the extra pass shared by @sbc100 seems like reasonable approach; @sbc100, do you think that's something we could integrate to LLVM and control through a compiler's command line argument?

sbc100 · 2022-12-07T15:32:35Z

WAMR performs boundary checks by intercepting global.set __stack_pointer.

Yes, and I think this check will have to be disabled for spawned threads.

I think the extra pass shared by @sbc100 seems like reasonable approach; @sbc100, do you think that's something we could integrate to LLVM and control through a compiler's command line argument?

Right now its part of wasm-opt (binaryen), which only has basic integration with clang today.

So you would need to run wasm-opt --check-stack-overflow a.wasm -o b.wasm as a post-link step, to get this feature. Does that seems like a reasonable approach for now?

loganek · 2022-12-07T19:02:54Z

@sbc100 thanks, I think that's enough for now. I wonder though if there's any plan to integrate it to LLVM in a (near) future?

sbc100 · 2022-12-07T19:17:04Z

There have be times when we have discussed deeper integration of binaryen with llvm, but I don't know of any immediate plans.

sbc100 · 2022-12-07T19:18:07Z

CC @kripken and @dschuff who have discussed deeper integration in that past.

kripken · 2022-12-07T19:31:17Z

Another option in the space of more Binaryen integration into LLVM could be a WASI port of Binaryen which would hopefully be very simple to use in relevant toolchains (just a wasm binary).

yamt · 2022-12-08T05:08:58Z

to me it seems simpler to tweak the api to provide the stack boundary info to wasi_thread_spawn (and probably wasi_thread_start) so that wasm-opt and/or WAMR can use it. how do you think?

To make this work with threads does require the setting of the stack start/end globals during the startup of each thread, but this is something that perhaps wasi-libc can do in debug/checked mode?

maybe wasm-opt can insert the globals and the code to update them by itself?

sbc100 · 2022-12-08T06:17:58Z

The --check-stack-overflow pass in wasm-opt generates start and end globals for the stack and also a function for setting them: https://github.com/WebAssembly/binaryen/blob/main/src/passes/StackCheck.cpp#L37.

Each new thread would need to somehow call this function as one of the very first thing is does. In emscripten we currently call this function from JS before we call the thread entry point, but perhaps we can figure out a way for new threads to call this function before doing anything else.

loganek changed the title ~~Should runtimes have access to stack boundaries?~~ Should runtimes have access to thread's stack boundaries? Dec 4, 2022

loganek mentioned this issue Dec 7, 2022

WASI threads support bytecodealliance/wasm-micro-runtime#1790

Closed

19 tasks

loganek mentioned this issue Jan 10, 2023

Define __stack_base and __stack_limit globals in debug mode for stack overflow detection WebAssembly/wasi-libc#380

Open

loganek mentioned this issue Jan 23, 2023

Document how to use WASI threads in AOT mode bytecodealliance/wasm-micro-runtime#1905

Merged

cimacmillan mentioned this issue Nov 7, 2023

wasi threads mpsc channel hang on sched_yield only on debug rust build rust-lang/rust#117440

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Should runtimes have access to thread's stack boundaries? #12

Should runtimes have access to thread's stack boundaries? #12

loganek commented Dec 4, 2022

sbc100 commented Dec 4, 2022

sunfishcode commented Dec 6, 2022

yamt commented Dec 6, 2022

sbc100 commented Dec 7, 2022

yamt commented Dec 7, 2022

yamt commented Dec 7, 2022

loganek commented Dec 7, 2022

sbc100 commented Dec 7, 2022

loganek commented Dec 7, 2022

sbc100 commented Dec 7, 2022

sbc100 commented Dec 7, 2022

kripken commented Dec 7, 2022

yamt commented Dec 8, 2022

sbc100 commented Dec 8, 2022

Should runtimes have access to thread's stack boundaries? #12

Should runtimes have access to thread's stack boundaries? #12

Comments

loganek commented Dec 4, 2022

sbc100 commented Dec 4, 2022

sunfishcode commented Dec 6, 2022

yamt commented Dec 6, 2022

sbc100 commented Dec 7, 2022

yamt commented Dec 7, 2022

yamt commented Dec 7, 2022

loganek commented Dec 7, 2022

sbc100 commented Dec 7, 2022

loganek commented Dec 7, 2022

sbc100 commented Dec 7, 2022

sbc100 commented Dec 7, 2022

kripken commented Dec 7, 2022

yamt commented Dec 8, 2022

sbc100 commented Dec 8, 2022