Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement limiting WebAssembly execution with fuel #2611

Merged
merged 11 commits into from
Jan 29, 2021

Conversation

alexcrichton
Copy link
Member

This PR lifts a feature from Lucet to wasmtime where generated code can count instructions or account for "fuel" during execution. The purpose of this feature is similar to the interrupt support via InterruptHandle, but it mainly allows deterministically interrupting a wasm module instead of relying on a timer. Additionally a future goal of this PR is to extend the async support of wasmtime to leverage fuel to periodically "interrupt" executing wasm code to yield back to the host. This would enable wasmtime futures to never take "too long" in Future::poll since if they would otherwise take awhile they'd yield back to the host and allow preemption and/or other things like future timeouts.

Thee implementation here is nearly copied verbatim from Lucet itself, with tweaks as appropriate for the different vmctx representation in Wasmtime. The main difference is that Wasmtime's fuel counter is two levels of indirection away from the vmctx rather than one in Lucet. To help with this a new Variable stores the VMInterrupts pointer value to avoid reloading the same value each time from the vmctx.

Support for this feature is exposed through a few new APIs:

  • Config::consume_fuel - enables codegen options for wasm to consume fuel, and behaves similar to interruptable.
  • Store::set_fuel_remaining - this is how fuel is injected into a Store for execution of wasm. Note that stores always start with 0 fuel so this is required to be called.
  • Store::fuel_consumed - this can be used to check how much fuel has been consumed so far.

The current behavior, which cannot be changed, is that when fuel runs out a wasm trap is generated. I hope to make this configurable in the future so that for async stores when fuel runs out it's automatically re-injected with fuel but only after a yield back to the host happens.

I've done a bit of benchmarking with this using criterion and the benchmarks here -- https://github.com/bytecodealliance/sightglass/tree/main/benchmarks-next. The benchmarks are relatively limited at this time but were able to produce some useful data in the meantime. This shows to be a 35-45% slowdown on my personal laptop for the runtime execution of the benchmarked porttion of the code for blake3-scalar and shootout-ackermann. At least for ackermann this is somewhat expected because the loops/function calls are all tiny, so the overhead is quite noticeable. For blake3-scalar I assume it's similar but haven't dug in yet. Note that these numbers were with the new backend since the old x86 backend seems significantly worse than the x64 one.

I do think there might be some relatively low-hanging fruit with respect to performance, but further tweaks would require changes to cranelift itself to optimize instruction selection. For example one optimization might be to not have a fuel_var and instead periodically do addq $fuel_consumed, offset(%vminterrupts_ptr) which avoids consuming extra registers. Similarly cmpq $0, offset(%vminterrupts_ptr) could be generated as well. I couldn't get the x64 backend to emit those forms of instructions at this time though. I'm also not 100% certain that it'll be faster.

Note that for now this doesn't depend on the async PR, but I plan on having a future PR after these two land which implements the periodically-yield option.

@cfallin
Copy link
Member

cfallin commented Jan 27, 2021

I'll take a look -- thanks for putting this together so quickly! A few quick notes on the isel thoughts:

For example one optimization might be to not have a fuel_var and instead periodically do addq $fuel_consumed, offset(%vminterrupts_ptr) which avoids consuming extra registers. Similarly cmpq $0, offset(%vminterrupts_ptr) could be generated as well

The former is an RMW compound op and could be done pretty reasonably with the current load-op merging framework; we haven't gotten around to it yet, but it's clear how to do so if we decide it's important. The latter is actually a bit more interesting: I recently fixed a bug (#2576) that arose because of certain combinations of load-and-compare that would sink a load beyond its first use by simply disallowing compare-from-memory. That too can be re-optimized but would take more machinery.

Anyway, all that to say that there are reasons you couldn't get either of those instructions to come out the backend :-)

@alexcrichton
Copy link
Member Author

Oh also to clarify, I don't think it's just isel that can be improved, but rather I think the frontend codegen could also be improved if we decide the time should be invested. For example an empty function (func nop) generates this x86 code with the new backend (annotated by me):

0000000000000000 _wasm_function_4:
       ;; prologue stack-check to ensure we're not out of stack space
       ;; note that this is only here because function calls are made internally
       0: 55                           	pushq	%rbp
       1: 48 89 e5                     	movq	%rsp, %rbp
       4: 4c 8b 17                     	movq	(%rdi), %r10
       7: 4d 8b 12                     	movq	(%r10), %r10
       a: 49 39 e2                     	cmpq	%rsp, %r10
       d: 0f 86 02 00 00 00            	jbe	2 <_wasm_function_4+0x15>
      13: 0f 0b                        	ud2

      ;; allocating stack space, saving registers
      15: 48 83 ec 10                  	subq	$16, %rsp
      19: 4c 89 24 24                  	movq	%r12, (%rsp)

      ;; load vminterrupts_ptr into r12
      1d: 4c 8b 27                     	movq	(%rdi), %r12

      ;; increment the fuel consumed by 1 where 8(%r12) is where the previous count is
      20: be 01 00 00 00               	movl	$1, %esi
      25: 49 03 74 24 08               	addq	8(%r12), %rsi

      ;; check if fuel is gerater than zero
      2a: 48 31 c0                     	xorq	%rax, %rax
      2d: 48 39 c6                     	cmpq	%rax, %rsi
      30: 0f 8c 13 00 00 00            	jl	19 <_wasm_function_4+0x49>

      ;; save the fuel count, load the intrinsic address, call it
      36: 49 89 74 24 08               	movq	%rsi, 8(%r12)
      3b: 48 8b b7 f0 06 00 00         	movq	1776(%rdi), %rsi
      42: ff d6                        	callq	*%rsi

      ;; restore the fuel_var for this function (unnecessary instruction, happens uncondititonally after call)
      44: 49 8b 74 24 08               	movq	8(%r12), %rsi
    
      ;; save fuel_var at the end of the function (unnecessary instruction, happens unconditionally at function end)
      49: 49 89 74 24 08               	movq	%rsi, 8(%r12)

      ;; function epilogue
      4e: 4c 8b 24 24                  	movq	(%rsp), %r12
      52: 48 83 c4 10                  	addq	$16, %rsp
      56: 48 89 ec                     	movq	%rbp, %rsp
      59: 5d                           	popq	%rbp
      5a: c3                           	retq

Larger functions like https://github.com/bytecodealliance/sightglass/blob/faf101e50f7af6243fe3e910cf9455eecd9e1368/benchmarks-next/shootout-ackermann/benchmark.wat#L25-L49 end up having a lot more code generated with instrumentation enabled. I haven't yet gone through instruction-by-instruction yet, though, to see what could be improved.

@github-actions github-actions bot added cranelift Issues related to the Cranelift code generator cranelift:wasm fuzzing Issues related to our fuzzing infrastructure wasmtime:api Related to the API of the `wasmtime` crate itself labels Jan 27, 2021
@github-actions
Copy link

Subscribe to Label Action

cc @fitzgen, @peterhuene

This issue or pull request has been labeled: "cranelift", "cranelift:wasm", "fuzzing", "wasmtime:api"

Thus the following users have been cc'd because of the following labels:

  • fitzgen: fuzzing
  • peterhuene: wasmtime:api

To subscribe or unsubscribe from this label, edit the .github/subscribe-to-label.json configuration file.

Learn more.

Copy link
Member

@cfallin cfallin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks very nice overall!

My main question below has to do with the way in which we actually handle out-of-fuel conditions, and whether this implementation as-is allows us to resume to do timeslicing. It's possible that more will come later, in which case, I'm perfectly happy to see this landed now; just want to understand how things will (eventually) work.

pub fn set_fuel_remaining(&self, remaining: u64) {
assert!(self.engine().config().tunables.consume_fuel);
let remaining = i64::try_from(remaining).unwrap();
self.inner.fuel_adj.set(remaining);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there any reason we can't add to the existing fuel_adj value here, in order to continue accumulating the consumed-fuel count and return the true total from fuel_consumed()?

(In that case I might also call this add_fuel(), and adjust existing rather than overwrite the fuel-consumed counter...)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was a tiny bit worried that a long-lived store might overflow the i64 counter but that may not be too realistic. Do you think that'd be rare enough that we should just switch this to an add instead of a set?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMHO it's nicer to not have the "fuel-used value that we return is only since the last set" property -- it has the potential to become a subtle stats bug later.

Doing some quick math, a 2^63 max count, at 1B Wasm ops per second, gives us 2^33 or 8B seconds of runtime before overflow, which is ~250 years. Sometime before the year 2270 we can come back and upgrade to an i128 :-)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Heh good point!

crates/wasmtime/src/store.rs Show resolved Hide resolved
This commit adds codegen infrastructure necessary to instrument wasm
code to consume fuel as it executes. Currently nothing is really done
with the fuel, but that'll come in later commits.

The focus of this commit is to implement the codegen infrastructure
necessary to consume fuel and account for fuel consumed correctly.
This commit enables wasm code to periodically check to see if fuel has
run out. When fuel runs out an intrinsic is called which can do what it
needs to do in the result of fuel running out. For now a trap is thrown
to have at least some semantics in synchronous stores, but another
planned use for this feature is for asynchronous stores to periodically
yield back to the host based on fuel running out.

Checks for remaining fuel happen in the same locations as interrupt
checks, which is to say the start of the function as well as loop
headers.
The location of the shared interrupt value and fuel value is through a
double-indirection on the vmctx (load through the vmctx and then load
through that pointer). The second pointer in this chain, however, never
changes, so we can alter codegen to account for this and remove some
extraneous load instructions and hopefully reduce some register
pressure even maybe.
Use fuel to time out modules in addition to time, using fuzz input to
figure out which.
@alexcrichton
Copy link
Member Author

Ok this should be rebased and updated to have add_fuel

Copy link
Member

@cfallin cfallin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cranelift:wasm cranelift Issues related to the Cranelift code generator fuzzing Issues related to our fuzzing infrastructure wasmtime:api Related to the API of the `wasmtime` crate itself
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants