polkavm: Add a knob to control sbrk instruction enablement #171

aman4150 · 2024-09-18T15:31:53Z

Currently, the sbrk instruction is enabled by default; however, we want a mechanism to control it for the guest. This patch aims to do just that.

If sbrk is disabled, the host should trap the guest program.

A basic test is also added that confirms the same.

Note: since we charge gas per compiled block, not per executed instruction, the test adds a fallthrough().

Fixes: #167

aman4150 · 2024-09-18T16:01:50Z

I think the reason the test failed on the compiler backend but passed on the interpreter is that the two charge gas slightly differently.

On the interpreter backend, we charge per basic block, so the charge includes the cost of the fallthrough.
On the compiler backend, however, the charge is calculated as each instruction executes.

We can handle this in the test, but the charge mismatch would still happen.

koute · 2024-09-19T05:52:23Z

In general we always charge gas on a basic block basis, and when we enter a basic block we charge the gas. This applies to both the recompiler and the interpreter, and they should charge in the same way, otherwise it's a bug. (All of the tests which start with tracing_ actually run both the recompiler and the interpreter at the same time and make sure they behave exactly the same.)
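To illustrate the per-block charging described above, here is a minimal sketch. The `Inst` enum, the flat cost of 1 per instruction, and the `block_costs` and `enter_block` helpers are all assumptions made for illustration; they are not PolkaVM's real types or cost model.

```rust
/// Toy instruction set; the names are illustrative, not PolkaVM's real types.
#[derive(Clone, Copy, Debug, PartialEq)]
enum Inst {
    Add,
    Jump,
    Trap,
}

impl Inst {
    /// Control-flow instructions end the basic block they appear in.
    fn ends_block(self) -> bool {
        matches!(self, Inst::Jump | Inst::Trap)
    }
}

/// Splits a program into basic blocks and returns the gas cost of each one,
/// assuming a flat cost of 1 per instruction (an assumption of this sketch).
fn block_costs(program: &[Inst]) -> Vec<u64> {
    let mut costs = Vec::new();
    let mut current = 0u64;
    for &inst in program {
        current += 1;
        if inst.ends_block() {
            costs.push(current);
            current = 0;
        }
    }
    if current > 0 {
        // Trailing instructions without an explicit terminator.
        costs.push(current);
    }
    costs
}

/// Charges the whole block up front, on entry.
/// Returns the remaining gas, or `None` if there is not enough gas.
fn enter_block(gas: u64, block_cost: u64) -> Option<u64> {
    gas.checked_sub(block_cost)
}
```

Because the charge happens on entry, a backend that follows this model deducts the full block cost even if execution stops partway through the block, which is why both backends have to agree on where each block ends.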

So, a few points:

There should not be a fallthrough instruction in the test program. (The only reason why fallthrough exists is that we allow only jumps to the beginning of basic blocks, so you need some sort of a dedicated instruction to forcefully start a basic block if the place you want to jump to doesn't have a control flow instruction there already. In this case there are no jumps, hence no fallthrough is necessary.)
You still need to modify the interpreter's compile_block. That function essentially goes through a whole basic block and calculates the gas cost for it, but right now it's not stopping on the sbrk, so the calculation's going to be wrong.
There's also an issue of jumps - if sbrk's going to be treated as a trap then the instruction after it should also be treated as a valid place to jump to, but ignore this for now. I'm currently working on something related to this so I'll clean it up later myself (this is probably currently subtly broken anyway in certain corner cases).
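The compile_block point can be made concrete with a small sketch. Everything here is hypothetical (the `Inst` enum, the `block_cost` function, and the flat cost of 1 per instruction); it only shows how the computed cost of a block changes once a disabled sbrk is treated as the end of that block.

```rust
/// Toy instruction set; not PolkaVM's real `Instruction` type.
#[derive(Clone, Copy, Debug, PartialEq)]
enum Inst {
    Add,
    Sbrk,
    Trap,
}

/// Computes the gas cost of the basic block starting at `program[0]`,
/// assuming a flat cost of 1 per instruction (an assumption of this sketch).
fn block_cost(program: &[Inst], allow_sbrk: bool) -> u64 {
    let mut cost = 0;
    for &inst in program {
        cost += 1;
        let ends_block = match inst {
            Inst::Trap => true,
            // A disabled sbrk behaves like a trap, so it must end the block too.
            Inst::Sbrk => !allow_sbrk,
            Inst::Add => false,
        };
        if ends_block {
            break;
        }
    }
    cost
}
```

For the program `[Add, Sbrk, Add, Add, Trap]` this sketch yields a cost of 5 when sbrk is allowed and 2 when it is not. A calculation that ignored the disabled sbrk would still report 5 for a block that, in this model, ends after two instructions.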

aman4150 · 2024-09-19T11:22:48Z

Noted, thanks for the pointers!

It looks like GasVisitor does not have any module config passed to it, so I had to make start_new_basic_block public and call it from compile_block. Let me know if that doesn't work.

koute · 2024-09-20T05:28:24Z

crates/polkavm/src/interpreter.rs

+            #[allow(clippy::single_match)]
+            match instruction.opcode() {
+                polkavm_common::program::Opcode::sbrk => {
+                    if !self.module.allow_sbrk() {
+                        gas_visitor.start_new_basic_block();
+                        is_properly_terminated = true;
+                        break;
+                    }
+                }
+
+                _ => {}
+            }


Hm, instead of this why not do something like this at the start of the loop?

    while let Some(mut instruction) = instructions.next() {
        if !self.module.allow_sbrk() && matches!(instruction.kind, Instruction::sbrk(..)) {
            instruction.kind = Instruction::trap;
        }
        // ...
    }

I am happy to do it, but there are two reasons I did not do instruction replacement:

1. There is no precedent for an instruction replacement pass in the interpreter code. Even for the instructions where we could (unimplemented or dynamic-paging flag-gated instructions), we chose to do it in the InstructionVisitor.

2. You mentioned that the instruction after sbrk should be a valid place to jump to if sbrk is treated as a trap. If we replace instructions without keeping track, we can't distinguish an incorrect use of trap from the interpreter implicitly replacing some instruction with trap. Please correct me if my understanding is incorrect.

> There is no precedent for an instruction replacement pass in the interpreter code.

But there is in the recompiler, no? Since that's exactly what you did there. (:

> Even for the instructions where we could (unimplemented or dynamic-paging flag-gated instructions), we chose to do it in the InstructionVisitor.

There's no deeper meaning to it; e.g. for loads/stores, which behave slightly differently depending on whether dynamic paging is enabled, I did it in InstructionVisitor because that's where it was the most convenient to do (and it was faster, because we've already dispatched the interpreter to a load/store handler, so it doesn't have to check for every instruction whether dynamic paging is enabled).

I suppose the problem here is that the "does this instruction end the basic block?" condition is determined outside of the instruction handler itself.

In other words, in the recompiler it works roughly like this:

    fn handle_instruction_x() {
        // ...
        start_new_basic_block();
    }

    for instruction in instructions {
        instruction.visit();
    }

So over there just proxying a handler for instruction X into a handler for instruction Y is enough to "redirect" the instruction X to behave exactly the same as instruction Y.

However, in the interpreter it works like this:

    fn handle_instruction_x() {
        // ...
    }

    for instruction in instructions {
        instruction.visit();
        if instruction == x {
            start_new_basic_block();
        }
    }

So here just changing the handler is not enough; you either need to:
a) change the handler and special case it in the loop (as you did here),
b) or immediately "swap" the instruction at the beginning of the loop (as I'm proposing) and leave everything else alone.

So I think (b) is cleaner/simpler/more convenient, at least for now (and it replicates what the recompiler is doing).
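Option (b) can be sketched as follows. This is a simplified model rather than the interpreter's actual code: `Inst` and this standalone `compile_block` function are hypothetical, and the loop only collects the instructions of the first basic block.

```rust
/// Toy instruction set; not PolkaVM's real `Instruction` type.
#[derive(Clone, Copy, Debug, PartialEq)]
enum Inst {
    Add,
    Sbrk,
    Trap,
}

/// Returns the instructions of the first basic block as the rest of the
/// loop would see them after the swap.
fn compile_block(program: &[Inst], allow_sbrk: bool) -> Vec<Inst> {
    let mut block = Vec::new();
    for &original in program {
        let mut inst = original;
        // Swap first: a disabled sbrk becomes a plain trap.
        if !allow_sbrk && inst == Inst::Sbrk {
            inst = Inst::Trap;
        }
        block.push(inst);
        // The pre-existing terminator check stays unchanged; it never
        // needs to know that sbrk was involved.
        if inst == Inst::Trap {
            break;
        }
    }
    block
}
```

With sbrk disabled, `[Add, Sbrk, Add, Trap]` compiles to `[Add, Trap]` in this sketch; with sbrk enabled, the block is left untouched. The swap is the only sbrk-specific code, which is the appeal of (b) over special-casing the terminator check.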

We could maybe refactor the interpreter so that it works similarly to the recompiler (as in, the handler would return whether the instruction terminates the basic block or not, so then you wouldn't have to explicitly test for it in compile_block, which would be cleaner), but that's probably out of scope for this PR and we could do that later.
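One possible shape for that refactor, as a rough sketch only: each handler reports whether its instruction ends the basic block, and the loop simply acts on the answer. The `Flow` enum, the `Config` struct, and the `visit` and `block_len` functions are hypothetical names invented for this illustration.

```rust
/// Toy instruction set; not PolkaVM's real `Instruction` type.
#[derive(Clone, Copy, Debug, PartialEq)]
enum Inst {
    Add,
    Sbrk,
    Trap,
}

/// What a handler reports back to the compile loop.
#[derive(Clone, Copy, Debug, PartialEq)]
enum Flow {
    Continue,
    EndBlock,
}

/// Hypothetical stand-in for the module configuration.
struct Config {
    allow_sbrk: bool,
}

/// The handler decides whether its instruction terminates the block,
/// so the loop below needs no per-instruction special cases.
fn visit(inst: Inst, config: &Config) -> Flow {
    match inst {
        Inst::Add => Flow::Continue,
        Inst::Sbrk if config.allow_sbrk => Flow::Continue,
        // A disabled sbrk is handled exactly like a trap.
        Inst::Sbrk | Inst::Trap => Flow::EndBlock,
    }
}

/// Counts the instructions in the first basic block.
fn block_len(program: &[Inst], config: &Config) -> usize {
    let mut len = 0;
    for &inst in program {
        len += 1;
        if visit(inst, config) == Flow::EndBlock {
            break;
        }
    }
    len
}
```

In this arrangement, redirecting sbrk to the trap behavior is a change to the handler alone, which mirrors how the recompiler is described above.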

> 2. You mentioned that the instruction after sbrk should be a valid place to jump to if sbrk is treated as a trap. If we replace instructions without keeping track, we can't distinguish an incorrect use of trap from the interpreter implicitly replacing some instruction with trap. Please correct me if my understanding is incorrect.

Well, there's no such thing as "incorrect use of a trap". (: Every "unknown" instruction (and if sbrk is disabled then it is essentially treated as unknown) is defined to be equivalent to a trap. Now, whether the program counter location after such a trap constitutes a valid jump target - that's still somewhat an open question. But again, this behavior in general is somewhat broken right now and I'm working on tightening/finalizing the semantics for this and refactoring/rewriting the code for it, so you can just ignore it for now. For now it's fine if a disabled sbrk behaves exactly the same as trap and allows jumps after it.

Makes sense. Thanks for the detailed write-up!

Currently, the sbrk instruction is enabled by default; however, we want a mechanism to control it for the guest. This patch aims to do just that. If sbrk is disabled, the host should trap the guest program. A basic test is also added that confirms the same. Signed-off-by: Aman <[email protected]>

aman4150 force-pushed the aman_sbrk_knob branch from 8e001a7 to 12d64e2 Compare September 18, 2024 15:36

aman4150 requested a review from koute September 18, 2024 18:04

aman4150 force-pushed the aman_sbrk_knob branch from 12d64e2 to 0616dd8 Compare September 19, 2024 11:21

aman4150 force-pushed the aman_sbrk_knob branch 2 times, most recently from b314adb to 447bfde Compare September 19, 2024 12:18

koute reviewed Sep 20, 2024

aman4150 force-pushed the aman_sbrk_knob branch from 447bfde to 036ed41 Compare September 20, 2024 11:15

koute approved these changes Sep 20, 2024

aman4150 enabled auto-merge (rebase) September 20, 2024 12:01

aman4150 force-pushed the aman_sbrk_knob branch from 036ed41 to 033eeb9 Compare September 20, 2024 12:01

aman4150 merged commit e905e98 into master Sep 20, 2024
8 checks passed

aman4150 deleted the aman_sbrk_knob branch October 8, 2024 08:23
