Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ZCF does not enforce start completion in first incarnation #10174

Open
mhofman opened this issue Sep 30, 2024 · 3 comments
Open

ZCF does not enforce start completion in first incarnation #10174

mhofman opened this issue Sep 30, 2024 · 3 comments
Labels
bug Something isn't working contract-upgrade Zoe Contract Contracts within Zoe Zoe package: Zoe

Comments

@mhofman
Copy link
Member

mhofman commented Sep 30, 2024

Describe the bug

When implementing vat upgrades through baggage and durable kinds (#4325 and #3062), liveslots required that if buildRootObject() returned a promise, that it settled immediately (before end of crank). This was done to ensure all behavior was defined, or redefined in an upgrade without waiting on external call results. Aka a "one shot" upgrade so that pending messages to the vat didn't need to be embargoed.

When ZCF was first designed, this mechanism didn't exist, and concerns about cost of creating vats lead towards a Zygote design (#2268). Bundle caps also didn't exist or were not allowed in vatParameters (#4381)

This resulted in a design where the ZCF bundle is used to create an "empty shell" of a contract vat, and in a second delivery, the bundle of the contract is sent to that shell and the contract is started. However in an upgrade, the contract start is called as part of ZCF's buildRootObject(), which awaits the contract start.

That means a contract start is not required to complete within its crank on the first incarnation, but is required to in future incarnation, resulting in potential hazards that only reveals itself on upgrade.

To Reproduce

Deploy and instantiate a Zoe contract that await an external call in its start function, then attempt to upgrade that contract.

Expected behavior

The contract instantiation should fail in the first place. None of its effects (message sends) should be performed (crank rollback).

Description of the Design

The cleanest and preferred solution for now is to abandon the ability to make zcf zygotes, and for Zoe startInstance to pass the needed contract bundle and its creation params to vatParameters so that zcf can start the contract within its own buildRootObject even on first start. The necessary work for this was already done for contract upgrades (#4381)

Alternatives that were considered:

  • Have liveslots expose a power accepting a promise "must settle immediately or fail the crank". This may have uses outside this issue, but does require liveslots changes. Preferably the crank should be rolled back to undo any partial start effects
  • enforce this from the zoe side: zoe would queue both a "start" message to zcf's root object, and a "ping" message to the same object. Our execution model will always guarantee that the messages would be delivered in order, and if the responses were immediate, they would also be delivered in order. As such zoe can race the result promises, and "fail" if the ping result arrives before the start result. This does not allow undoing partial effects of the contract's start.

This might be considered within the context of improved upgrade protocol for vats in general (#4382)

Security Considerations

If we were adding a new liveslots power, question of limiting who has access to it.

Scaling Considerations

This would effectively remove the temporal point where we could create a zcf zygote, but that has never been implemented and the perf benefits were never measured.

Test Plan

An upgrade test based on the reproduction steps above.

Upgrade Considerations

This would require a new Zoe and ZCF bundle, and thus either a core eval or a chain software upgrade.

While this removes a footgun for contracts, any contract that currently (ab)used this possibility would fail to start after these changes are deployed. It would however not affect upgrades of these contracts as single crank start was already enforced in those cases.

@mhofman mhofman added bug Something isn't working Zoe package: Zoe Zoe Contract Contracts within Zoe contract-upgrade labels Sep 30, 2024
@Chris-Hibbert
Copy link
Contributor

A contract that fell prey to this footgun would not be able to do a null-upgrade, but full upgrades would be possible with a re-write that removed the problematic code. For the near term, we expect to enforce by process that all upgrades are tested in a3p or a testNet before getting pushed to mainNet.

The footgun is serious, but I don't think the fix is needed urgently.

@mhofman
Copy link
Member Author

mhofman commented Sep 30, 2024

FWIW, this was motivated by #10140 (comment), which may require a completely different design. While it's always possible to transition in upgrades, it's now always clean given you have to maintain the kinds that were previously defined, and potentially the implementations of these kinds if the objects have been shared.

@mhofman
Copy link
Member Author

mhofman commented Oct 2, 2024

It looks like we have an agreement on dropping potential support as currently enabled. I updated the issue description to reflect that preferred approach.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working contract-upgrade Zoe Contract Contracts within Zoe Zoe package: Zoe
Projects
None yet
Development

No branches or pull requests

2 participants