Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

mock-consensus: 1️⃣ report non-zero status codes as errors #3876

Closed
cratelyn opened this issue Feb 23, 2024 · 4 comments
Closed

mock-consensus: 1️⃣ report non-zero status codes as errors #3876

cratelyn opened this issue Feb 23, 2024 · 4 comments
Assignees
Labels
A-mock-consensus Area: Relates to the mock consensus engine C-bug Category: a bug

Comments

@cratelyn
Copy link
Contributor

in #3866, i noticed that a non-zero status code in the response is not reported as an error in tests. see this log:

2024-02-23T02:55:09.755882Z  INFO penumbra_app::server::consensus: deliver_tx failed, e: binding signature failed to verify

Caused by:
Invalid signature.
at crates/core/app/src/server/consensus.rs:210
in penumbra_mock_consensus::abci::deliver_tx
in penumbra_mock_consensus::block::execute with height: 1, time: 1708656909

we should make sure that this results in an error when calling BlockBuilder::execute.

@cratelyn cratelyn added A-mock-consensus Area: Relates to the mock consensus engine C-bug Category: a bug labels Feb 23, 2024
@cratelyn cratelyn self-assigned this Feb 23, 2024
@hdevalence
Copy link
Member

No, I think that's the correct behavior — DeliverTx is fallible. Blocks can include transactions that fail to execute (and this has to be the case, because prior to PrepareProposal there's no way for the application to control transaction inclusion).

@cratelyn
Copy link
Contributor Author

do you think that this should include the BlockBuilder::execute helper though? what i'm worried about with this issue is that, as part of my work on #3866, i noticed that the spend transaction was failing silently.

i expect that in the majority of "happy path" cases, we would want to know about a transaction that failed to execute, right?

for test cases that do want to send an invalid transaction to the app, they could still rely on the abci submodule, and send the deliver_tx request themselves, inspecting the response to confirm that e.g. it was ruled invalid as expected.

@hdevalence
Copy link
Member

I'm not sure — and I'm not sure we should overthink it, since if we move to current ABCI this interface all changes anyways.

One question I have is, why is it important for a test to know whether the transaction succeeded or not? Wouldn't we want to be writing assertions about the effects of the transaction, rather than reliance on a self described success code? And in that case do we need this anyways?

For instance, in the spend case, wouldn't we want to write an assertion that, following the block, the nullifier is present in the state, that the MockClient detects the new output note, etc?

@cratelyn
Copy link
Contributor Author

cratelyn commented Feb 23, 2024

my guiding principle here is that testing tools should report errors eagerly, to make testing happy paths more ergonomic. to your point, swallowing this error, while the correct choice in production to keep the application running,

Well, it's less about whether it's the correct choice in one or the other context, and more that it's the behavior specified by ABCI. A failed transaction is not an error. It's an Ok(I rejected this), at least according to ABCI. The errors bubbled up from the consensus harness should match its expected behavior.

... places the burden on test writers to write these assertions. we should write assertions about the outcome of a transaction (_it's part of the benefits you outlined here, but if it is a mandatory part of writing tests, it raises the burden of doing so, and lowers the ceiling of how much we can will test.

Yes, to be clear, I absolutely want this burden to be placed on test writers. It is required for the tests to be meaningful. Our goal is to ensure that when we do X, we see effect Y. The value of the test is having to write down an executable specification of what the expected behavior is.

if effects have to be manually inspected though, i fear that we might run into false positives. i want to avoid an interface that leads people to believe they are providing assurance, where an error isn't being surfaced. i only discovered that the spend wasn't being performed by running tests with --no-capture set.

But this is even more true if we have an expectation that tests don't specify their invariants. While you discovered that the spend wasn't being performed by seeing a return code, trusting the return code is also a false sense of assurance. What if the handler didn't write the nullifier properly and still reported that execution succeeded?

does propagating errors when execute is called prevent us from writing any particular tests? are invalid transactions part of the application's regular operation?

Yes, invalid transactions are expected to be handled by any ABCI application, including ours. We should not be defining our own new set of semantics. We don't get benefit from doing so, because it's already handled by the stronger assertions we need to be writing, and because this API is one we want to migrate away from anyways.

@cratelyn cratelyn closed this as not planned Won't fix, can't repro, duplicate, stale Feb 26, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-mock-consensus Area: Relates to the mock consensus engine C-bug Category: a bug
Projects
Archived in project
Development

No branches or pull requests

2 participants