mock-consensus: 1️⃣ report non-zero status codes as errors #3876

cratelyn · 2024-02-23T14:52:24Z

in #3866, i noticed that a non-zero status code in the response is not reported as an error in tests. see this log:

2024-02-23T02:55:09.755882Z  INFO penumbra_app::server::consensus: deliver_tx failed, e: binding signature failed to verify

Caused by:
Invalid signature.
at crates/core/app/src/server/consensus.rs:210
in penumbra_mock_consensus::abci::deliver_tx
in penumbra_mock_consensus::block::execute with height: 1, time: 1708656909

we should make sure that this results in an error when calling BlockBuilder::execute.


hdevalence · 2024-02-23T16:46:55Z

No, I think that's the correct behavior — DeliverTx is fallible. Blocks can include transactions that fail to execute (and this has to be the case, because prior to PrepareProposal there's no way for the application to control transaction inclusion).

cratelyn · 2024-02-23T16:57:32Z

do you think that this should include the BlockBuilder::execute helper though? what i'm worried about with this issue is that, as part of my work on #3866, i noticed that the spend transaction was failing silently.

i expect that in the majority of "happy path" cases, we would want to know about a transaction that failed to execute, right?

for test cases that do want to send an invalid transaction to the app, they could still rely on the abci submodule, and send the deliver_tx request themselves, inspecting the response to confirm that e.g. it was ruled invalid as expected.

hdevalence · 2024-02-23T17:15:30Z

I'm not sure — and I'm not sure we should overthink it, since if we move to current ABCI this interface all changes anyways.

One question I have is, why is it important for a test to know whether the transaction succeeded or not? Wouldn't we want to be writing assertions about the effects of the transaction, rather than relying on a self-described success code? And in that case do we need this anyways?

For instance, in the spend case, wouldn't we want to write an assertion that, following the block, the nullifier is present in the state, that the MockClient detects the new output note, etc?
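To make the idea of asserting on effects concrete, here is a minimal sketch in Rust. Everything in it is a hypothetical toy model: `ToyState`, `Spend`, `execute_block`, and `spend_took_effect` are illustrative names, not the real penumbra state or `MockClient` APIs, which this thread does not show.

```rust
use std::collections::HashSet;

/// Hypothetical toy state: a nullifier set plus the notes a client has seen.
/// This is NOT the real penumbra state model.
#[derive(Default)]
pub struct ToyState {
    pub nullifiers: HashSet<u64>,
    pub notes: Vec<u64>,
}

/// Hypothetical spend: reveals one nullifier and creates one output note.
#[derive(Clone, Copy)]
pub struct Spend {
    pub nullifier: u64,
    pub output_note: u64,
}

/// Toy block execution: a spend whose nullifier is already present is
/// skipped instead of aborting the block.
pub fn execute_block(state: &mut ToyState, spends: &[Spend]) {
    for spend in spends {
        // `insert` returns false when the nullifier was already in the set.
        if state.nullifiers.insert(spend.nullifier) {
            state.notes.push(spend.output_note);
        }
    }
}

/// The assertion a test would make after the block: the nullifier is in
/// the state and the new output note is visible.
pub fn spend_took_effect(state: &ToyState, spend: &Spend) -> bool {
    state.nullifiers.contains(&spend.nullifier) && state.notes.contains(&spend.output_note)
}
```

A test built on this would execute a block containing the spend and then assert `spend_took_effect(&state, &spend)`, without consulting any status code.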

cratelyn · 2024-02-23T20:30:41Z

my guiding principle here is that testing tools should report errors eagerly, to make testing happy paths more ergonomic. to your point, swallowing this error, while the correct choice in production to keep the application running,

Well, it's less about whether it's the correct choice in one or the other context, and more that it's the behavior specified by ABCI. A failed transaction is not an error. It's an Ok(I rejected this), at least according to ABCI. The errors bubbled up from the consensus harness should match its expected behavior.
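The distinction between a rejected transaction and a harness error can be sketched roughly as follows. This is only an illustration under assumed names: `DeliverTxResponse`, `HarnessError`, and this toy `deliver_tx` are hypothetical stand-ins, not the actual types in `penumbra_mock_consensus` or the ABCI crates.

```rust
/// Hypothetical response to a DeliverTx request.
/// A code of 0 means accepted; any other code means rejected.
#[derive(Debug, PartialEq)]
pub struct DeliverTxResponse {
    pub code: u32,
    pub log: String,
}

/// Hypothetical failures of the harness itself, as opposed to a transaction.
#[derive(Debug, PartialEq)]
pub enum HarnessError {
    /// The application could not be reached at all.
    AppUnavailable,
}

/// Toy application: rejects empty transactions and those starting with a zero byte.
/// A rejection is still `Ok(..)`; only a harness failure is `Err(..)`.
pub fn deliver_tx(app_up: bool, tx: &[u8]) -> Result<DeliverTxResponse, HarnessError> {
    if !app_up {
        return Err(HarnessError::AppUnavailable);
    }
    match tx.first().copied() {
        // The application answered, and its answer is "I rejected this".
        None | Some(0) => Ok(DeliverTxResponse {
            code: 1,
            log: "rejected".to_string(),
        }),
        Some(_) => Ok(DeliverTxResponse {
            code: 0,
            log: String::new(),
        }),
    }
}
```

A test that deliberately submits an invalid transaction can then unwrap the `Ok` and check that `code` is non-zero, while an `Err` stays reserved for the harness failing to do its job.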

... places the burden on test writers to write these assertions. we should write assertions about the outcome of a transaction (it's part of the benefits you outlined here), but if it is a mandatory part of writing tests, it raises the burden of doing so, and lowers the ceiling of how much we ~~can~~ will test.

Yes, to be clear, I absolutely want this burden to be placed on test writers. It is required for the tests to be meaningful. Our goal is to ensure that when we do X, we see effect Y. The value of the test is having to write down an executable specification of what the expected behavior is.

if effects have to be manually inspected though, i fear that we might run into false positives. i want to avoid an interface that leads people to believe they are providing assurance, where an error isn't being surfaced. i only discovered that the spend wasn't being performed by running tests with --no-capture set.

But this is even more true if we have an expectation that tests don't specify their invariants. While you discovered that the spend wasn't being performed by seeing a return code, trusting the return code is also a false sense of assurance. What if the handler didn't write the nullifier properly and still reported that execution succeeded?
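The failure mode described here can be illustrated with a small sketch. The handler below is a deliberately buggy, hypothetical example (`buggy_spend_handler`, `passes_code_check`, and `passes_effect_check` are invented names, not penumbra code): it reports success without recording the nullifier.

```rust
use std::collections::HashSet;

/// Hypothetical buggy handler: returns status code 0 (success) but
/// forgets to write the nullifier into the state.
pub fn buggy_spend_handler(_nullifiers: &mut HashSet<u64>, _nullifier: u64) -> u32 {
    0
}

/// Weak check: trusts the self-reported status code.
pub fn passes_code_check(code: u32) -> bool {
    code == 0
}

/// Strong check: inspects the state for the expected effect.
pub fn passes_effect_check(nullifiers: &HashSet<u64>, nullifier: u64) -> bool {
    nullifiers.contains(&nullifier)
}
```

Against this handler the code check passes while the effect check fails, so only a test that asserts on the state would catch the bug.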

does propagating errors when execute is called prevent us from writing any particular tests? are invalid transactions part of the application's regular operation?

Yes, invalid transactions are expected to be handled by any ABCI application, including ours. We should not be defining our own new set of semantics. We don't get benefit from doing so, because it's already handled by the stronger assertions we need to be writing, and because this API is one we want to migrate away from anyways.

cratelyn added labels A-mock-consensus (Area: Relates to the mock consensus engine) and C-bug (Category: a bug) Feb 23, 2024

cratelyn self-assigned this Feb 23, 2024

cratelyn mentioned this issue Feb 23, 2024

Tracking issue: 🤝 mock consensus engine for App tests #3588

Closed

cratelyn closed this as not planned Feb 26, 2024

cratelyn mentioned this issue Feb 26, 2024

app: 💸 a mock consensus spend is performed #3891

Merged
