Evaluate Tendermock #2173

mmulji-ic · 2023-02-10T16:03:47Z

Summary

Use Cases

Evaluate the Tendermock testing tool based on the following testing use cases:
- Testing single-chain applications under simple behaviour:
  - Start a chain with a single validator
  - Send transactions, and make sure they are included
  - Query application state
- Testing single-chain applications under complex behaviour
  - Start a chain with multiple validators
  - Cause a validator to double-sign
  - Cause a validator to go down (and thus get slashed)
  - Run several versions of the application in parallel to check compatibility, in particular upgrading one app instance while running and leaving other instances unchanged
- Testing multi-chain applications
  - Starting a provider chain and several consumer chains, each with multiple validators
  - Starting a relayer
  - Establishing an IBC connection between the provider and many consumers
  - Sending transactions and querying states on all chains
  - Run different ICS versions between consumer and provider
  - Run consumer upgrades
  - Run consumer chain removals, either due to proposals or IBC channel timeouts
Outline missing functionality that is required by the above use cases.

Single Chain Simple

Implement necessary missing RPC endpoints (see https://docs.cometbft.com/v0.34/rpc/): block, validators
Store old blocks and validator sets
Keep track of validator updates
- Need functionality to convert PubKeys to addresses, like https://github.com/cometbft/cometbft/blob/dedfda495300b87a748bbf470f17d9c7cabbaddd/crypto/ed25519/ed25519.go#L156
Properly implement the return values for the broadcast modes sync, commit, and async
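
As a minimal sketch of the PubKey-to-address conversion mentioned above: the linked CometBFT function derives an ed25519 address as the first 20 bytes of the SHA-256 hash of the raw public key. The function name below is hypothetical and is not part of Tendermock.

```python
import hashlib

ED25519_PUBKEY_SIZE = 32  # raw ed25519 public keys are 32 bytes
ADDRESS_SIZE = 20         # CometBFT addresses are 20 bytes


def ed25519_pubkey_to_address(pubkey: bytes) -> bytes:
    """Return the validator address for a raw ed25519 public key.

    Hypothetical helper: SHA-256 of the key, truncated to 20 bytes.
    """
    if len(pubkey) != ED25519_PUBKEY_SIZE:
        raise ValueError(f"expected {ED25519_PUBKEY_SIZE} bytes, got {len(pubkey)}")
    return hashlib.sha256(pubkey).digest()[:ADDRESS_SIZE]


# Example with a dummy key; real keys come from the validator updates.
dummy_key = bytes(range(32))
address_hex = ed25519_pubkey_to_address(dummy_key).hex().upper()
```

The RPC responses show addresses as uppercase hex, which is why the example converts the bytes that way.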

Single Chain Complex

Connect Tendermock to multiple applications
Add an API for functionality beyond the Tendermint RPC endpoints:
- cause double-sign
- force downtime

Multi Chain

Requires the functionality listed for the other 2 use cases.
Outline the steps by which, if appropriate, Tendermock can be integrated into the current Gaia / ICS test suite.
- Start validators with flags --transport=grpc --with-tendermint=false
- Skip starting the query node
- Start Tendermock, passing it the listen addresses of all the validator nodes, and having it listen on tendermock_ip:tendermock_port
- When querying or submitting transactions, use the flag --node tcp://tendermock_ip:tendermock_port
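
The steps above can be sketched as argument lists that a test harness might pass to a process launcher. This is only an illustration: the binary name, home directory, IP, port, and account address are placeholder assumptions, and the Tendermock invocation itself is left out because its command line is not described here.

```python
def validator_start_args(binary: str, home: str) -> list[str]:
    """Arguments to start an app instance without a built-in consensus node."""
    return [
        binary, "start",
        "--home", home,
        "--transport=grpc",          # serve ABCI over gRPC so Tendermock can connect
        "--with-tendermint=false",   # do not start the in-process consensus engine
    ]


def client_args(binary: str, subcommand: list[str],
                tendermock_ip: str, tendermock_port: int) -> list[str]:
    """Arguments for a query or tx command routed through Tendermock."""
    node = f"tcp://{tendermock_ip}:{tendermock_port}"
    return [binary, *subcommand, "--node", node]


# Placeholder values, chosen only for illustration.
start = validator_start_args("interchain-security-pd", "/provider/validatoralice")
query = client_args(
    "interchain-security-pd",
    ["query", "bank", "balances", "cosmos1exampleaddress"],
    "127.0.0.1", 26657,
)
```

Building the lists separately from launching the processes keeps the flag handling easy to check in isolation.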

Informal Systems variant: https://github.com/informalsystems/tendermock
Another developed at EPFL: https://github.com/CharlyCst/tendermock

Problem Definition

Mocking out the consensus layer means we can skip
waiting for the network communication in the potentially multiple rounds
needed for CometBFT to come to consensus.
This can make the existing tests, which run full nodes, faster.

Looking further, it can also enable writing tests with new functionality,
as Tendermock can mock arbitrary timestamps coming from Comet,
and we can finely control which validators sign and don't sign which blocks.
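
To illustrate what this kind of control could look like, here is a rough sketch of a planner that picks the block timestamp and the set of signing validators for each height. The class and method names are hypothetical; this is not Tendermock's actual API.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta, timezone


@dataclass
class MockBlockPlanner:
    """Hypothetical helper deciding block time and signers per height."""
    validators: list[str]
    start_time: datetime
    block_interval: timedelta = timedelta(seconds=5)
    # validator name -> heights at which it does not sign
    down: dict[str, set[int]] = field(default_factory=dict)

    def force_downtime(self, validator: str, heights) -> None:
        """Mark a validator as not signing at the given heights."""
        self.down.setdefault(validator, set()).update(heights)

    def plan(self, height: int) -> tuple[datetime, list[str]]:
        """Return the mocked timestamp and the signers for a height (1-based)."""
        timestamp = self.start_time + self.block_interval * (height - 1)
        signers = [v for v in self.validators
                   if height not in self.down.get(v, set())]
        return timestamp, signers


planner = MockBlockPlanner(
    validators=["val-a", "val-b", "val-c"],
    start_time=datetime(2023, 4, 12, tzinfo=timezone.utc),
)
planner.force_downtime("val-b", range(2, 4))  # val-b misses heights 2 and 3
```

Because the timestamps are computed rather than read from a clock, a test could jump far ahead in time (for example past an unbonding period) by choosing a large `block_interval`.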

Closing Criterion

A decision is made regarding whether or not to use Tendermock in the QA process.

For Admin Use

Not duplicate issue
Appropriate labels applied
Appropriate contributors tagged
Contributor assigned/self-assigned
Is a spike necessary to map out how the issue should be approached?

p-offtermatt · 2023-04-05T11:45:21Z

@smarshall-spitzbart @sainoe I added some use cases for Tendermock. Please double-check, and also add any other (or more concrete) use cases you have in mind.

shaspitz · 2023-04-11T21:50:04Z

These use cases are solid. One critical use case under the multi-chain testing section would be testing upgrades of consumers. Another is testing incompatibilities / backwards compatibility of ICS versions between consumer and provider.

sainoe · 2023-04-12T07:24:04Z

This list of use cases looks pretty complete! To add to Shawn's comment, I could see Gaia upgrade tests using Tendermock. Also, in the Testing multi-chain applications section, I'd consider adding use cases for consumer chain removals due to governance proposals or channel conditions.

p-offtermatt · 2023-04-12T13:34:49Z

Outcome of the evaluation

I worked on a prototype for the integration of Tendermock into the e2e tests on https://github.com/cosmos/interchain-security/tree/ph/integrate-tendermock.

Major changes to run Tendermock are in https://github.com/cosmos/interchain-security/blob/ph/integrate-tendermock/tests/integration/testnet-scripts/start-chain.sh
and https://github.com/cosmos/interchain-security/blob/ph/integrate-tendermock/Dockerfile

I have a simple example working which just performs 200 token transfers with 2 validators.
It runs in 1m31.579475667s with Tendermock,
and 2m19.789677167s with CometBFT underneath.

With 3 validators:
Tendermock: 1m39.053192667s
CometBFT: 2m38.104231666s
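
To put the timings above in relative terms, the following sketch parses the reported durations and computes the speedup and the share of wall-clock time saved. Only the four measurements come from the runs above; the helper names are made up for this example.

```python
import re


def parse_duration(text: str) -> float:
    """Parse a Go-style duration such as '1m31.579475667s' into seconds."""
    match = re.fullmatch(r"(?:(\d+)m)?(\d+(?:\.\d+)?)s", text)
    if match is None:
        raise ValueError(f"unsupported duration: {text!r}")
    minutes = int(match.group(1) or 0)
    return minutes * 60 + float(match.group(2))


# validators -> (Tendermock run, CometBFT run), as reported above
RUNS = {
    2: ("1m31.579475667s", "2m19.789677167s"),
    3: ("1m39.053192667s", "2m38.104231666s"),
}


def compare(tendermock: str, cometbft: str) -> tuple[float, float]:
    """Return (speedup factor, percent of wall-clock time saved)."""
    mock_s, comet_s = parse_duration(tendermock), parse_duration(cometbft)
    return comet_s / mock_s, 100 * (1 - mock_s / comet_s)


for validators, (tendermock, cometbft) in RUNS.items():
    speedup, saved = compare(tendermock, cometbft)
    print(f"{validators} validators: {speedup:.2f}x faster, "
          f"{saved:.1f}% less wall-clock time")
```

For these measurements the Tendermock runs take roughly 34% (2 validators) and 37% (3 validators) less wall-clock time than the CometBFT runs.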

That seems like a modest but potentially significant boost, especially since it suggests that Tendermock scales better as validators are added: going from 2 to 3 validators added about 7.5s with Tendermock versus about 18.3s with CometBFT. This is even without considering that Tendermock can also simulate environments with many more validators without the cost of actually running more instances of the application. Also, I think using a language other than Python might help speed this up.

The real e2e tests are not running yet, because Tendermock does not properly support all of Tendermint's RPC endpoints.

Open questions are:

- Can Tendermock be sped up? My feeling is that something is slower than it needs to be. I experimented a bit, and Tendermock itself takes only about 0.1s per block from receipt of the broadcast request to finish. The slowdown seems to come before that, maybe in the SDK logic for signing and preparing transactions.
- How long would integrating Tendermock properly take? My estimate is that getting it running with the full suite of existing e2e tests would take upwards of 10 days, but I would need some time to split the work into smaller chunks to estimate properly.
- Is integrating Tendermock worth it? My feeling is yes for an e2e suite that we anticipate using for a longer time, but it is probably not worth doing properly with the current setup if we are looking towards a refactor anyway.
- Should Tendermock rather be written as a Go library? My feeling is that it might make sense, just to have it be faster and potentially easier to maintain, and it might also be easier to pitch to other teams for their tests. It could also reuse a lot of code from CometBFT, probably including much of the state update logic.

Investigating a bit more, I tried to implement some of the missing functionality for Tendermock. I think it makes little sense to have it in Python, since this would mean reimplementing a lot of the logic from CometBFT (state updates, cryptography, validator set updates, etc.) for almost no benefit. Implementing this in Go would take advantage of the existing codebase.

p-offtermatt · 2023-04-18T14:39:46Z

I wrote this issue up in Google Docs so it is a bit easier to review and better summarized.
A link to the writeup is here:
https://docs.google.com/document/d/1IcHLLu-Z6H8CGsCrEGwDC55TYv93RgKTmY9c5xJra0k/edit?usp=sharing

Feel free to leave comments there.

mmulji-ic assigned shaspitz Feb 10, 2023

mpoke unassigned shaspitz Feb 23, 2023

mpoke assigned p-offtermatt Mar 6, 2023

mpoke added the "general questions" and "scope: testing" (Code review, testing, making sure the code is following the specification.) labels Apr 12, 2023

mpoke mentioned this issue Apr 14, 2023

Improve Gaia QA process #2406

Closed

mpoke closed this as completed Apr 26, 2023

MSalopek mentioned this issue May 15, 2023

Use verbose in ci to prioritize humans over machines cosmos/interchain-security#957

Closed
