feat: improvements to big block tests #3493
Signed-off-by: Smuu <[email protected]>
err = n.Instance.Commit()
if err != nil {
	return fmt.Errorf("committing instance: %w", err)
}

if err = n.Instance.AddFolder(nodeDir, remoteRootDir, "10001:10001"); err != nil {
	return fmt.Errorf("copying over node %s directory: %w", n.Name, err)
}
Adding files after commit speeds up the set-up process
go.mod
Outdated
 require (
 	cosmossdk.io/errors v1.0.1
 	cosmossdk.io/math v1.3.0
 	github.com/celestiaorg/blobstream-contracts/v3 v3.1.0
 	github.com/celestiaorg/go-square v1.0.1
 	github.com/celestiaorg/go-square/merkle v0.0.0-20240117232118-fd78256df076
-	github.com/celestiaorg/knuu v0.13.2
+	github.com/celestiaorg/knuu v0.13.3-rc.9
Using the latest fixes and improvements
-	// copy over the keyring directory to the txsim instance
-	err = txsim.Instance.AddFolder(txsimKeyringDir, txsimRootDir, "10001:10001")
+	err = txsim.Instance.Commit()
 	if err != nil {
 		log.Err(err).
 			Str("directory", txsimKeyringDir).
 			Str("name", name).
-			Msg("error adding keyring dir to txsim")
+			Msg("error committing txsim")
 		return err
 	}
-	err = txsim.Instance.Commit()
+
+	// copy over the keyring directory to the txsim instance
+	err = txsim.Instance.AddFolder(txsimKeyringDir, txsimRootDir, "10001:10001")
 	if err != nil {
 		log.Err(err).
 			Str("directory", txsimKeyringDir).
 			Str("name", name).
-			Msg("error committing txsim")
+			Msg("error adding keyring dir to txsim")
 		return err
 	}
Adding files after commit speeds up the set-up process
test/e2e/testnet/testnet.go
Outdated
 	// start genesis nodes asynchronously
 	for _, node := range genesisNodes {
-		err := node.Start()
+		err := node.Instance.StartWithoutWait()
 		if err != nil {
 			return fmt.Errorf("node %s failed to start: %w", node.Name, err)
 		}
 	}
 	// wait for instances to be running
 	for _, node := range genesisNodes {
-		client, err := node.Client()
+		err := node.Instance.WaitInstanceIsRunning()
 		if err != nil {
-			return fmt.Errorf("failed to initialize node %s: %w", node.Name, err)
+			return fmt.Errorf("node %s failed to start: %w", node.Name, err)
 		}
 	}
We can start all nodes asynchronously and then wait for all of them to be running, in contrast to starting and waiting for them one by one.
Absolutely, it is very useful!
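The start-then-wait pattern discussed above can be sketched as follows. This is a minimal illustration, not the knuu API: the `instance` type, its simulated boot delay, and the `timeout` parameter are assumptions made so the example runs on its own; only the method names echo the ones in the diff.

```go
package main

import (
	"fmt"
	"time"
)

// instance is a hypothetical stand-in for a knuu instance; it is NOT the knuu API.
type instance struct {
	name  string
	ready chan struct{}
}

// StartWithoutWait kicks off the start-up in the background and returns immediately.
func (i *instance) StartWithoutWait() error {
	i.ready = make(chan struct{})
	go func() {
		time.Sleep(10 * time.Millisecond) // simulated boot time (assumption)
		close(i.ready)
	}()
	return nil
}

// WaitInstanceIsRunning blocks until the instance is ready or the timeout expires.
func (i *instance) WaitInstanceIsRunning(timeout time.Duration) error {
	select {
	case <-i.ready:
		return nil
	case <-time.After(timeout):
		return fmt.Errorf("instance %s not running after %s", i.name, timeout)
	}
}

// startAll triggers every start first and only then waits, so boot times overlap
// instead of adding up.
func startAll(nodes []*instance, timeout time.Duration) error {
	for _, n := range nodes {
		if err := n.StartWithoutWait(); err != nil {
			return fmt.Errorf("node %s failed to start: %w", n.name, err)
		}
	}
	for _, n := range nodes {
		if err := n.WaitInstanceIsRunning(timeout); err != nil {
			return fmt.Errorf("node %s failed to start: %w", n.name, err)
		}
	}
	return nil
}

func main() {
	nodes := []*instance{{name: "val0"}, {name: "val1"}, {name: "val2"}}
	if err := startAll(nodes, time.Second); err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("all nodes running")
}
```

With N nodes, the total wait in this sketch is roughly the slowest single boot rather than the sum of all boots.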
@smuu can you please provide a PR description? Did you mean to tag @staheri14 for review b/c this PR merges to her branch?
Thanks @smuu for the PR!
AFAIK, the purpose of this PR is to provide fixes for some issues that occur when using the latest release candidate of knuu for the big block tests (and e2e tests in general). That is why it is based off
🚀
Left some comments to understand more.
Also, if we want these changes, maybe it makes sense to make an official knuu release instead of depending on an RC
test/e2e/testnet/node.go
Outdated
if errors.Is(err, context.Canceled) || errors.Is(err, context.DeadlineExceeded) {
	return 0, err
}

return 0, fmt.Errorf("error getting height: %w", err)
[nit]
What's the difference between the two returns? One adds the string "error getting height:", but does it matter? Is it being parsed somewhere?
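For reference, the observable difference between the two return paths can be sketched as below. This does not claim to capture the author's intent; `wrapHeightErr` and `demo` are hypothetical helpers written only for this illustration. The unwrapped return keeps the original error value, so a direct `==` comparison still works, while the `%w`-wrapped return changes the message and only matches through `errors.Is`.

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// wrapHeightErr is a hypothetical helper that mirrors the two return paths
// from the diff: context errors pass through untouched, everything else is
// wrapped with a prefix.
func wrapHeightErr(err error) error {
	if errors.Is(err, context.Canceled) || errors.Is(err, context.DeadlineExceeded) {
		return err
	}
	return fmt.Errorf("error getting height: %w", err)
}

// demo reports three observations used in main.
func demo() (sameValue bool, wrappedMsg string, stillMatches bool) {
	// Pass-through: the caller gets the identical sentinel value back.
	sameValue = wrapHeightErr(context.DeadlineExceeded) == context.DeadlineExceeded

	// Wrapped: the message gains the prefix.
	wrappedMsg = wrapHeightErr(errors.New("connection refused")).Error()

	// A %w-wrapped context error would still satisfy errors.Is.
	wrapped := fmt.Errorf("error getting height: %w", context.DeadlineExceeded)
	stillMatches = errors.Is(wrapped, context.DeadlineExceeded)
	return
}

func main() {
	same, msg, matches := demo()
	fmt.Println(same)    // true
	fmt.Println(msg)     // error getting height: connection refused
	fmt.Println(matches) // true
}
```

So callers that use `errors.Is` would behave the same either way; the early return only matters for callers that compare error values directly or inspect the message text.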
resultData, ok := result["result"].(map[string]interface{})
if !ok {
	return 0, fmt.Errorf("error getting result from status")
}
syncInfo, ok := resultData["sync_info"].(map[string]interface{})
if !ok {
	return 0, fmt.Errorf("error getting sync info from status")
}
latestBlockHeight, ok := syncInfo["latest_block_height"].(string)
if !ok {
	return 0, fmt.Errorf("error getting latest block height from sync info")
}
latestBlockHeightInt, err := strconv.ParseInt(latestBlockHeight, 10, 64)
if err != nil {
	return 0, fmt.Errorf("error converting latest block height to int: %w", err)
}
Instead of these gymnastics, why not parse the whole response into ResultStatus from https://github.com/celestiaorg/celestia-core/blob/a281e871c70f4c8bd4f6ab7c7e39e28c715d65b2/rpc/core/types/responses.go#L123-L128 and get the field you want from it?
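The idea of decoding into a typed struct rather than walking nested maps can be sketched like this. To keep the example self-contained it uses a local, simplified `statusResponse` type that only assumes the JSON field names visible in the code above (`result`, `sync_info`, `latest_block_height`); it is a stand-in, not the celestia-core `ResultStatus` type, and `latestHeight` is a hypothetical name.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strconv"
)

// statusResponse is a local stand-in (assumption) covering only the fields
// needed here; the real type lives in celestia-core.
type statusResponse struct {
	Result struct {
		SyncInfo struct {
			// The height arrives as a JSON string, as in the original code.
			LatestBlockHeight string `json:"latest_block_height"`
		} `json:"sync_info"`
	} `json:"result"`
}

// latestHeight decodes the status payload and converts the height to int64.
func latestHeight(status string) (int64, error) {
	var resp statusResponse
	if err := json.Unmarshal([]byte(status), &resp); err != nil {
		return 0, fmt.Errorf("error unmarshalling status: %w", err)
	}
	raw := resp.Result.SyncInfo.LatestBlockHeight
	if raw == "" {
		return 0, fmt.Errorf("latest block height missing from status")
	}
	height, err := strconv.ParseInt(raw, 10, 64)
	if err != nil {
		return 0, fmt.Errorf("error converting latest block height to int: %w", err)
	}
	return height, nil
}

func main() {
	sample := `{"jsonrpc":"2.0","id":-1,"result":{"sync_info":{"latest_block_height":"42"}}}`
	fmt.Println(latestHeight(sample)) // 42 <nil>
}
```

The three type assertions and their error branches collapse into a single `json.Unmarshal` call plus one emptiness check.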
type JSONRPCError struct {
	Code    int
	Message string
	Data    string
}

func (e *JSONRPCError) Error() string {
	return fmt.Sprintf("JSONRPC Error - Code: %d, Message: %s, Data: %s", e.Code, e.Message, e.Data)
}

func getStatus(executor *knuu.Executor, app *knuu.Instance) (string, error) {
	nodeIP, err := app.GetIP()
	if err != nil {
		return "", fmt.Errorf("error getting node ip: %w", err)
	}

	ctx, cancel := context.WithTimeout(context.Background(), 5*time.Minute)
	defer cancel()
	status, err := executor.ExecuteCommandWithContext(ctx, "wget", "-q", "-O", "-", fmt.Sprintf("%s:26657/status", nodeIP))
	if err != nil {
		return "", fmt.Errorf("error executing command: %w", err)
	}
	return status, nil
}

func latestBlockHeightFromStatus(status string) (int64, error) {
	var result map[string]interface{}
	err := json.Unmarshal([]byte(status), &result)
	if err != nil {
		return 0, fmt.Errorf("error unmarshalling status: %w", err)
	}

	if errorField, ok := result["error"]; ok {
		errorData, ok := errorField.(map[string]interface{})
		if !ok {
			return 0, fmt.Errorf("error field exists but is not a map[string]interface{}")
		}
		jsonError := &JSONRPCError{}
		if errorCode, ok := errorData["code"].(float64); ok {
			jsonError.Code = int(errorCode)
		}
		if errorMessage, ok := errorData["message"].(string); ok {
			jsonError.Message = errorMessage
		}
		if errorData, ok := errorData["data"].(string); ok {
			jsonError.Data = errorData
		}
		return 0, jsonError
	}
Similarly, instead of defining these here and doing all of the heavy lifting, you could use these existing types: https://github.com/celestiaorg/celestia-core/blob/a281e871c70f4c8bd4f6ab7c7e39e28c715d65b2/rpc/jsonrpc/types/types.go#L140-L159
And you could simply parse the response from the command using:
recv := new(types.RPCResponse)
err = json.Unmarshal([]byte(status), recv)
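A self-contained sketch of that envelope-style decoding follows. The `rpcResponse` and `rpcError` types here are local stand-ins shaped like a generic JSON-RPC 2.0 response, assumed for illustration; they are not the celestia-core `types.RPCResponse`, and `decodeResponse` is a hypothetical helper name.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// rpcError is a local stand-in (assumption) for a JSON-RPC error object.
type rpcError struct {
	Code    int    `json:"code"`
	Message string `json:"message"`
	Data    string `json:"data,omitempty"`
}

// Error makes *rpcError usable as a Go error, as JSONRPCError does above.
func (e *rpcError) Error() string {
	return fmt.Sprintf("JSONRPC Error - Code: %d, Message: %s, Data: %s", e.Code, e.Message, e.Data)
}

// rpcResponse keeps the result as raw JSON so the caller can decode it
// into whatever type the endpoint returns.
type rpcResponse struct {
	Result json.RawMessage `json:"result,omitempty"`
	Error  *rpcError       `json:"error,omitempty"`
}

// decodeResponse returns the raw result, or the RPC error if one is present.
func decodeResponse(status string) (json.RawMessage, error) {
	recv := new(rpcResponse)
	if err := json.Unmarshal([]byte(status), recv); err != nil {
		return nil, fmt.Errorf("error unmarshalling status: %w", err)
	}
	if recv.Error != nil {
		return nil, recv.Error
	}
	return recv.Result, nil
}

func main() {
	failed := `{"jsonrpc":"2.0","id":-1,"error":{"code":-32601,"message":"Method not found"}}`
	if _, err := decodeResponse(failed); err != nil {
		fmt.Println(err)
	}
}
```

The field-by-field type assertions on the `error` map are replaced by struct tags, and a missing `error` key simply leaves the pointer nil.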
err = txsim.Instance.Commit()

// copy over the keyring directory to the txsim instance
err = txsim.Instance.AddFolder(txsimKeyringDir, txsimRootDir, "10001:10001")
[microscopic nit]
It would be nice to define some constants for the UID and GID and explain in their docs that these are the permissions given in the celestia-app Docker images.
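One possible shape for those constants is sketched below. The names `containerUID`, `containerGID`, and `chownSpec` are hypothetical; the value 10001 is taken from the `"10001:10001"` literal in the diff, and the doc comments repeat the reviewer's statement about the Docker images rather than an independently verified fact.

```go
package main

import "fmt"

const (
	// containerUID is the user ID that, per the review comment, the
	// celestia-app Docker images run as (hypothetical constant name).
	containerUID = 10001
	// containerGID is the matching group ID (hypothetical constant name).
	containerGID = 10001
)

// chownSpec formats a uid/gid pair in the "uid:gid" form that the
// AddFolder calls in the diff receive as their last argument.
func chownSpec(uid, gid int) string {
	return fmt.Sprintf("%d:%d", uid, gid)
}

func main() {
	fmt.Println(chownSpec(containerUID, containerGID)) // 10001:10001
}
```

Call sites could then pass `chownSpec(containerUID, containerGID)` instead of repeating the string literal.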
…u/improvements-to-big-block-tests
We're making some of these improvements in my PR already: #3487
…u/improvements-to-big-block-tests
log.Println("Reading blockchain")
blockchain, err := testnode.ReadBlockHeights(context.Background(),
	b.Node(0).AddressRPC(), 1, 5)
Before, the chain was only read until height 5. As the txsim instances start after the validators, the first blocks do not contain any txs. Now, we are reading the whole blockchain.
Thanks Samuel, it was intentional to reduce the height to 5, as otherwise it would take a long time to read the blockchain, especially for large blocks, and would fail from one run to another. But if it works for the entire blockchain, that would be even better, thanks!
 	for _, node := range genesisNodes {
-		err := node.Start()
+		err := node.StartAsync()
To speed up the start-up phase, we can start all validator instances asynchronously and then wait for them to be started.
for _, node := range t.nodes {
	if node.Instance.IsInState(knuu.Started) {
		if err := node.Instance.Stop(); err != nil {
			log.Err(err).
				Str("name", node.Name).
				Msg("node failed to stop")
			continue
		}
		if err := node.Instance.WaitInstanceIsStopped(); err != nil {
			log.Err(err).
				Str("name", node.Name).
				Msg("node failed to stop")
			continue
		}
	}
	if node.Instance.IsInState(knuu.Started, knuu.Stopped) {
		err := node.Instance.Destroy()
		if err != nil {
			log.Err(err).
				Str("name", node.Name).
				Msg("node failed to cleanup")
		}
There is no need to stop the instance before destroying it; we can simply destroy it.
Ignore this, I reverted those improvements.
3842cf1
into
celestiaorg:sanaz/big-block-test
@smuu I apologize for any confusion. I did not intend to unilaterally close the PR. My goal was to merge the changes in this branch into sanaz/big-block-test as you requested. I'm not sure how the PR ended up getting closed.
Closes #3480 This PR backports many optimizations introduced in the big block tests via #3493 to the `main` branch. Will be ready for review after merging #3514 --------- Co-authored-by: Evan Forbes <[email protected]>
Closes #3480 This PR backports many optimizations introduced in the big block tests via celestiaorg/celestia-app#3493 to the `main` branch. Will be ready for review after merging celestiaorg/celestia-app#3514 --------- Co-authored-by: Evan Forbes <[email protected]>
Overview
This PR is for sanaz/big-block-test to debug the issues and to speed up the testing process. @staheri14