Skip to content
This repository has been archived by the owner on Apr 4, 2024. It is now read-only.

imp(rpc) Add support for EVM RPC metrics #1303

Closed

Conversation

v-homsi
Copy link
Contributor

@v-homsi v-homsi commented Aug 26, 2022

Closes: #XXX

Description

This PR exposes EVM RPC metrics similar to the ones exposed by go-ethereum.

The metrics correspond to:

  1. Request duration
  2. Count of successful calls
  3. Count of failed calls
  4. The RPC request calls

This PR creates a separate metrics server to serve the metrics for RPC and will be used to expose other EVM/ application metrics in the future.

Based on this discussion, three approaches are identified to exposing the metrics:

  1. Enable the metrics in go-ethereum by passing a hardcoded flag in the CLI --metrics. This option will expose metrics that are not relevant to Ethermint and does not allow for a config file option.
  2. Intercept the request after or before it's served. This approach would require parsing the request a second time (first time it's parsed in go-ethereum) and Ethermint would not be able to know whether it's successful or not.
  3. Migrate the go-ethereum RPC code to Ethermint and make adjustments to the metrics as needed. This is the approach chosen in this PR.

Sample metrics:

# TYPE rpc_duration_all_count counter
rpc_duration_all_count 19954

# TYPE rpc_duration_all summary
rpc_duration_all {quantile="0.5"} 1.1972705e+06
rpc_duration_all {quantile="0.75"} 8.7099585e+06
rpc_duration_all {quantile="0.95"} 8.960733589999996e+07
rpc_duration_all {quantile="0.99"} 1.0390894278e+08
rpc_duration_all {quantile="0.999"} 1.14853960718e+08
rpc_duration_all {quantile="0.9999"} 1.1498575e+08

# TYPE rpc_duration_eth_blockNumber_success_count counter
rpc_duration_eth_blockNumber_success_count 2

# TYPE rpc_duration_eth_blockNumber_success summary
rpc_duration_eth_blockNumber_success {quantile="0.5"} 507
rpc_duration_eth_blockNumber_success {quantile="0.75"} 681
rpc_duration_eth_blockNumber_success {quantile="0.95"} 681
rpc_duration_eth_blockNumber_success {quantile="0.99"} 681
rpc_duration_eth_blockNumber_success {quantile="0.999"} 681
rpc_duration_eth_blockNumber_success {quantile="0.9999"} 681

# TYPE rpc_duration_eth_chainId_success_count counter
rpc_duration_eth_chainId_success_count 34

# TYPE rpc_duration_eth_chainId_success summary
rpc_duration_eth_chainId_success {quantile="0.5"} 1133.5
rpc_duration_eth_chainId_success {quantile="0.75"} 2806.75
rpc_duration_eth_chainId_success {quantile="0.95"} 6268.25
rpc_duration_eth_chainId_success {quantile="0.99"} 6272
rpc_duration_eth_chainId_success {quantile="0.999"} 6272
rpc_duration_eth_chainId_success {quantile="0.9999"} 6272

# TYPE rpc_duration_eth_estimateGas_success_count counter
rpc_duration_eth_estimateGas_success_count 2

# TYPE rpc_duration_eth_estimateGas_success summary
rpc_duration_eth_estimateGas_success {quantile="0.5"} 9705.5
rpc_duration_eth_estimateGas_success {quantile="0.75"} 9715
rpc_duration_eth_estimateGas_success {quantile="0.95"} 9715
rpc_duration_eth_estimateGas_success {quantile="0.99"} 9715
rpc_duration_eth_estimateGas_success {quantile="0.999"} 9715
rpc_duration_eth_estimateGas_success {quantile="0.9999"} 9715

# TYPE rpc_duration_eth_gasPrice_success_count counter
rpc_duration_eth_gasPrice_success_count 3

# TYPE rpc_duration_eth_gasPrice_success summary
rpc_duration_eth_gasPrice_success {quantile="0.5"} 1236
rpc_duration_eth_gasPrice_success {quantile="0.75"} 1362
rpc_duration_eth_gasPrice_success {quantile="0.95"} 1362
rpc_duration_eth_gasPrice_success {quantile="0.99"} 1362
rpc_duration_eth_gasPrice_success {quantile="0.999"} 1362
rpc_duration_eth_gasPrice_success {quantile="0.9999"} 1362

# TYPE rpc_duration_eth_getBalance_success_count counter
rpc_duration_eth_getBalance_success_count 0

# TYPE rpc_duration_eth_getBalance_success summary
rpc_duration_eth_getBalance_success {quantile="0.5"} 0
rpc_duration_eth_getBalance_success {quantile="0.75"} 0
rpc_duration_eth_getBalance_success {quantile="0.95"} 0
rpc_duration_eth_getBalance_success {quantile="0.99"} 0
rpc_duration_eth_getBalance_success {quantile="0.999"} 0
rpc_duration_eth_getBalance_success {quantile="0.9999"} 0

# TYPE rpc_duration_eth_getBlockByNumber_success_count counter
rpc_duration_eth_getBlockByNumber_success_count 3

# TYPE rpc_duration_eth_getBlockByNumber_success summary
rpc_duration_eth_getBlockByNumber_success {quantile="0.5"} 1623
rpc_duration_eth_getBlockByNumber_success {quantile="0.75"} 1784
rpc_duration_eth_getBlockByNumber_success {quantile="0.95"} 1784
rpc_duration_eth_getBlockByNumber_success {quantile="0.99"} 1784
rpc_duration_eth_getBlockByNumber_success {quantile="0.999"} 1784
rpc_duration_eth_getBlockByNumber_success {quantile="0.9999"} 1784

# TYPE rpc_duration_eth_getTransactionByHash_success_count counter
rpc_duration_eth_getTransactionByHash_success_count 99

# TYPE rpc_duration_eth_getTransactionByHash_success summary
rpc_duration_eth_getTransactionByHash_success {quantile="0.5"} 2151
rpc_duration_eth_getTransactionByHash_success {quantile="0.75"} 3531
rpc_duration_eth_getTransactionByHash_success {quantile="0.95"} 5698
rpc_duration_eth_getTransactionByHash_success {quantile="0.99"} 6231
rpc_duration_eth_getTransactionByHash_success {quantile="0.999"} 6231
rpc_duration_eth_getTransactionByHash_success {quantile="0.9999"} 6231

# TYPE rpc_duration_eth_getTransactionCount_success_count counter
rpc_duration_eth_getTransactionCount_success_count 1136

# TYPE rpc_duration_eth_getTransactionCount_success summary
rpc_duration_eth_getTransactionCount_success {quantile="0.5"} 4439
rpc_duration_eth_getTransactionCount_success {quantile="0.75"} 6791.25
rpc_duration_eth_getTransactionCount_success {quantile="0.95"} 10646.649999999996
rpc_duration_eth_getTransactionCount_success {quantile="0.99"} 12842.739999999998
rpc_duration_eth_getTransactionCount_success {quantile="0.999"} 14244.807999999999
rpc_duration_eth_getTransactionCount_success {quantile="0.9999"} 14252

# TYPE rpc_duration_eth_getTransactionReceipt_success_count counter
rpc_duration_eth_getTransactionReceipt_success_count 854

# TYPE rpc_duration_eth_getTransactionReceipt_success summary
rpc_duration_eth_getTransactionReceipt_success {quantile="0.5"} 95
rpc_duration_eth_getTransactionReceipt_success {quantile="0.75"} 434.25
rpc_duration_eth_getTransactionReceipt_success {quantile="0.95"} 18385.749999999978
rpc_duration_eth_getTransactionReceipt_success {quantile="0.99"} 36295.9
rpc_duration_eth_getTransactionReceipt_success {quantile="0.999"} 42889
rpc_duration_eth_getTransactionReceipt_success {quantile="0.9999"} 42889

# TYPE rpc_duration_eth_sendRawTransaction_success_count counter
rpc_duration_eth_sendRawTransaction_success_count 5

# TYPE rpc_duration_eth_sendRawTransaction_success summary
rpc_duration_eth_sendRawTransaction_success {quantile="0.5"} 1169
rpc_duration_eth_sendRawTransaction_success {quantile="0.75"} 1505.5
rpc_duration_eth_sendRawTransaction_success {quantile="0.95"} 1556
rpc_duration_eth_sendRawTransaction_success {quantile="0.99"} 1556
rpc_duration_eth_sendRawTransaction_success {quantile="0.999"} 1556
rpc_duration_eth_sendRawTransaction_success {quantile="0.9999"} 1556

# TYPE rpc_failure gauge
rpc_failure 0

# TYPE rpc_requests gauge
rpc_requests 19954

# TYPE rpc_success gauge
rpc_success 19954


For contributor use:

  • Targeted PR against correct branch (see CONTRIBUTING.md)
  • Linked to Github issue with discussion and accepted design OR link to spec that describes this work.
  • Code follows the module structure standards.
  • Wrote unit and integration tests
  • Updated relevant documentation (docs/) or specification (x/<module>/spec/)
  • Added relevant godoc comments.
  • Added a relevant changelog entry to the Unreleased section in CHANGELOG.md
  • Re-reviewed Files changed in the Github PR explorer

For admin use:

  • Added appropriate labels to PR (ex. WIP, R4R, docs, etc)
  • Reviewers assigned
  • Squashed all commits, uses message "Merge pull request #XYZ: [title]" (coding standards)

@v-homsi
Copy link
Contributor Author

v-homsi commented Sep 5, 2022

@facs95 @fedekunze @danburck

After digging deep into go-ethereum code and investigating the different alternatives, here are the different approaches to support metrics on the EVM rpc:

Approach 1:

go-ethereum already has support for metrics including setting up a separate metrics server and exposing metrics. To use that, ethermint can call:

  • Call Exp.Setup(address string) from here
  • pass --metrics in the CLI (hardcoded) as per this

Pros:
Easy to integrate in our code and will shift most of the heavy lifting to go-ethereum

Cons:
All the metrics from go-ethereum will be exposed. Some of which are not relevant to ethermint such as consensus related metrics
Need to add --metrics to our start command and I don't think ethermint could support a config file option (app.toml) for this

Approach 2:

Create an http middleware that gets called after the request has been processed by go-ethereum which needs to:

  • Parse the json message from the http request and expose it as a metric

Pros:
Approach is independent of go-ethereum

Cons:
Approach results in parsing the message twice as go-ethereum parses it the first time and ethermint will need to parse again
Need to copy a lot of the code that parses the http request from go-ethereum to ethermint repo
The middleware would be unable to know if the request was successful or not. It could only know how long it took to execute the request and what method was called

Approach 3:

Copy the implementation of ServeHTTP from go-ethereum to ethermint which includes the metrics that are only relevant for the RPC specified here

Pros:
Complete independence of go-ethereum in RPC
Only relevant metrics are collected and exposed
Cleanest approach that produces the most optimal results (no unnecessary metrics and can track if request failed or succeeded)

Cons:
Need to create a lot of tests to make sure functionality is maintained
Harder to stay synced with go-ethereum upstream
Bigger chunk of code to maintain

What do you think?

@fedekunze fedekunze changed the title Add support for EMV http metrics Add support for EVM http metrics Sep 8, 2022
@v-homsi v-homsi force-pushed the feature/expose_evm_http_server_metrics branch from 47b3303 to 77dae19 Compare September 19, 2022 03:40
@codecov
Copy link

codecov bot commented Sep 19, 2022

Codecov Report

Merging #1303 (3491e5a) into main (07202c1) will increase coverage by 0.86%.
The diff coverage is 62.27%.

Additional details and impacted files

Impacted file tree graph

@@            Coverage Diff             @@
##             main    #1303      +/-   ##
==========================================
+ Coverage   54.78%   55.65%   +0.86%     
==========================================
  Files         108      124      +16     
  Lines       10015    11342    +1327     
==========================================
+ Hits         5487     6312     +825     
- Misses       4255     4705     +450     
- Partials      273      325      +52     
Impacted Files Coverage Δ
rpc/client/metrics.go 0.00% <0.00%> (ø)
rpc/client/stdio.go 0.00% <0.00%> (ø)
rpc/types/callback.go 0.00% <0.00%> (ø)
rpc/types/decimal.go 0.00% <0.00%> (ø)
rpc/types/errors.go 0.00% <0.00%> (ø)
rpc/types/block.go 31.92% <8.82%> (-18.54%) ⬇️
rpc/client/client_opt.go 20.58% <20.58%> (ø)
rpc/types/subscription.go 26.47% <26.47%> (ø)
rpc/client/ipc.go 44.44% <44.44%> (ø)
server/config/config.go 22.85% <50.00%> (+0.63%) ⬆️
... and 9 more

@v-homsi v-homsi marked this pull request as ready for review September 19, 2022 03:42
@v-homsi v-homsi requested a review from a team as a code owner September 19, 2022 03:42
@v-homsi v-homsi requested review from 0a1c and ramacarlucho and removed request for a team September 19, 2022 03:42
Copy link

@github-advanced-security github-advanced-security bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Found 48 potential problems in the proposed changes. Check the Files changed tab for more details.

@v-homsi v-homsi changed the title Add support for EVM http metrics Add support for EVM RPC metrics Sep 19, 2022
Copy link
Contributor

@fedekunze fedekunze left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@v-homsi thanks! Let's add the new files into client/, server/, and the other existing folders. We also need to add a new API breaking change entry in the Changelog.

We should address the linter warnings and gosec errors as well

@0a1c
Copy link
Contributor

0a1c commented Sep 19, 2022

Need to add --metrics to our start command and I don't think ethermint could support a config file option (app.toml) for this

@v-homsi
Why can't we add a config option for this? It seems to me that this approach would be the most efficient long-term, especially if we can simply drop irrelevant metrics.

@v-homsi
Copy link
Contributor Author

v-homsi commented Sep 19, 2022

Need to add --metrics to our start command and I don't think ethermint could support a config file option (app.toml) for this

@v-homsi Why can't we add a config option for this? It seems to me that this approach would be the most efficient long-term, especially if we can simply drop irrelevant metrics.

go-ethereum expects the --metrics to be passed in the command line which is hardcoded in init method in here. if we wanted to support a config file option for it, we would need to parse the config and somehow find a way to inject --metrics into the command line before the init method of go-ethereum is called.

Another drawback of this approach is that we would be exposing metrics that are not relevant and would never change (at the mercy of what go-ethereum exposes).

@v-homsi v-homsi marked this pull request as draft September 20, 2022 02:01
Copy link

@github-advanced-security github-advanced-security bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Found 31 potential problems in the proposed changes. Check the Files changed tab for more details.

@v-homsi v-homsi force-pushed the feature/expose_evm_http_server_metrics branch from 2e46694 to d757936 Compare September 29, 2022 03:51
rpc/client.go Fixed Show fixed Hide fixed
Copy link

@github-advanced-security github-advanced-security bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Found 31 potential problems in the proposed changes. Check the Files changed tab for more details.

@v-homsi v-homsi force-pushed the feature/expose_evm_http_server_metrics branch from d757936 to 3d6dc8d Compare September 29, 2022 03:59
Copy link

@github-advanced-security github-advanced-security bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Found 31 potential problems in the proposed changes. Check the Files changed tab for more details.

rpc/websocket.go Outdated
return nil, err
}
}
conn, resp, err := dialer.DialContext(ctx, dialURL, header) //nolint:bodyclose // not fixed as imported from go-ethereum
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@fedekunze @facs95

Linter complained about this line that conn needs to be closed like:

defer conn.Close()

However, as it's a websocket connection, I think adding the defer conn.Close() might close the connection while websocket is still open. What do you think?

@v-homsi v-homsi changed the title Add support for EVM RPC metrics imp(rpc) Add support for EVM RPC metrics Sep 29, 2022
@v-homsi
Copy link
Contributor Author

v-homsi commented Sep 29, 2022

@v-homsi thanks! Let's add the new files into client/, server/, and the other existing folders. We also need to add a new API breaking change entry in the Changelog.

We should address the linter warnings and gosec errors as well

The folder I migrated the files to ( rpc ) already exists and I think it might be better to separate by feature rather than functionality (All RPC stuff goes under rpc folder). What do you think?

msg := &rpcCodec.JsonrpcMessage{Version: types.Vsn, ID: c.nextID(), Method: method}
if paramsIn != nil { // prevent sending "params":null
var err error
if msg.Params, err = json.Marshal(paramsIn); err != nil {

Check warning

Code scanning / Semgrep

Should `$X` be modified when an error could be returned?

Should `msg` be modified when an error could be returned?
Copy link

@github-advanced-security github-advanced-security bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

gosec found more than 10 potential problems in the proposed changes. Check the Files changed tab for more details.

@v-homsi v-homsi force-pushed the feature/expose_evm_http_server_metrics branch 3 times, most recently from 073bf0e to 3aa8d61 Compare October 6, 2022 05:08
- Import http and handlers from go-ethereum RPC to ethermint RPC
- Support enabling/disabling RPC metrics and metric server
- Fix linter issues
- Add entry to CHANGELOG
- Update gomod2nix.toml
- Refactor rpc code into different packages
- Update test timeout in github action
@v-homsi v-homsi force-pushed the feature/expose_evm_http_server_metrics branch 2 times, most recently from 80d6aba to d1a820b Compare October 6, 2022 05:49
@v-homsi v-homsi force-pushed the feature/expose_evm_http_server_metrics branch from d1a820b to 3491e5a Compare October 6, 2022 12:51
@v-homsi v-homsi marked this pull request as ready for review October 6, 2022 13:01
@v-homsi
Copy link
Contributor Author

v-homsi commented Oct 7, 2022

Converting this PR to draft PR again. I came across https://github.com/ledgerwatch/erigon and I would like to investigate potentially replacing go-ethereum with erigon instead of migrating the code as erigon has different metrics mechanism (the original motivation of the PR).
Let me know if you have any thoughts on this.
Feel free to keep reviewing the PR as I'm waiting for feedback.

@v-homsi v-homsi marked this pull request as draft October 7, 2022 00:48
@v-homsi v-homsi marked this pull request as ready for review October 10, 2022 15:19
Copy link
Contributor

@danburck danburck left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@v-homsi Can you add in the PR description which approach you took? Also is there an overview of what metrics are enabled with this and how they are different from the metrics listed here?

rpcServer.WithMetrics(registry)
metricsErrChan := setupMetricsServer(ctx, config, registry)

select {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A comment here explaining what you are waiting for would be helpful

@@ -43,14 +41,23 @@ const (
apiVersion = "1.0"
)

// API describes the set of methods offered over the RPC interface
type API struct {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I would rather this API struct to be an extension of the go-ethereum one given that making sure that we stay align with their API is key for different integrations.

EthPendingBlockNumber = BlockNumber(-2)
EthLatestBlockNumber = BlockNumber(-1)
EthEarliestBlockNumber = BlockNumber(0)
EthSafeBlockNumber = BlockNumber(-4)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not relevant to our chain as we have instant finality. So Safe Finalized and Latest should be BlockNumber(-1).

default:
// check if the input is a block hash
if len(input) == 66 {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why were the comments deleted?


func (bnh *BlockNumberOrHash) String() string {
if bnh.BlockNumber != nil {
return strconv.Itoa(int(*bnh.BlockNumber))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know it might be redundant as it is check in the decoding but it might be better to play it safe here and check that blockNumber does not overflow the integer type

@yihuang
Copy link
Contributor

yihuang commented Oct 11, 2022

Need to add --metrics to our start command and I don't think ethermint could support a config file option (app.toml) for this

@v-homsi Why can't we add a config option for this? It seems to me that this approach would be the most efficient long-term, especially if we can simply drop irrelevant metrics.

go-ethereum expects the --metrics to be passed in the command line which is hardcoded in init method in here. if we wanted to support a config file option for it, we would need to parse the config and somehow find a way to inject --metrics into the command line before the init method of go-ethereum is called.

Another drawback of this approach is that we would be exposing metrics that are not relevant and would never change (at the mercy of what go-ethereum exposes).

the Enable is a public variable, is it possible that we set it explicitly based on the config value?

@v-homsi
Copy link
Contributor Author

v-homsi commented Oct 11, 2022

Need to add --metrics to our start command and I don't think ethermint could support a config file option (app.toml) for this

@v-homsi Why can't we add a config option for this? It seems to me that this approach would be the most efficient long-term, especially if we can simply drop irrelevant metrics.

go-ethereum expects the --metrics to be passed in the command line which is hardcoded in init method in here. if we wanted to support a config file option for it, we would need to parse the config and somehow find a way to inject --metrics into the command line before the init method of go-ethereum is called.
Another drawback of this approach is that we would be exposing metrics that are not relevant and would never change (at the mercy of what go-ethereum exposes).

the Enable is a public variable, is it possible that we set it explicitly based on the config value?

Thanks for taking a look!

The problem with this is that there's no guarantee that setting Enabled to true be executed before some of the code in go-ethereum that relies on it. In case of the RPC metrics, these global metrics will use the NilGauge instead of StandardGauge because the code in here is dependent on the Enabled variable. This results in the metric to always be the default value (0, in the case of RPC).

Interestingly, this metric works fine because it gets registered after the Enabled flag is set and it's not a global variable.

I created a PR in my fork with the change you mentioned. Feel free to try it out yourself.

@v-homsi
Copy link
Contributor Author

v-homsi commented Oct 13, 2022

After discussions with @facs95 regarding this PR, he suggested to go with approach 1 and try to upstream changes to go-ethereum suitable for our use cases instead of migrating the entire RPC code base.

I created this PR for it

@facs95
Copy link
Contributor

facs95 commented Oct 14, 2022

Replaced by - #1378

@facs95 facs95 closed this Oct 14, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
No open projects
Status: Canceled
Development

Successfully merging this pull request may close these issues.

6 participants