Nil ptr on erigon during sync post dencun #9967

Closed
AymericNoel opened this issue Apr 17, 2024 · 7 comments

@AymericNoel

System information

Erigon version: 2.59.3

OS & Version: Linux; 62 GB RAM; 3.6 TB disk; 11 CPUs

Erigon config :

# Specify a non-default location to store blockchain and other data.
datadir = "{{ erigon_data_dir }}"

# Enable http rpc requests
http = true
"http.addr" = "0.0.0.0"
"http.api" = ["eth","net","engine","erigon","web3","ots","txpool"]
"http.vhosts" = "*"
"http.corsdomain" = "*"

# Enable metrics at port 6060 and exposed
metrics = true
"metrics.addr" = "0.0.0.0"

# Run erigon on specific network. Default: mainnet. Choose "bor-mainnet" for polygon mainnet and "mumbai" for polygon testnet.
chain = "mainnet"

# Enable embedded consensus layer
internalcl= "true"

# Increase txpool size
"txpool.globalslots" = 30000
"txpool.globalbasefeeslots" = 90000
"txpool.accountqueue" = 256
"txpool.globalqueue" = 90000

# Otterscan namespace configuration
"ots.search.max.pagesize" = 5000

# RPC optimization
"db.read.concurrency" = 1300
"rpc.batch.concurrency" = 10

# To not throw errors after release 2.55.0
"db.size.limit" = "8TB"

Consensus Layer: caplin (internal client)

Chain/Network: mainnet

Expected behaviour

After upgrading to 2.59.0, we had to run:

rm -rf <datadir>/caplin <datadir>/snapshots <datadir>/downloader
make integration
./build/bin/integration stage_headers --reset --datadir=<datadir>
./build/bin/integration stage_snapshots --reset --datadir=<datadir> 

and then resync the node.
The node should sync without errors.

Actual behaviour

After reaching stage 10/12 (LogIndex), we hit an error that restarted the node and the sync (log lines below are newest first):

[INFO] [04-16|16:54:38.159] Disk storage enabled for ethash DAGs     dir=/opt/node/ethereum/ethash-dags count=2
[INFO] [04-16|16:54:38.159] Initialising Ethereum protocol           network=1
[INFO] [04-16|16:54:06.434] Initialised chain configuration          config="{ChainID: 1, Homestead: 1150000, DAO: 1920000, Tangerine Whistle: 2463000, Spurious Dragon: 2675000, Byzantium: 4370000, Constantinople: 7280000, Petersburg: 7280000, Istanbul: 9069000, Muir Glacier: 9200000, Berlin: 12244000, London: 12965000, Arrow Glacier: 13773000, Gray Glacier: 15050000, Terminal Total Difficulty: 58750000000000000000000, Merge Netsplit: <nil>, Shanghai: 1681338455, Cancun: 1710338135, Prague: <nil>, Osaka: <nil>, Engine: ethash, NoPruneContracts: map[0x00000000219ab540356cBB839Cbe05303d7705Fa:true]}" genesis=0xd4e56740f876aef8c010b86a40d5f56745a118d0906a34e69aec8c0db1cb8fa3
[INFO] [04-16|16:54:06.429] [db] open                                lable=chaindata sizeLimit=8TB pageSize=4096
mdbx_setup_dxb:15946 filesize mismatch (expect 2285090373632b/557883392p, have 2642428297216b/645124096p)
[INFO] [04-16|16:53:57.755] Opening Database                         label=chaindata path=/opt/node/ethereum/chaindata
[INFO] [04-16|16:53:57.200] [Downloader] Running with                ipv6-enabled=true ipv4-enabled=true download.rate=16mb upload.rate=4mb
[INFO] [04-16|16:53:57.131] Set global gas cap                       cap=50000000
[INFO] [04-16|16:53:55.024] torrent verbosity                        level=WRN
[INFO] [04-16|16:53:55.024] starting HTTP APIs                       port=8545 APIs=eth,net,engine,erigon,web3,ots,txpool
[INFO] [04-16|16:53:55.022] Maximum peer count                       ETH=100 total=100
[INFO] [04-16|16:53:55.019] Starting Erigon on Ethereum mainnet...
[INFO] [04-16|16:53:55.018] Build info                               git_branch= git_tag= git_commit=
[INFO] [04-16|16:53:55.018] Enabling metrics export to prometheus    path=http://0.0.0.0:6060/debug/metrics/prometheus
[INFO] [04-16|16:53:55.017] logging to file system                   log dir=/opt/node/ethereum/logs file prefix=erigon log level=info json=false
/root/go/pkg/mod/golang.org/x/[email protected]/errgroup/errgroup.go:75 +0x96
created by golang.org/x/sync/errgroup.(*Group).Go in goroutine 266206874
/root/go/pkg/mod/golang.org/x/[email protected]/errgroup/errgroup.go:78 +0x56
golang.org/x/sync/errgroup.(*Group).Go.func1()
/go/src/github.com/ledgerwatch/erigon/turbo/jsonrpc/otterscan_api.go:468 +0x4c
github.com/ledgerwatch/erigon/turbo/jsonrpc.(*OtterscanAPIImpl).traceBlocks.func1()
/go/src/github.com/ledgerwatch/erigon/turbo/jsonrpc/otterscan_search_trace.go:32 +0x1c5
github.com/ledgerwatch/erigon/turbo/jsonrpc.(*OtterscanAPIImpl).searchTraceBlock(0xc0ac335760, {0x31202c8, 0xc10cdab1d0}, {0x5e, 0x6a, 0xd3, 0x96, 0x22, 0xe4, 0x6f, ...}, ...)
/go/src/github.com/ledgerwatch/erigon/turbo/jsonrpc/otterscan_search_trace.go:66 +0x56e
github.com/ledgerwatch/erigon/turbo/jsonrpc.(*OtterscanAPIImpl).traceBlock(0xc0ac335760, {0x3142af0?, 0xc0936b36e0}, {0x31202c8, 0xc10cdab1d0}, 0x129bb6a?, {0x5e, 0x6a, 0xd3, 0x96, ...}, ...)
/go/src/github.com/ledgerwatch/erigon/core/types/block.go:1381
github.com/ledgerwatch/erigon/core/types.(*Block).Time(...)
goroutine 266206875 [running]:
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x1d9e2ae]
panic: runtime error: invalid memory address or nil pointer dereference
[INFO] [04-16|16:53:33.649] [p2p] GoodPeers                          eth66=1 eth68=33 eth67=31
[INFO] [04-16|16:53:29.741] [10/12 LogIndex] Progress                number=19611417 alloc=7.4GB sys=25.7GB
[INFO] [04-16|16:53:19.139] [Caplin] Forward Sync                    app=caplin stage=ForwardSync from=8871136 to=8871264
[INFO] [04-16|16:53:08.452] [10/12 LogIndex] Flushed buffer file     name=erigon-sortable-buf-1000274371
[INFO] [04-16|16:52:59.742] [10/12 LogIndex] Progress                number=19602368 alloc=8.7GB sys=25.7GB
[INFO] [04-16|16:52:58.677] P2P                                      app=caplin peers=68
[INFO] [04-16|16:52:30.082] [10/12 LogIndex] Progress                number=19592203 alloc=6.1GB sys=25.7GB
[INFO] [04-16|16:51:59.741] [10/12 LogIndex] Progress                number=19581726 alloc=7.8GB sys=25.7GB
[INFO] [04-16|16:51:58.677] P2P                                      app=caplin peers=67
[INFO] [04-16|16:51:30.370] [10/12 LogIndex] Progress                number=19571774 alloc=11.1GB sys=25.7GB
[INFO] [04-16|16:51:18.073] [txpool] stat                            pending=0 baseFee=0 queued=90000 alloc=10.2GB sys=25.7GB
[INFO] [04-16|16:50:59.743] [10/12 LogIndex] Progress                number=19560387 alloc=8.7GB sys=25.7GB
[INFO] [04-16|16:50:58.677] P2P                                      app=caplin peers=59
[INFO] [04-16|16:50:50.624] [10/12 LogIndex] Flushed buffer file     name=erigon-sortable-buf-2415970171

Don't hesitate to suggest a more appropriate issue title if needed.

@Giulio2002
Contributor

@wmitsuda

@Giulio2002
Contributor

I think it should go away once the node is done syncing. It is innocuous: the crash is in the JSON-RPC layer.

@AymericNoel
Author

Okay @Giulio2002 thanks.

It's weird, because before the crash we were at stage 10/12 of the sync, and now we are back at 5/12.
I'll keep an eye on it.

@srvint

srvint commented Apr 26, 2024

In your config, should chain = mainnet be chain=bor-mainnet?

@AymericNoel
Author

We blocked all RPC requests during the sync, and now that the node is synced we can make our RPC requests as usual.
Thanks!

And @srvint, we were on chain=mainnet not chain=bor-mainnet

@AymericNoel
Author

AymericNoel commented Jul 3, 2024

Hello @wmitsuda, the problem is back, even though the node has now been synced for at most 2 months. We hope it will go away with Erigon V3 (we are waiting for the official release).

2024-07-03 10:40:04	panic: runtime error: invalid memory address or nil pointer dereference
2024-07-03 10:40:04	[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x1e0e74e]
2024-07-03 10:40:04	goroutine 773955 [running]:
2024-07-03 10:40:04	github.com/ledgerwatch/erigon/core/types.(*Block).Time(...)
2024-07-03 10:40:04		/go/src/github.com/ledgerwatch/erigon/core/types/block.go:1134
2024-07-03 10:40:04	github.com/ledgerwatch/erigon/turbo/jsonrpc.(*OtterscanAPIImpl).traceBlock(0xc0df142460, {0x321e8b0?, 0xc24c4fd560}, {0x31fb148, 0xc1f8f60640}, 0x1317f35?, {0x62, 0x3f, 0x12, 0x85, ...}, ...)
2024-07-03 10:40:04		/go/src/github.com/ledgerwatch/erigon/turbo/jsonrpc/otterscan_search_trace.go:65 +0x50e
2024-07-03 10:40:04	github.com/ledgerwatch/erigon/turbo/jsonrpc.(*OtterscanAPIImpl).searchTraceBlock(0xc0df142460, {0x31fb148, 0xc1f8f60640}, {0x62, 0x3f, 0x12, 0x85, 0x37, 0x0, 0x71, ...}, ...)
2024-07-03 10:40:04		/go/src/github.com/ledgerwatch/erigon/turbo/jsonrpc/otterscan_search_trace.go:31 +0x1c5
2024-07-03 10:40:04	github.com/ledgerwatch/erigon/turbo/jsonrpc.(*OtterscanAPIImpl).traceBlocks.func1()
2024-07-03 10:40:04		/go/src/github.com/ledgerwatch/erigon/turbo/jsonrpc/otterscan_api.go:468 +0x4c
2024-07-03 10:40:04	golang.org/x/sync/errgroup.(*Group).Go.func1()
2024-07-03 10:40:04		/root/go/pkg/mod/golang.org/x/[email protected]/errgroup/errgroup.go:78 +0x56
2024-07-03 10:40:04	created by golang.org/x/sync/errgroup.(*Group).Go in goroutine 773844
2024-07-03 10:40:04		/root/go/pkg/mod/golang.org/x/[email protected]/errgroup/errgroup.go:75 +0x96
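
A note on why a single bad request takes the whole node down: the trace shows the panic inside a goroutine started through `errgroup.(*Group).Go`, and in Go an unrecovered panic in any goroutine terminates the entire process. The sketch below shows the general pattern of converting a panic inside a worker goroutine into an ordinary error. It uses only the standard library. `safely` and `runAll` are hypothetical helpers for illustration, and nothing here is claimed to match the fix that was eventually merged.

```go
package main

import (
	"fmt"
	"sync"
)

// safely runs fn and converts a panic into an error instead of letting it
// terminate the process. Hypothetical helper for illustration only.
func safely(fn func() error) (err error) {
	defer func() {
		if r := recover(); r != nil {
			err = fmt.Errorf("recovered panic: %v", r)
		}
	}()
	return fn()
}

// runAll starts one goroutine per task and returns the first error recorded.
func runAll(tasks []func() error) error {
	var (
		wg    sync.WaitGroup
		once  sync.Once
		first error
	)
	for _, task := range tasks {
		wg.Add(1)
		go func(t func() error) {
			defer wg.Done()
			if err := safely(t); err != nil {
				once.Do(func() { first = err })
			}
		}(task)
	}
	wg.Wait()
	return first
}

func main() {
	// missing is a nil pointer, standing in for the nil block in the trace.
	var missing *struct{ n int }

	err := runAll([]func() error{
		func() error { return nil },
		func() error { _ = missing.n; return nil }, // nil dereference panics
	})
	fmt.Println(err)
}
```

Recovering like this keeps the process alive but hides the underlying bug, so it complements a proper nil check at the call site and does not replace it.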

@AymericNoel AymericNoel reopened this Jul 3, 2024
@AskAlexSharov AskAlexSharov self-assigned this Jul 19, 2024
@AskAlexSharov AskAlexSharov added this to the 2.60.5-fixes milestone Jul 19, 2024
@AskAlexSharov
Collaborator

AskAlexSharov commented Jul 19, 2024

fixed by #11232 and #11233
