Geth port isn't freed after stopping a local testnet #5382

Closed
danielrachi1 opened this issue Mar 8, 2024 · 3 comments
Labels
bug Something isn't working

Comments

@danielrachi1
Contributor

danielrachi1 commented Mar 8, 2024

Description

stop_local_testnet.sh and clean.sh fail to stop all geth instances. As a result, the geth ports remain busy after the local testnet is restarted, which prevents new testnets from running until all geth processes are killed manually.

Version

Branch: unstable
Version: v5.0.0-f93844e

Present Behaviour

  1. Go to scripts/local_testnet/ and run ./start_local_testnet.sh genesis.json
  2. Wait for the script to finish and run tail -f ~/.lighthouse/local-testnet/testnet/geth_1.log
  3. After a small wait, the testnet will start advancing:
    image
  4. Restart the testnet by running ./start_local_testnet.sh genesis.json again. (stop_local_testnet.sh and clean.sh are run as part of this script.)
  5. Wait for the script to finish and run tail -f ~/.lighthouse/local-testnet/testnet/geth_1.log
  6. The geth instances will fail to start because their ports will be busy:
    image
  7. Run ./stop_local_testnet.sh && ./clean.sh
  8. Run htop and search for geth. A bunch of geth processes will still be running:
    image
  9. Kill the geth processes manually.
  10. Run ./start_local_testnet.sh genesis.json again. Now the testnet will start with no issues:
    image

Expected Behaviour

In step 4 a new testnet should be started with no issues. Also, ./stop_local_testnet.sh && ./clean.sh should kill all geth processes.

@chong-he
Member

chong-he commented Mar 12, 2024

In Step 4, when you restart the testnet, you need to stop it properly first: run ./stop_local_testnet.sh and then ./start_local_testnet.sh genesis.json. Otherwise you are effectively trying to start another instance of the testnet while geth and lighthouse are already running, and it will fail with the error you show in Step 6.

If you run ./stop_local_testnet.sh, all processes should be stopped properly. This can then be followed by ./start_local_testnet.sh genesis.json.

Edit: let me look further, since ./start_local_testnet.sh genesis.json already includes the stop step. I will get back to this later.

Update: This is due to the change to geth.sh in #5137, which modified the script to enable logging for geth. That change may not be necessary, as ./start_local_testnet.sh genesis.json already saves the geth logs. The logs are in ~/.lighthouse/local-testnet/testnet/geth_1.log (and geth_2.log, etc.).

If you remove this line:

2>&1 | tee $data_dir/geth.log

and the backslash \ at the end of the previous line (line 53), then the script will work.
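
A likely reason the pipe matters (an assumption, not confirmed in this thread): if a wrapper script ends in `exec <command> | tee <file>`, the `exec` only replaces a pipeline subshell, so a PID recorded by the caller via `$!` belongs to the wrapper rather than to the long-running process. Killing that PID then leaves the process, and its port, alive. The sketch below reproduces the effect with `sleep` standing in for geth; every script and file name in it is hypothetical and is not taken from the Lighthouse scripts.

```shell
#!/usr/bin/env bash
# Demo only: `sleep` stands in for geth, and every file name here is hypothetical.
dir=$(mktemp -d)

# Stand-in service: writes its real PID to the file given as $1, then keeps running.
cat > "$dir/service.sh" <<'EOF'
echo "$$" > "$1"
exec sleep 30
EOF

# Wrapper without a pipe: exec replaces the wrapper, so both share one PID.
cat > "$dir/plain.sh" <<'EOF'
exec bash "$1/service.sh" "$1/plain.pid"
EOF

# Wrapper with a pipe: exec only replaces a pipeline subshell, not the wrapper.
cat > "$dir/piped.sh" <<'EOF'
exec bash "$1/service.sh" "$1/piped.pid" 2>&1 | tee "$1/service.log"
EOF

# Launch both the way a start script might, recording $! for a later kill.
bash "$dir/plain.sh" "$dir" & recorded_plain=$!
bash "$dir/piped.sh" "$dir" & recorded_piped=$!

# Wait until both stand-in services have reported their real PIDs.
until [ -s "$dir/plain.pid" ] && [ -s "$dir/piped.pid" ]; do sleep 0.1; done
real_plain=$(cat "$dir/plain.pid")
real_piped=$(cat "$dir/piped.pid")

# Compare the PID the launcher recorded with the PID the service actually has.
if [ "$recorded_plain" = "$real_plain" ]; then plain_match=yes; else plain_match=no; fi
if [ "$recorded_piped" = "$real_piped" ]; then piped_match=yes; else piped_match=no; fi

# "Stop" everything by killing only the recorded PIDs, then reap them.
kill "$recorded_plain" "$recorded_piped"
wait "$recorded_plain" "$recorded_piped" 2>/dev/null

# The piped service was never signalled, so check whether it is still running.
if kill -0 "$real_piped" 2>/dev/null; then orphan_alive=yes; else orphan_alive=no; fi
echo "plain PIDs match: $plain_match, piped PIDs match: $piped_match, orphan alive: $orphan_alive"

# Clean up the orphaned stand-in and the scratch directory.
kill "$real_piped" 2>/dev/null
rm -rf "$dir"
```

If this is what happens in geth.sh, removing the pipe lets `exec` replace the wrapper again, so the recorded PID is geth's own and the stop script can kill it.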

Hope this clarifies

@chong-he chong-he added the bug Something isn't working label Mar 12, 2024
@danielrachi1
Contributor Author

That fixed it, thanks!

Just updated #5383

@chong-he
Member

Closed by #5383
