Connections become unstable after large amount of connections #7243

Closed
benthecarman opened this issue Apr 19, 2024 · 3 comments · Fixed by #7365

@benthecarman
Contributor

The Voltage Flow 2.0 has gotten to a pretty big size with thousands of end-user channels. They are not all online at the same time, but it seems they

These connections become unstable and are basically unusable. Mutiny will send a ping message and not receive a pong reply after a few seconds (sometimes up to 10). I have observed similar behavior with an ldk-sample <> Voltage Flow 2.0 connection.

Even weirder, I have a personal LND node that seems to have normal ping/pongs. However, it always receives the ping and replies with a pong, whereas with LDK it always sends the ping and receives the pong (I assume this is just different ping timers between implementations).

We have a signet version of the Flow 2.0 node and it doesn't have these issues so it seems this is a scaling issue with CLN.

We have tried getting help in the Discord but received no answers. @niftynei told me to open an issue to try and get support.

@vincenzopalazzo
Collaborator

What version of CLN is Voltage Flow 2.0 running?

We have tried getting help in the discord but received no answers

Sorry for the slowness

@benthecarman
Contributor Author

It's on the latest version

@rustyrussell rustyrussell self-assigned this May 29, 2024
@rustyrussell rustyrussell added this to the v24.08 milestone May 29, 2024
@rustyrussell
Contributor

Have a similar report from Boltz, with connectd using a significant amount of CPU on ~1000 peers. My preliminary analysis showed that we're calling poll far too often, and we should probably also start rate-limiting gossip streaming.

I'll post updates here as I work on it. It's kind of awkward to benchmark, so I've got to write some cut-down tooling.
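
For readers unfamiliar with the rate-limiting idea mentioned above, here is a minimal token-bucket sketch in Python. It is an illustration only and is not taken from CLN or from the eventual fix: the `TokenBucket` class, its `rate` and `burst` parameters, and the per-peer usage are all assumptions made for this example.

```python
class TokenBucket:
    """Hypothetical per-peer limiter: `rate` messages/second, bursts up to `burst`."""

    def __init__(self, rate, burst, now=0.0):
        self.rate = float(rate)      # tokens added per second
        self.burst = float(burst)    # maximum tokens the bucket can hold
        self.tokens = float(burst)   # start full so a new peer can burst once
        self.last = float(now)       # timestamp of the last refill

    def allow(self, now, cost=1.0):
        """Return True if a message costing `cost` tokens may be sent at time `now`."""
        # Refill in proportion to elapsed time, capped at the burst size.
        elapsed = max(0.0, now - self.last)
        self.tokens = min(self.burst, self.tokens + elapsed * self.rate)
        self.last = max(self.last, now)
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        # Not enough tokens: the caller should defer this message.
        return False
```

In a sketch like this, each peer would get its own bucket and the sender would call `allow(time.monotonic())` before streaming each gossip message, deferring the message when it returns `False`. Passing the clock in explicitly keeps the logic easy to test.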
