
Yamux implementation #281
Merged (26 commits) on May 24, 2023

Conversation

@ianopolous (Contributor) commented May 18, 2023

@Nashatyrev (Collaborator) left a comment


Some points to fix and some potential points to discuss:

  • Track the window per stream?
  • How to handle sending when the window size is 0?
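For context, the yamux flow-control model behind both points can be sketched as per-stream send-window accounting: each stream starts with a 256 KiB send window that is consumed by outgoing DATA frames and replenished by WINDOW_UPDATE frames from the peer. A minimal sketch, with hypothetical names (not the PR's actual classes):

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch of per-stream flow control as described in the
// yamux spec: each stream starts with a 256 KiB send window, decremented
// by outgoing DATA frames and replenished by WINDOW_UPDATE frames.
class YamuxStreamWindow {
    static final int INITIAL_WINDOW = 256 * 1024;
    private final AtomicInteger sendWindow = new AtomicInteger(INITIAL_WINDOW);

    /** Returns how many bytes of `requested` may be sent now (0 if the window is empty). */
    int reserve(int requested) {
        while (true) {
            int current = sendWindow.get();
            int granted = Math.min(current, requested);
            if (granted == 0) return 0; // window exhausted; caller must buffer or wait
            if (sendWindow.compareAndSet(current, current - granted)) return granted;
        }
    }

    /** Called when a WINDOW_UPDATE frame arrives from the remote peer. */
    void onWindowUpdate(int delta) {
        sendWindow.addAndGet(delta);
    }
}
```

The second review point is exactly the `reserve(...) == 0` case: the writer has data but no window, so it must either block, buffer, or signal backpressure upward.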

@ianopolous (Contributor, Author)

Thanks for the review, @Nashatyrev! I think that's all fixed now except the locking on an empty window size. I'm open to suggestions for that.

@ianopolous (Contributor, Author)

I guess I could just put the pending writes in a queue and process them on the window-update thread...

@ianopolous (Contributor, Author)

That would solve the threading, but then it doesn't propagate the backpressure back up the stack. Any ideas?

@Nashatyrev (Collaborator) commented May 19, 2023

I'm not 100% sure how backpressure is supposed to be handled correctly in Netty.

My understanding:

  • On the application side: backpressure is reported via the Channel.isWritable attribute and the ChannelInboundHandler.channelWritabilityChanged notification event. The application protocol should (in theory) respect that attribute and suspend writing when isWritable == false.
  • On the Netty side (or maybe the OS TCP implementation side, or both) there should be a buffer (likely size-bounded) that tolerates ignored backpressure up to some level.
  • I believe that when the buffer overflows, the connection is terminated.

So what we most likely should do is toggle isWritable on the stream Channel when sendWindow drops to zero and back.

With respect to buffering I see the following possible options:

  • maintain and limit a buffer per stream to tolerate backpressure
    • on buffer overflow, abruptly close the stream or connection
  • maintain per-stream buffers but limit the total size per connection
    • on buffer overflow, abruptly close the connection
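The second option (the one eventually chosen later in this thread) can be sketched as a per-connection cap on buffered writes. Names and structure are hypothetical, not the PR's actual implementation; the real version keeps one queue per stream:

```java
import java.util.ArrayDeque;
import java.util.Queue;

// Hypothetical sketch: writes that cannot be sent (send window is 0) are
// buffered, but the total buffered bytes are capped per connection; on
// overflow the caller is expected to abruptly close the connection.
class ConnectionWriteBuffer {
    private final long maxBufferedBytes;
    private long bufferedBytes = 0;
    // The real implementation would keep one queue per stream; a single
    // queue is used here for brevity.
    private final Queue<byte[]> pending = new ArrayDeque<>();

    ConnectionWriteBuffer(long maxBufferedBytes) {
        this.maxBufferedBytes = maxBufferedBytes;
    }

    /** Buffers a write; returns false if the per-connection cap would be
     *  exceeded (the caller should then abruptly close the connection). */
    boolean buffer(byte[] data) {
        if (bufferedBytes + data.length > maxBufferedBytes) return false;
        bufferedBytes += data.length;
        pending.add(data);
        return true;
    }

    /** Drains one buffered write after a WINDOW_UPDATE; null if empty. */
    byte[] poll() {
        byte[] next = pending.poll();
        if (next != null) bufferedBytes -= next.length;
        return next;
    }
}
```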

@ianopolous (Contributor, Author)

Forgive my Netty ignorance, but how do we set isWritable? I can only see a method channel.fireChannelWritabilityChanged().

In terms of DoS protection we should probably do the latter:

  • maintain per-stream buffers but limit the total size per connection
    • on buffer overflow, abruptly close the connection

@Nashatyrev (Collaborator)

> Forgive my Netty ignorance, but how do we set isWritable? I can only see a method channel.fireChannelWritabilityChanged().

Good question... I don't know 😃

My intuition is that we may need to override the following methods in AbstractChildChannel:

  • boolean isWritable();
  • long bytesBeforeUnwritable();
  • long bytesBeforeWritable();

Or we need to check out what's going on with the ChannelOutboundBuffer in child channels...

That looks like a topic that needs a deep dive, so I would suggest skipping isWritable attribute support in this PR and filing an issue for later investigation.

So we should probably just support buffering with some configurable max size.

> In terms of DoS protection we should probably do the latter:

Agree 👍

@ianopolous (Contributor, Author)

There is one other thing I haven't implemented which is not in the spec, but is something of a libp2p consensus: limiting un-ACKed streams to 256. But that could also be a future PR, I think.
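That convention can be sketched as a simple counter gating new outbound streams until the peer ACKs them. A hypothetical sketch (names are illustrative, not the eventual implementation):

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch of the libp2p convention mentioned above: refuse to
// open new outbound streams while 256 streams are still awaiting an ACK
// from the remote peer.
class UnackedStreamLimiter {
    static final int MAX_UNACKED = 256;
    private final AtomicInteger unacked = new AtomicInteger();

    /** Returns true if a new stream may be opened, incrementing the count. */
    boolean tryOpenStream() {
        while (true) {
            int current = unacked.get();
            if (current >= MAX_UNACKED) return false; // limit reached
            if (unacked.compareAndSet(current, current + 1)) return true;
        }
    }

    /** Called when the remote peer ACKs one of our streams. */
    void onStreamAcked() {
        unacked.decrementAndGet();
    }
}
```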

@ianopolous (Contributor, Author)

@Nashatyrev Could you clarify the remaining buffering stuff you'd like done?

@Nashatyrev (Collaborator)

> @Nashatyrev Could you clarify the remaining buffering stuff you'd like done?

I believe the option below should work pretty well:

  • maintain per-stream buffers but limit the total size per connection
    • on buffer overflow, abruptly close the connection

Commit: per connection max write buffer size limit.
@ianopolous (Contributor, Author)

> @Nashatyrev Could you clarify the remaining buffering stuff you'd like done?
>
> I believe the option below should work pretty well:
>
>   • maintain per-stream buffers but limit the total size per connection
>     • on buffer overflow, abruptly close the connection

I've had a go at that. Let me know what you think.

@Nashatyrev (Collaborator)

I did some refactoring of the old code here. Could you please have a look: Peergos#7

@Nashatyrev (Collaborator) left a comment


OK, let's give this PR a go, so as not to block further work.
Meanwhile I'm planning to add some more generic mux tests and tests specific to Yamux.
Relates to #27

@Nashatyrev merged commit 8971b31 into libp2p:v1.0.0 on May 24, 2023
@ianopolous (Contributor, Author)

@Nashatyrev That sounds great. Some bugs we've found in real-world use have suggested some good tests:

  1. Open a connection and then have the recipient open a stream and ping down it.
  2. Do several thousand pings to test exceeding the initial window size.

@Nashatyrev (Collaborator)

@ianopolous I'm also not sure how the yamux Go Away frame type should work 🤔
From my perspective it should close the connection (not a stream)? If so, then I don't understand its purpose: why not just close the connection?

@ianopolous (Contributor, Author)

My only guess is that it's used to communicate the type of error? Maybe to signal not to try reconnecting?

@Nashatyrev (Collaborator)

> My only guess is that it's used to communicate the type of error? Maybe to signal not to try reconnecting?

Yeah, probably. That makes sense.
In eth2 we have the RPC Goodbye call for that, but that's just outside of the libp2p domain.

Then I guess we should remove this line:

            YamuxType.GO_AWAY -> onRemoteClose(msg.id)

because it makes no sense: streamId == 0 always here.

I think there should be some other handling mechanism if an application needs to utilize that disconnect option.
Maybe a kind of API like:

Yamux.goAway(errorCode: Int)
Yamux.getRemoteGoAwayCode(): Int
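The receiving half of that proposed API could be sketched as storing the remote GO_AWAY code instead of routing it through stream close handling. This is a hypothetical sketch of the idea, not the eventual implementation:

```java
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical sketch of the API proposed above: instead of mapping
// GO_AWAY onto onRemoteClose (where streamId is always 0), store the
// remote error code and expose it to the application.
class GoAwayState {
    // yamux spec error codes: 0 = normal termination,
    // 1 = protocol error, 2 = internal error
    private final AtomicReference<Integer> remoteGoAwayCode = new AtomicReference<>();

    /** Called when a GO_AWAY frame is received; only the first code is kept. */
    void onRemoteGoAway(int errorCode) {
        remoteGoAwayCode.compareAndSet(null, errorCode);
    }

    /** Returns the remote GO_AWAY code, or null if none was received. */
    Integer getRemoteGoAwayCode() {
        return remoteGoAwayCode.get();
    }
}
```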

@Nashatyrev (Collaborator)

Filed issue #286 on this.
