Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Upgrade Netty to version 4.1.69 #200

Merged
merged 2 commits into from
Oct 19, 2021

Conversation

Nashatyrev
Copy link
Collaborator

  • Upgrade Netty to version 4.1.69
  • Fix the failing test

Some background discussion on the test case failure caused by version upgrade:

@regio

We need to update the io.netty library to the latest version 4.1.68.Final in teku, however, this leads to cascade this update to libp2p as well. When we updated the libp2p to netty 4.1.68.Final, the test msgExceedsLimit_concatenatedWithValidMessage
in the LimitedProtobufVarint32FrameDecoderTest Class is failing.
After some investigation, we noticed that the problem happens when it writes a large message in the channel. It ends duplicating the message in the inboundMessages.
This problem started when netty did a fix in the class ByteToMessageDecode (netty/netty@1d22fbd) that is extended by LimitedProtobufVarint32FrameDecoder. We would like to know if you have some insight on this issue.

@ajsutton

For the record I’m 99% certain the test is broken and was only passing because of the bug in netty which got fixed in 4.1.45.
The test actually writes tooLongMsg + validMsg in the first call to writeInbound and then writes validMsg again on the second call.
Whereas if you look at the test below it, it writes the first two bytes on the first call and then the second call explicitly slices off those two bytes for the second call to avoid writing them twice.
Prior to 4.1.45 the unread part of the first message was incorrectly discarded even though decode hadn’t read it, whereas with 4.1.45 it was preserved and so we do get the duplicate validMsg output.
The remaining worry I have is that the initial call to writeInbound fails because it throws TooLongFrameException but the remaining valid message is still hanging around waiting to be decoded. Writing the next message triggers parsing it but if it were the last message received I’m not sure what would trigger decoding it. That said, previously in this case we were discarding the remaining valid message in every case so in that sense we’re doing better than before…

@rolfyone

Yeh the reasoning for the duplication was perplexing, I wasn’t confident enough to call it broken but it’s certainly inconsistent

@ajsutton

The other odd thing is that the buffer we write the first time does look like it's only read up until the end of the first message but all of its data has now wound up being copied which is where the duplication comes from.

@rolfyone

i might have seen it wrong, it looked like it copied whatever hadn’t been read in the previous buffer, not the whole thing, but i could be wrong

@ajsutton

Yea that's correct it does. But the buffer we write to winds up with the reader index back before the start of the second message even though it's been read which is a bit odd.

@rolfyone

my reading of that was you had a new buffer created that had valid(1), valid(2) - the buffer we call on the second time already has teh pointer set past the invalid message… i guess its kinda academic

@Nashatyrev

The test is inconsistent: the ByteBuf passed for the first write shouldn’t be reused. It would normally be released after the call (the retain() in the test is to workaround that)
As a side note: the new Netty behavior doesn’t look 100% correct to me either: as soon as the BytesToMessageDecoder caches the whole buffer content the original buffer should be left empty and released upon returning from the method. This is just one lightweight retailedSlice() call when leaving the inbound buffer as internal cumulation.
(edited)

Anyways our LimitedProtobufVarint32FrameDecoder seems to work as expected. I’ll update Netty version and fix the test shortly

The remaining worry I have is that the initial call to writeInbound fails because it throws TooLongFrameException but the remaining valid message is still hanging around waiting to be decoded.
The final message is now decoded on either next message or channel close. It’s also kind of slightly weird but is under the responsibility of BytesToMessageDecoder and is kind of edge case. As far as I understand Teku never tries to recover from error on channel

Copy link
Contributor

@ajsutton ajsutton left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Thanks Anton.

@Nashatyrev Nashatyrev merged commit b1adaca into libp2p:develop Oct 19, 2021
@Nashatyrev Nashatyrev deleted the feature/upgrade-netty branch August 16, 2022 08:40
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants