Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix enr request order #4179

Merged
merged 9 commits into from
Aug 3, 2022
Merged

Conversation

matkt
Copy link
Contributor

@matkt matkt commented Jul 27, 2022

Signed-off-by: Karim TAAM [email protected]

PR description

After a Geth<>Besu peerring test we found that sometimes we had "ENR request failed id=68f8eb309602c1a6 addr=3.16.90.60:30303 err=“RPC timeout”

Besu does not respond to the ENR request. By investigating we found that sometimes ENR_REQUEST comes before the PONG and so Besu ignores the request.

This PR allows to cache the request if it arrives before the PONG to answer once the bond is done

Fixed Issue(s)

Documentation

  • I thought about documentation and added the doc-change-required label to this PR if
    updates are required.

Changelog

Signed-off-by: Karim TAAM <[email protected]>
Signed-off-by: Karim TAAM <[email protected]>
Copy link
Contributor

@jflo jflo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like a sensible approach, one minor improvement suggested.

@matkt matkt marked this pull request as ready for review July 28, 2022 09:54
@matkt matkt requested review from jflo and mbaxter July 28, 2022 13:16
@iamhsk iamhsk added the TeamChupa GH issues worked on by Chupacabara Team label Aug 1, 2022
@@ -111,6 +111,8 @@ public class PeerDiscoveryController {
private final PeerTable peerTable;
private final Cache<Bytes, DiscoveryPeer> bondingPeers =
CacheBuilder.newBuilder().maximumSize(50).expireAfterWrite(10, TimeUnit.MINUTES).build();
private final Cache<Bytes, Packet> cachedEnrRequests =
CacheBuilder.newBuilder().maximumSize(50).expireAfterWrite(30, SECONDS).build();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The docs on CacheBuilder recommend migrating to Caffeine. This is probably something to look into across the codebase - maybe for a follow-up PR.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is 30 seconds overly generous? Maybe 10 seconds?

@@ -314,9 +316,12 @@ public void onMessage(final Packet packet, final DiscoveryPeer sender) {
matchInteraction(packet)
.ifPresent(
interaction -> {
System.out.println("ici");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Stray println

@@ -1371,6 +1371,45 @@ public void shouldNotRespondToENRRequestForNonBondedPeer() {
.send(eq(peers.get(0)), matchPacketOfType(PacketType.ENR_REQUEST));
}

@Test
public void shouldNotRespondAndCacheENRRequestForBondingPeer() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test name is a bit confusing to me, what about:

Suggested change
public void shouldNotRespondAndCacheENRRequestForBondingPeer() {
public void shouldRespondToRecentENRRequestAfterBonding() {

final Packet pongPacket =
Packet.create(PacketType.PONG, pongRequestPacketData, nodeKeys.get(0));

controller.onMessage(enrPacket, peers.get(0));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd verify here that we did not respond to the enr request directly with something like:

verify(outboundMessageHandler, never())
        .send(any(), matchPacketOfType(PacketType.ENR_RESPONSE));

@@ -1371,6 +1371,45 @@ public void shouldNotRespondToENRRequestForNonBondedPeer() {
.send(eq(peers.get(0)), matchPacketOfType(PacketType.ENR_REQUEST));
}

@Test
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(optional) Might be nice to add another test that checks that we don't respond to an ENR request received way before a PONG message. Could do this by injecting a CacheBuilder with a customer ticker that can be advanced from the test.

Copy link
Contributor

@pinges pinges left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense! Proposed minor change for comment.

} else if (PeerDiscoveryStatus.BONDING.equals(peer.getStatus())) {
LOG.trace("ENR_REQUEST cached for bonding peer Id: {}", peer.getId());
// it may happen that we receive the ENR_REQUEST just before the PONG.
// Because peers want to send the ENR_REQUEST directly after the pong.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we want to mention that this can happen because discovery is using UDP?

@jflo jflo requested review from mbaxter and pinges August 2, 2022 21:16
@pinges pinges merged commit b1bff9c into hyperledger:main Aug 3, 2022
eum602 pushed a commit to lacchain/besu that referenced this pull request Nov 3, 2023
* fix enr request
* add tests
* removed errant debug
* adds pr feedback and test to ensure ENR responses never go out to incomplete ping/pong handshakes
Signed-off-by: Karim TAAM <[email protected]>
Co-authored-by: Justin Florentine <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
mainnet peering TeamChupa GH issues worked on by Chupacabara Team
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants