Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Errors for all RPC Requests #5867

Merged
merged 9 commits into from
Jun 7, 2024

Conversation

AgeManning
Copy link
Member

Sync now requires all RPC messages to either return a success or a failure. A disconnection doesn't count as a failure.

Prior to this PR, there were two known cases where an RPC request can get "lost", i.e no error response. These are:

  • A peer disconnects just after sync sends an RPC request message - These messages get dropped silently
  • A peer disconnects whilst we are self rate limiting our own RPC requests

This PR should remedy these cases. In the first case, we instantly send an error response when the peer has disconnected. In the second case, we now watch for disconnection messages and then report an error for all messages that we are currently holding due to self rate limiting.

In the process of doing this PR, I thought I needed an updated version of tokio-util in order to make the code cleaner. It turned out that wasn't the case, but I had already done the work to upgrade it. So I have left these changes in, which imo improve the code by removing a match case in most cases.

Copy link
Member

@jxs jxs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM age

self.delayed_requests
.retain(|(map_peer_id, protocol), queue| {
if map_peer_id == &peer_id {
// NOTE: Currently cannot remove entries from the DelayQueue, we will just let
Copy link
Member

@jxs jxs May 30, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was looking and I think we don't need ready_requests as when next_peer_request is ready we can just return it and keep returning as long as it's ready SelfRateLimiter will keep being polled. I can submit a subsequent PR removing it if you agree.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had an initial version of this that removed the delalyQueue and used a HashMapDelay. The problem I ran into was that typically there is a queue or backlog per (peer_id, protocol). When we insert them, the delay time is for the first in the queue. When I was using a HashMapDelay, they all expired really quickly which meant we had to keep requeuing them.

The current version just expires the first element in the queue then we check for the next element and queue that, rather than all elements in the queue, if this makes sense.

@michaelsproul michaelsproul mentioned this pull request May 31, 2024
michaelsproul added a commit that referenced this pull request May 31, 2024
Squashed commit of the following:

commit a3c8e01
Author: Age Manning <[email protected]>
Date:   Thu May 30 17:38:35 2024 +1000

    Downgrade to 1.77

commit dbfde08
Author: Age Manning <[email protected]>
Date:   Thu May 30 17:34:11 2024 +1000

    Bump rust version to 1.78

commit 2e4fe3f
Author: Age Manning <[email protected]>
Date:   Thu May 30 17:26:50 2024 +1000

    Code improvement

commit 70e5326
Author: Age Manning <[email protected]>
Date:   Thu May 30 17:05:00 2024 +1000

    Report errors for rate limited requests

commit 7b0e346
Author: Age Manning <[email protected]>
Date:   Thu May 30 13:25:22 2024 +1000

    Return and error if peer has disconnected
@michaelsproul michaelsproul mentioned this pull request May 31, 2024
Copy link
Collaborator

@dapplion dapplion left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

michaelsproul added a commit that referenced this pull request Jun 3, 2024
Squashed commit of the following:

commit a3c8e01
Author: Age Manning <[email protected]>
Date:   Thu May 30 17:38:35 2024 +1000

    Downgrade to 1.77

commit dbfde08
Author: Age Manning <[email protected]>
Date:   Thu May 30 17:34:11 2024 +1000

    Bump rust version to 1.78

commit 2e4fe3f
Author: Age Manning <[email protected]>
Date:   Thu May 30 17:26:50 2024 +1000

    Code improvement

commit 70e5326
Author: Age Manning <[email protected]>
Date:   Thu May 30 17:05:00 2024 +1000

    Report errors for rate limited requests

commit 7b0e346
Author: Age Manning <[email protected]>
Date:   Thu May 30 13:25:22 2024 +1000

    Return and error if peer has disconnected
jimmygchen added a commit that referenced this pull request Jun 3, 2024
Squashed commit of the following:

commit 9699902
Author: dapplion <[email protected]>
Date:   Sun Jun 2 20:07:07 2024 +0200

    fix fmt

commit 05bddb9
Author: Age Manning <[email protected]>
Date:   Sun Jun 2 15:04:02 2024 +1000

    Update beacon_node/lighthouse_network/src/service/mod.rs

    Co-authored-by: João Oliveira <[email protected]>

commit a3c8e01
Author: Age Manning <[email protected]>
Date:   Thu May 30 17:38:35 2024 +1000

    Downgrade to 1.77

commit dbfde08
Author: Age Manning <[email protected]>
Date:   Thu May 30 17:34:11 2024 +1000

    Bump rust version to 1.78

commit 2e4fe3f
Author: Age Manning <[email protected]>
Date:   Thu May 30 17:26:50 2024 +1000

    Code improvement

commit 70e5326
Author: Age Manning <[email protected]>
Date:   Thu May 30 17:05:00 2024 +1000

    Report errors for rate limited requests

commit 7b0e346
Author: Age Manning <[email protected]>
Date:   Thu May 30 13:25:22 2024 +1000

    Return and error if peer has disconnected
michaelsproul pushed a commit that referenced this pull request Jun 4, 2024
Squashed commit of the following:

commit 9699902
Author: dapplion <[email protected]>
Date:   Sun Jun 2 20:07:07 2024 +0200

    fix fmt

commit 05bddb9
Author: Age Manning <[email protected]>
Date:   Sun Jun 2 15:04:02 2024 +1000

    Update beacon_node/lighthouse_network/src/service/mod.rs

    Co-authored-by: João Oliveira <[email protected]>

commit a3c8e01
Author: Age Manning <[email protected]>
Date:   Thu May 30 17:38:35 2024 +1000

    Downgrade to 1.77

commit dbfde08
Author: Age Manning <[email protected]>
Date:   Thu May 30 17:34:11 2024 +1000

    Bump rust version to 1.78

commit 2e4fe3f
Author: Age Manning <[email protected]>
Date:   Thu May 30 17:26:50 2024 +1000

    Code improvement

commit 70e5326
Author: Age Manning <[email protected]>
Date:   Thu May 30 17:05:00 2024 +1000

    Report errors for rate limited requests

commit 7b0e346
Author: Age Manning <[email protected]>
Date:   Thu May 30 13:25:22 2024 +1000

    Return and error if peer has disconnected
michaelsproul added a commit that referenced this pull request Jun 6, 2024
Squashed commit of the following:

commit 638a9bb
Author: realbigsean <[email protected]>
Date:   Wed Jun 5 18:11:20 2024 -0400

    update lockfile

commit 7b2cc6c
Merge: 9699902 7a7fc82
Author: realbigsean <[email protected]>
Date:   Wed Jun 5 18:11:03 2024 -0400

    Merge branch 'unstable' of https://github.com/sigp/lighthouse into rpc-peer-disconnect-error

commit 9699902
Author: dapplion <[email protected]>
Date:   Sun Jun 2 20:07:07 2024 +0200

    fix fmt

commit 05bddb9
Author: Age Manning <[email protected]>
Date:   Sun Jun 2 15:04:02 2024 +1000

    Update beacon_node/lighthouse_network/src/service/mod.rs

    Co-authored-by: João Oliveira <[email protected]>

commit a3c8e01
Author: Age Manning <[email protected]>
Date:   Thu May 30 17:38:35 2024 +1000

    Downgrade to 1.77

commit dbfde08
Author: Age Manning <[email protected]>
Date:   Thu May 30 17:34:11 2024 +1000

    Bump rust version to 1.78

commit 2e4fe3f
Author: Age Manning <[email protected]>
Date:   Thu May 30 17:26:50 2024 +1000

    Code improvement

commit 70e5326
Author: Age Manning <[email protected]>
Date:   Thu May 30 17:05:00 2024 +1000

    Report errors for rate limited requests

commit 7b0e346
Author: Age Manning <[email protected]>
Date:   Thu May 30 13:25:22 2024 +1000

    Return and error if peer has disconnected
@michaelsproul
Copy link
Member

@Mergifyio queue

Copy link

mergify bot commented Jun 7, 2024

queue

✅ The pull request has been merged automatically

The pull request has been merged automatically at 7b48b0b

@mergify mergify bot merged commit 7b48b0b into sigp:unstable Jun 7, 2024
28 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants