Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Duplicate packets on start #2256

Merged

Conversation

seanchen1991
Copy link
Contributor

@seanchen1991 seanchen1991 commented Jun 1, 2022

Closes: #2093

Description

This PR changes the handle_packet_cmd flow so that it initiates the clearing of packets upon receiving a batch of IbcEvents.

The issue seems to have originated as a result of some non-determinism in how the relayer subscribes to chains. The usual expected flow is that upon subscribing to a chain, a NewBlock command is received at some height h, followed by an IbcEvents command also at height h. However, sometimes the NewBlock command at height h would be missed and the relayer would only receive the IbcEvents command at height h. This command would be processed at height h, and then later a subsequent NewBlock command at associated with height h + 1 would arrive, which would trigger a clear_pending_packets flow at height h, which resulted in a "packet messages are redundant" error since the relayer would attempt to clear the IbcEvents at height h, but they had already been cleared.

The key insight was realizing that the relayer needs to clear pending packets regardless of whether the NewBlock command arrives before the IbcEvents command or vice versa. Regardless of the ordering of these two commands, pending packets at height h - 1 and before need to be cleared once on startup if the relayer has been configured to do so.


PR author checklist:

  • Added changelog entry, using unclog.
  • Added tests: integration (for Hermes) or unit/mock tests (for modules).
  • Linked to GitHub issue.
  • Updated code comments and documentation (e.g., docs/).

Reviewer checklist:

  • Reviewed Files changed in the GitHub PR explorer.
  • Manually tested (in case integration/unit/mock tests are absent).

@plafer
Copy link
Contributor

plafer commented Jun 1, 2022

Can we add a test here? Seems like the kind of bug that can resurface easily after a refactor

@ancazamfir
Copy link
Collaborator

ancazamfir commented Jun 2, 2022

Looked a bit through the logs and what happens is that depending on the event registration, hermes may miss the NewBlock for h but get an IbcEvent for h. So now it will process the events at h and on NewBlock for h+1 it will clear for h (sending duplicate packets).

The fix is to clear packets at h-1 either on NewBlock or IbcEvent command (with height h) from supervisor whichever occurs first.
Note: this would be only when is_first_run is true. For periodic packet clearing we still want to do it on NewBlock event only.

@ancazamfir
Copy link
Collaborator

You may want to wait for @soareschen's PR #2238 as the changes there will affect this PR and you can make use the refactoring there.

relayer/src/worker/packet.rs Outdated Show resolved Hide resolved
relayer/src/worker/packet.rs Show resolved Hide resolved
Copy link
Collaborator

@ancazamfir ancazamfir left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The changes look good, thanks @seanchen1991! some commented out code needs to be cleaned. I will approve as I am off tomorrow.

@seanchen1991 seanchen1991 merged commit dbb66d3 into informalsystems:master Jun 16, 2022
@seanchen1991 seanchen1991 deleted the duplicate-packets-on-start branch June 16, 2022 20:47
hu55a1n1 pushed a commit to hu55a1n1/hermes that referenced this pull request Sep 13, 2022
* Pull in upstream changes

* Remove unnecessary destructuring assignment

* Decrement height when scheduling packet clearing

* Add changelog entry

* Clarify comment

* Remove unnecessary destructure

* Initiate clear packet flow when a IbcEvent is received

* Revert to more straightforward logic to see if CI still breaks

* Refactor handle_packet_cmd

* Clean up handle_packet_cmd

* Update handle_packet_cmd doc comment

* Incorporate PR feedback
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Duplicate packets on start
5 participants