Block on Proof Courier Service Connection Attempt #1203

Open: ffranr wants to merge 6 commits into base: main

ffranr (Contributor) commented Nov 19, 2024

This PR enhances the reliability of the proof courier service and the robustness of the proof transfer process:

  • Blocking Courier Service Connection: Ensures courier service connection attempts are blocking, preventing failures from premature connection usage and simplifying debugging.
  • Proof Transfer Resilience:
    • Implements logic to re-attempt proof transfers when backoff attempts are exhausted, avoiding delays until the next service restart.
    • Refines logging and error messages to improve debugging.

These updates strengthen the courier service's reliability and make proof transfers more fault-tolerant.

This commit ensures that the proof transfer ChainPorter state is
re-executed once proof transfer backoff attempts have been
exhausted. In the absence of this commit, the next opportunity for
re-attempting proof transfer would be when tapd restarts (pending
parcels are processed on startup).
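
The re-execution behavior described in this commit message can be sketched as follows. This is a minimal illustration, not the actual ChainPorter code: `errBackoffExhausted`, `transferWithBackoff`, and `runTransferState` are hypothetical names, and the wait between rounds is omitted to keep the example short.

```go
package main

import (
	"errors"
	"fmt"
)

// errBackoffExhausted is a hypothetical sentinel error signalling that one
// full round of backoff attempts has failed.
var errBackoffExhausted = errors.New("backoff attempts exhausted")

// transferWithBackoff calls deliver up to maxTries times and reports
// errBackoffExhausted if none of the attempts succeeds.
func transferWithBackoff(deliver func() error, maxTries int) error {
	for i := 0; i < maxTries; i++ {
		if err := deliver(); err == nil {
			return nil
		}
	}
	return errBackoffExhausted
}

// runTransferState re-executes the transfer state when a backoff round is
// exhausted, instead of leaving the parcel pending until the next restart.
// It returns the round in which delivery succeeded.
func runTransferState(deliver func() error, maxTries, maxRounds int) (int, error) {
	for round := 1; round <= maxRounds; round++ {
		err := transferWithBackoff(deliver, maxTries)
		if err == nil {
			return round, nil
		}
		if !errors.Is(err, errBackoffExhausted) {
			// Unexpected error: do not re-execute the state.
			return round, err
		}
		// A real implementation would wait here before re-executing.
	}
	return maxRounds, errBackoffExhausted
}

func main() {
	calls := 0
	// Simulated courier that is unavailable for the first four calls.
	deliver := func() error {
		calls++
		if calls < 5 {
			return errors.New("courier unavailable")
		}
		return nil
	}

	round, err := runTransferState(deliver, 2, 4)
	fmt.Println("round:", round, "err:", err, "calls:", calls)
}
```

The outer loop is what distinguishes this from a plain retry: once a round of backoff attempts is spent, the state runs again rather than giving up until the process restarts.
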
@ffranr ffranr self-assigned this Nov 19, 2024

coveralls commented Nov 19, 2024

Pull Request Test Coverage Report for Build 11917368837

Details

  • 0 of 60 (0.0%) changed or added relevant lines in 5 files are covered.
  • 35 unchanged lines in 10 files lost coverage.
  • Overall coverage decreased (-0.05%) to 40.994%

| Changes Missing Coverage | Covered Lines | Changed/Added Lines | % |
|---|---|---|---|
| tapcfg/config.go | 0 | 1 | 0.0% |
| itest/tapd_harness.go | 0 | 2 | 0.0% |
| itest/test_harness.go | 0 | 2 | 0.0% |
| tapfreighter/chain_porter.go | 0 | 23 | 0.0% |
| proof/courier.go | 0 | 32 | 0.0% |

| Files with Coverage Reduction | New Missed Lines | % |
|---|---|---|
| itest/test_harness.go | 1 | 0.0% |
| proof/courier.go | 1 | 9.33% |
| tapfreighter/chain_porter.go | 1 | 0.0% |
| tappsbt/create.go | 2 | 53.22% |
| commitment/tap.go | 2 | 84.43% |
| asset/asset.go | 2 | 80.61% |
| tapchannel/aux_leaf_signer.go | 3 | 36.33% |
| tapgarden/caretaker.go | 4 | 68.5% |
| tapdb/multiverse.go | 7 | 60.32% |
| universe/interface.go | 12 | 50.22% |

| Totals | Coverage Status |
|---|---|
| Change from base Build 11914558450 | -0.05% |
| Covered Lines | 25179 |
| Relevant Lines | 61421 |

💛 - Coveralls

This change ensures that the courier service connection attempt is
blocking rather than asynchronous. This prevents proof transfers from
failing because a connection was used before it was fully
established, and it simplifies debugging.

Both the connection and transfer steps are part of the backoff
procedure, so failures in either step will trigger re-attempts.

This commit adds a new default value for the proof courier service
response timeout, which was added in the previous commit.

Set the request timeout for the tapd harness universe courier service
to an appropriate value to ensure tests pass consistently.