chain: request pruned blocks from backend peers #737
Conversation
cc @Roasbeef
Completed an initial pass. Stopped short when I realized how much code we can save here by re-using the existing query manager we wrote for neutrino. What's stopping us from re-using that package? It would save a lot of code, and we'd be able to focus our testing on a single package and ideally improve that single site, which would benefit both btcwallet and neutrino independently.
If it's to be used outside of Neutrino as well, it may need to be abstracted somewhat, but the …
@Roasbeef @halseth this is ready for a proper look now -- I've re-written it to use Neutrino's query API. The branch is intentionally based on the latest tagged version rather than master, so the changes can be tested independently. I've tested this on both mainnet and testnet without issues. I have a WIP branch for lnd to do this, but it depends on a global daemon-level block cache (being worked on by an external contributor) to work properly; otherwise bitcoind peers stall during initial graph sync because we send them multiple requests for the same block.
Ahh interesting find! I wonder if the same thing happens on the neutrino side since I've observed at times that requests to fetch blocks aren't properly de-duplicated. Isn't it the case though that w/ the current query API stuff, we'll eventually rotate that out to another peer due to a timeout? Or is it that we only use a single peer right now so there's no actual peer to rotate out to?
Super happy we were able to re-use the query manager from neutrino in the end. I've completed an initial pass and plan to spin up a new testnet bitcoind node to give things a spin first hand.
I looked into this further and there was actually a bug in my changes when handling concurrent requests for the same block. I've addressed it in the latest diff and the block cache is no longer necessary to have this functionality work properly. You should be able to test with lightningnetwork/lnd#5154.
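As an aside, here is a minimal sketch of the de-duplication pattern being described; the type and method names are hypothetical, not the PR's actual code. Concurrent callers register for a block hash, only the first one dispatches the query to the backend peers, and every waiter receives the block once it arrives.

package prunedblock

import (
	"sync"

	"github.com/btcsuite/btcd/chaincfg/chainhash"
	"github.com/btcsuite/btcd/wire"
)

// blockRequestMux de-duplicates concurrent requests for the same block: all
// waiters for a given hash share a single outbound peer query.
type blockRequestMux struct {
	mtx     sync.Mutex
	pending map[chainhash.Hash][]chan *wire.MsgBlock
}

func newBlockRequestMux() *blockRequestMux {
	return &blockRequestMux{
		pending: make(map[chainhash.Hash][]chan *wire.MsgBlock),
	}
}

// register adds a waiter for the given block hash. The boolean result tells
// the caller whether it is the first waiter and is therefore responsible for
// dispatching the actual query to the backend peers.
func (m *blockRequestMux) register(
	hash chainhash.Hash) (<-chan *wire.MsgBlock, bool) {

	m.mtx.Lock()
	defer m.mtx.Unlock()

	ch := make(chan *wire.MsgBlock, 1)
	_, alreadyPending := m.pending[hash]
	m.pending[hash] = append(m.pending[hash], ch)

	return ch, !alreadyPending
}

// deliver hands a fetched block to every waiter registered for its hash.
func (m *blockRequestMux) deliver(hash chainhash.Hash, block *wire.MsgBlock) {
	m.mtx.Lock()
	waiters := m.pending[hash]
	delete(m.pending, hash)
	m.mtx.Unlock()

	for _, ch := range waiters {
		ch <- block
	}
}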
cc @carlaKC
Wow, this turned out to be pretty clean, very easy to follow changes and written in a very testable way! 👍
A few comments and suggestions; once those are fixed, this LGTM ✅
Just a few questions from me, as I'm missing some context. Great tests for this one!
Tested with lnd #5154 against the scenario described in #2997 (lnd goes down, channel closed, blocks pruned), and the closed channel is detected 🎉
//
// NOTE: This method exists to satisfy the query.Peer interface.
func (p *queryPeer) SubscribeRecvMsg() (<-chan wire.Message, func()) {
	return p.msgsRecvd, func() {}
}
If we have a cancellation closure indicating that we're no longer reading from msgsRecvd, won't the OnRead callback created in newQueryPeer block? I'm probably missing some context as to how this subscription is used.
The lifetime of the subscription is bound to the lifetime of the queryPeer struct and its corresponding query.Worker spawned by the query.WorkManager, so the returned closure only gets called after the peer disconnects.

The closure exists as part of the API because of how neutrino handles message subscriptions, as it needs to clean up some state internally. That isn't needed here, which is why an empty closure is returned.
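To make the above concrete, here is a rough sketch of that wiring under assumed names (this is not the PR's exact code): the OnRead listener forwards received messages into the channel handed out by SubscribeRecvMsg and also selects on the peer's quit channel, so once the peer disconnects nothing is left blocking even though the cancel closure is a no-op.

package prunedblock

import (
	"github.com/btcsuite/btcd/peer"
	"github.com/btcsuite/btcd/wire"
)

// queryPeerSketch illustrates how a query.Peer implementation can forward
// incoming messages to subscribers while tying cleanup to the peer lifetime.
type queryPeerSketch struct {
	msgsRecvd chan wire.Message
	quit      chan struct{} // closed when the peer disconnects
}

// onRead would be registered as the peer.MessageListeners.OnRead callback
// when the peer is created. It bails out as soon as quit is closed, so it
// never blocks indefinitely on msgsRecvd.
func (p *queryPeerSketch) onRead(_ *peer.Peer, _ int, msg wire.Message,
	err error) {

	if err != nil {
		return
	}

	select {
	case p.msgsRecvd <- msg:
	case <-p.quit:
	}
}

// SubscribeRecvMsg satisfies query.Peer. The empty cancel closure is enough
// here because all cleanup happens when quit is closed on disconnect.
func (p *queryPeerSketch) SubscribeRecvMsg() (<-chan wire.Message, func()) {
	return p.msgsRecvd, func() {}
}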
Needs a rebase to clear the …
I quickly skimmed this diff and didn't see validation of the block's contents. E.g., I assume you have the header hash of a block on the best chain from your trusted bitcoind and you use that hash to request that block from an untrusted peer. I'm guessing your code from neutrino knows to validate the block's implicit merkle tree against the explicit merkle root. What might not be obvious is that there are a number of ways to create a valid block whose merkle tree contains ambiguities that could be exploited to send you invalid block contents (e.g., I'm aware of CVE-2012-2459, internal nodes as a tx, and CVE-2017-12842; there may be more).

This isn't an issue for LND when it receives blocks from a trusted bitcoind, because modern full nodes are resistant to all of those attacks; it also isn't an issue (IMO) for Neutrino when used as a lightweight client, because there are cheaper ways to attack SPV security; but it could represent a reduction in security for full-node-backed LND instances using this PR.

Happily, it has an easy fix: even when a block has been pruned, bitcoind keeps a record of its transaction count. If you ensure that both bitcoind and your LND/neutrino agree about the number of transactions, you prevent any attacks that reinterpret a tx as a merkle node or vice-versa. E.g.:
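As a rough illustration of the kind of check being described (a hypothetical Go sketch, not the PR's actual code), the block received from an untrusted peer can be held against two pieces of trusted data from bitcoind: the merkle root committed to in the header, and the transaction count that getblockheader still reports (nTx) even after the block data itself has been pruned.

package prunedblock

import (
	"fmt"

	"github.com/btcsuite/btcd/blockchain"
	"github.com/btcsuite/btcd/chaincfg/chainhash"
	"github.com/btcsuite/btcd/wire"
	"github.com/btcsuite/btcutil"
)

// validatePrunedBlock checks a block received from an untrusted peer against
// trusted header data: the expected merkle root and the expected transaction
// count. Pinning the transaction count closes off attacks that reinterpret a
// transaction as an internal merkle node or vice-versa.
func validatePrunedBlock(msgBlock *wire.MsgBlock,
	wantMerkleRoot chainhash.Hash, wantTxCount int) error {

	if len(msgBlock.Transactions) != wantTxCount {
		return fmt.Errorf("got %d transactions, expected %d",
			len(msgBlock.Transactions), wantTxCount)
	}

	// Recompute the merkle root from the received transactions and make
	// sure it matches the trusted header's commitment.
	block := btcutil.NewBlock(msgBlock)
	merkles := blockchain.BuildMerkleTreeStore(block.Transactions(), false)
	calcRoot := merkles[len(merkles)-1]
	if calcRoot == nil || !calcRoot.IsEqual(&wantMerkleRoot) {
		return fmt.Errorf("merkle root mismatch: got %v, want %v",
			calcRoot, wantMerkleRoot)
	}

	return nil
}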
Related: |
@harding that's a really good point! Thankfully we can re-use …
@Roasbeef checkBlockSanity looks pretty good to me. It appears to be explicitly designed to be impervious to CVE-2012-2459. I think it also makes internal node re-interpretation computationally infeasible, although I'd have to diagram that out to be sure.
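For reference, running a peer-provided block through that check could look roughly like this (a sketch with assumed surrounding code; blockchain.CheckBlockSanity is btcd's exported sanity check):

package prunedblock

import (
	"fmt"

	"github.com/btcsuite/btcd/blockchain"
	"github.com/btcsuite/btcd/chaincfg"
	"github.com/btcsuite/btcd/wire"
	"github.com/btcsuite/btcutil"
)

// sanityCheckBlock runs btcd's block sanity checks, which (among other
// things) recompute the merkle root in a way that rejects the duplicate-
// transaction trick behind CVE-2012-2459.
func sanityCheckBlock(msgBlock *wire.MsgBlock, params *chaincfg.Params) error {
	block := btcutil.NewBlock(msgBlock)
	err := blockchain.CheckBlockSanity(
		block, params.PowLimit, blockchain.NewMedianTime(),
	)
	if err != nil {
		return fmt.Errorf("peer sent invalid block: %v", err)
	}

	return nil
}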
This exposes the new AllowSelfConns config option allowing external testing of peer.Peer.
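As an illustration of what that enables, an external test could now connect a peer to a listener in the same process along these lines (a sketch with assumed scaffolding; only the AllowSelfConns option itself comes from the commit above):

package prunedblock

import (
	"testing"

	"github.com/btcsuite/btcd/chaincfg"
	"github.com/btcsuite/btcd/peer"
)

// TestSelfConnSketch shows the shape of a test that relies on
// AllowSelfConns: without it, the peer package rejects connections whose
// version nonce it recognizes as its own, which is what happens when two
// Peer instances in the same test process talk to each other.
func TestSelfConnSketch(t *testing.T) {
	cfg := &peer.Config{
		UserAgentName:    "btcwallet-test",
		UserAgentVersion: "0.0.1",
		ChainParams:      &chaincfg.RegressionNetParams,
		AllowSelfConns:   true,
	}

	p, err := peer.NewOutboundPeer(cfg, "127.0.0.1:18444")
	if err != nil {
		t.Fatalf("unable to create peer: %v", err)
	}

	// The test would then dial a local listener and hand the connection
	// to the peer via p.AssociateConnection(conn).
	_ = p
}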
To minimally support wallets connected to pruned nodes, we add a new subsystem that can be integrated with chain clients to request blocks that the server has already pruned. This is done by connecting to the server's full node peers and querying them directly. Ideally, this is a capability supported by the server, though this is not yet possible with bitcoind.
At the moment, this is only done for the BitcoindClient, as the other backends don't support block pruning.
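As a rough sketch of the first step of that flow (names are illustrative, not the actual API this PR adds): the pruned bitcoind can be asked over RPC which peers it is connected to, and the ones advertising NODE_NETWORK are candidates that should still be able to serve the pruned blocks.

package prunedblock

import (
	"fmt"
	"strconv"

	"github.com/btcsuite/btcd/rpcclient"
	"github.com/btcsuite/btcd/wire"
)

// candidateFullNodePeers asks the (possibly pruned) bitcoind backend for its
// current peers and keeps those that advertise the full chain.
func candidateFullNodePeers(client *rpcclient.Client) ([]string, error) {
	peers, err := client.GetPeerInfo()
	if err != nil {
		return nil, err
	}

	var addrs []string
	for _, p := range peers {
		// The services field is a hex-encoded service-flag bitmask.
		services, err := strconv.ParseUint(p.Services, 16, 64)
		if err != nil {
			continue
		}
		if wire.ServiceFlag(services)&wire.SFNodeNetwork == 0 {
			continue
		}

		addrs = append(addrs, p.Addr)
	}

	if len(addrs) == 0 {
		return nil, fmt.Errorf("no full-node peers available")
	}

	return addrs, nil
}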
Thanks for pointing this out @harding! I've updated the PR to validate the retrieved blocks and added a test case.
LGTM 🌵