Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

addrmgr: make safe accessors for addresses #1760

Closed
wants to merge 1 commit into from

Conversation

chappjc
Copy link
Contributor

@chappjc chappjc commented Oct 16, 2021

This resolves a data race I have observed with both stock btcwallet run in spv mode, and a different app using neutrino's ChainService. The races are at the end of this PR description. The explanation and resolution are as follows.

NOTE: An alternative resolution that adds a mutex to KnownAddress is proposed in #1763


The KnownAddress returned by (*AddrManager).GetAddress is not safely usable while the AddrManager is in use by another goroutine since the fields of the KnownAddress are mutable. Although AddrManager treats a *wire.NetAddress as immutable, the KnownAddress.na field itself is reassigned. Similarly, the lastattempt field is mutable.

To address this API usability issue, this commit makes two new accessor methods:

  • GetNetAddress returns the *wire.NetAddress and the lastAttempt
  • GetAddressCopy returns a copy of the *KnownAddress. This is not used in the fixed newAddressFunc, but I've created it for any previous GetAddress consumers that do want a full KnownAddress for whatever reason.

The original GetAddress remains because it is designed to reference the KnownAddress that is still in the AddrManager, allowing operations by the AddrManager such as connects and attempts to be reflected in the initially returned KnownAddress. The blackbox addrmgr package tests are are designed around this behavior.

However, it is not safe to use GetAddress from another goroutine such as the connmgr's GetNewAddress function (the newAddressFunc closure defined in newServer) concurrent with updates to the KnownAddress such as via (*Peer).inHandler, which has callbacks to the AddrManager that may mutate the KnownAddress.

An alternate resolution would involve adding a mutex to a KnownAddress, but this would be a more substantial change, which creates unnecessary lock contention and complexity. Returning copies or just the needed fields is safer and simpler.


While the following data races are using neutrino's ChainService, the root cause traces back to the btcd API issue above. In both cases it comes down to use of the KnownAddress returned by GetAddress in the newAddressFunc closure, concurrent with a write elsewhere.

Data race with custom spv wallet app using neutrino.
==================
WARNING: DATA RACE
Write at 0x00c0004251a0 by goroutine 54:
  github.com/btcsuite/btcd/addrmgr.(*AddrManager).updateAddress()
      /home/jon/go/pkg/mod/github.com/btcsuite/[email protected]/addrmgr/addrmanager.go:189 +0x3d5
  github.com/btcsuite/btcd/addrmgr.(*AddrManager).AddAddresses()
      /home/jon/go/pkg/mod/github.com/btcsuite/[email protected]/addrmgr/addrmanager.go:605 +0xfe
  github.com/lightninglabs/neutrino.(*ServerPeer).OnAddr()
      /home/jon/go/pkg/mod/github.com/lightninglabs/[email protected]/neutrino.go:378 +0x6f2
  github.com/lightninglabs/neutrino.(*ServerPeer).OnAddr-fm()
      /home/jon/go/pkg/mod/github.com/lightninglabs/[email protected]/neutrino.go:318 +0x4d
  github.com/btcsuite/btcd/peer.(*Peer).inHandler()
      /home/jon/go/pkg/mod/github.com/btcsuite/[email protected]/peer/peer.go:1399 +0x1403
  github.com/btcsuite/btcd/peer.(*Peer).start·dwrap·5()
      /home/jon/go/pkg/mod/github.com/btcsuite/[email protected]/peer/peer.go:2158 +0x39

Previous read at 0x00c0004251a0 by goroutine 242:
  github.com/btcsuite/btcd/addrmgr.(*KnownAddress).NetAddress()
      /home/jon/go/pkg/mod/github.com/btcsuite/[email protected]/addrmgr/knownaddress.go:28 +0x1da
  github.com/lightninglabs/neutrino.NewChainService.func3()
      /home/jon/go/pkg/mod/github.com/lightninglabs/[email protected]/neutrino.go:773 +0x1e6
  github.com/btcsuite/btcd/connmgr.(*ConnManager).NewConnReq()
      /home/jon/go/pkg/mod/github.com/btcsuite/[email protected]/connmgr/connmanager.go:391 +0x347
  github.com/btcsuite/btcd/connmgr.(*ConnManager).Start·dwrap·7()
      /home/jon/go/pkg/mod/github.com/btcsuite/[email protected]/connmgr/connmanager.go:530 +0x39

Goroutine 54 (running) created at:
  github.com/btcsuite/btcd/peer.(*Peer).start()
      /home/jon/go/pkg/mod/github.com/btcsuite/[email protected]/peer/peer.go:2158 +0x477
  github.com/btcsuite/btcd/peer.(*Peer).AssociateConnection.func1()
      /home/jon/go/pkg/mod/github.com/btcsuite/[email protected]/peer/peer.go:2193 +0x34

Goroutine 242 (running) created at:
  github.com/btcsuite/btcd/connmgr.(*ConnManager).Start()
      /home/jon/go/pkg/mod/github.com/btcsuite/[email protected]/connmgr/connmanager.go:530 +0x1ed
  github.com/lightninglabs/neutrino.(*ChainService).Start·dwrap·21()
      /home/jon/go/pkg/mod/github.com/lightninglabs/[email protected]/neutrino.go:1541 +0x39
==================
Data race with stock btcwallet in spv mode (using neutrino). Partially incomplete dump it seems.
==================
WARNING: DATA RACE
Read at 0x00c00044c8b8 by goroutine 94:
  runtime.racereadrange()
      <autogenerated>:1 +0x1b
  github.com/btcsuite/btcd/connmgr.(*ConnManager).NewConnReq()
      /home/jon/go/pkg/mod/github.com/btcsuite/[email protected]/connmgr/connmanager.go:391 +0x347
  github.com/btcsuite/btcd/connmgr.(*ConnManager).Start·dwrap·7()
      /home/jon/go/pkg/mod/github.com/btcsuite/[email protected]/connmgr/connmanager.go:530 +0x39

Previous write at 0x00c00044c8b8 by goroutine 80:
  github.com/btcsuite/btcd/addrmgr.(*AddrManager).Attempt()
      /home/jon/go/pkg/mod/github.com/btcsuite/[email protected]/addrmgr/addrmanager.go:861 +0x197
  github.com/lightninglabs/neutrino.NewChainService.func3()
      /home/jon/go/pkg/mod/github.com/lightninglabs/[email protected]/neutrino.go:815 +0x564
  github.com/btcsuite/btcd/connmgr.(*ConnManager).NewConnReq()
      /home/jon/go/pkg/mod/github.com/btcsuite/[email protected]/connmgr/connmanager.go:391 +0x347
  github.com/btcsuite/btcd/connmgr.(*ConnManager).Start·dwrap·7()
      /home/jon/go/pkg/mod/github.com/btcsuite/[email protected]/connmgr/connmanager.go:530 +0x39

Goroutine 94 (running) created at:
  github.com/btcsuite/btcd/connmgr.(*ConnManager).Start()
      /home/jon/go/pkg/mod/github.com/btcsuite/[email protected]/connmgr/connmanager.go:530 +0x1ed
  github.com/lightninglabs/neutrino.(*ChainService).Start·dwrap·21()
      /home/jon/go/pkg/mod/github.com/lightninglabs/[email protected]/neutrino.go:1541 +0x39

Goroutine 80 (running) created at:
  github.com/btcsuite/btcd/connmgr.(*ConnManager).Start()
      /home/jon/go/pkg/mod/github.com/btcsuite/[email protected]/connmgr/connmanager.go:530 +0x1ed
  github.com/lightninglabs/neutrino.(*ChainService).Start·dwrap·21()
      /home/jon/go/pkg/mod/github.com/lightninglabs/[email protected]/neutrino.go:1541 +0x39
==================

I also have updates ready to neutrino and btcwallet that use this change.

The KnownAddress returned by GetAddress is not safely usable while the
AddrManager is in use by another goroutine since the fields of
the KnownAddress are mutable. Although AddrManager treats a
*wire.NetAddress as immutable, the KnownAddress.na field itself
is reassigned. Similarly, the lastattempt field is mutable.

To address this API usability issue, this commit makes two new accessor
methods:

- GetNetAddress returns the *wire.NetAddress and the lastAttempt
- GetAddressCopy returns a copy of the *KnownAddress

The original GetAddress remains because it is designed to reference
the KnownAddress that is still in the AddrManager, allowing operations
by the AddrManager such as connects and attempts to be reflected in the
initially returned KnownAddress. The blackbox addrmgr tests are
are designed around this behavior.

However, it is not safe to use GetAddress from another goroutine such
as the connmgr's GetNewAddress function (newAddressFunc closure defined
in newServer) concurrent with updates to the KnownAddress such as via
(*Peer).inHandler, which has callbacks to the AddrManager that may
mutate the KnownAddress.

An alternate resolution would involve adding a Mutex to a KnownAddress,
but this would be a more substantial change, which would likely create
unnecessary lock contention and complexity. Returning copies is safer
and simpler.
@chappjc
Copy link
Contributor Author

chappjc commented Oct 18, 2021

I've noted that dcrd chose to add a Mutex to KnownAddress to solve the data race. decred/dcrd@ad33b0e

I'd be happy to take that approach as well, but my feeling was that would add a lot of unnecessary locking when the AddrManager's mutex is already doing that job. And it will be error prone as it is necessary to stay on top of the lock's use all throughout AddrManager

@coveralls
Copy link

Pull Request Test Coverage Report for Build 1349537398

  • 3 of 29 (10.34%) changed or added relevant lines in 2 files are covered.
  • 14 unchanged lines in 3 files lost coverage.
  • Overall coverage decreased (-0.03%) to 52.939%

Changes Missing Coverage Covered Lines Changed/Added Lines %
server.go 0 7 0.0%
addrmgr/addrmanager.go 3 22 13.64%
Files with Coverage Reduction New Missed Lines %
server.go 1 0%
database/ffldb/blockio.go 4 92.62%
peer/peer.go 9 75.65%
Totals Coverage Status
Change from base Build 1308902808: -0.03%
Covered Lines: 21084
Relevant Lines: 39827

💛 - Coveralls

@jcvernaleo
Copy link
Member

Closing this in favor of #1763

@jcvernaleo jcvernaleo closed this Oct 26, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants