tx_memory_pool: make double spends a no-drop offense #9218

Merged: 1 commit, Mar 8, 2024


jeffro256

Nodes who see different txs in a double spend attack will drop each other, splitting the network. Issue found by @Boog900.

@UkoeHB

UkoeHB commented Mar 7, 2024

Doesn't this change make it easier to do large-scale double-spend attacks, because now nodes that originate double-spend attacks will stay connected (and hence have higher throughput)?

@SChernykh

SChernykh commented Mar 7, 2024

Nodes that originate double spends don't get disconnected even with the old code, because their "neighbours" haven't seen the other transaction either.

The only nodes that suffer from disconnects are the ones where both "tx wavefronts" meet, and those are not the attacker's nodes in 99% of cases.

@jeffro256

Let's say that we have 10 nodes with connections:

A <-> B <-> C <-> D <-> E <-> F <-> G <-> H <-> I <-> J

The attacker can start propagating tx T1 at node A and double-spend tx T2 at node J. The flow of txs over time could look like this:

T1                                                    T2
A <-> B <-> C <-> D <-> E <-> F <-> G <-> H <-> I <-> J
---------------------------------------------------------------
T1    T1                                        T2    T2
A <-> B <-> C <-> D <-> E <-> F <-> G <-> H <-> I <-> J
---------------------------------------------------------------
T1    T1    T1                            T2    T2    T2
A <-> B <-> C <-> D <-> E <-> F <-> G <-> H <-> I <-> J
---------------------------------------------------------------
T1    T1    T1    T1                T2    T2    T2    T2
A <-> B <-> C <-> D <-> E <-> F <-> G <-> H <-> I <-> J
---------------------------------------------------------------
T1    T1    T1    T1    T1    T2    T2    T2    T2    T2
A <-> B <-> C <-> D <-> E <-> F <-> G <-> H <-> I <-> J
---------------------------------------------------------------

At this point E and F will drop each other, despite not being close to the origin of the double spend, and we have 2 split networks. In the real world, relationships are much more complicated, so it won't be this easy, but the point remains.
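The diagram above can be reproduced with a toy flood simulation. This is only a sketch under simplifying assumptions: every hop takes the same time, each node keeps the first tx it sees, and the graph is the 10-node line from the example. The function name `propagate` is made up for illustration and is not part of monerod.

```python
from collections import deque

def propagate(edges, seeds):
    """Flood txs hop by hop; each node keeps and relays only the first tx it sees.

    edges: list of (u, v) undirected links
    seeds: dict mapping origin node -> tx label
    Returns (first_seen, conflicts), where conflicts are the links whose two
    endpoints ended up holding different txs.
    """
    neighbours = {}
    for u, v in edges:
        neighbours.setdefault(u, []).append(v)
        neighbours.setdefault(v, []).append(u)

    first_seen = dict(seeds)
    queue = deque(seeds)  # breadth-first order models equal per-hop latency
    while queue:
        node = queue.popleft()
        for peer in neighbours[node]:
            if peer not in first_seen:
                first_seen[peer] = first_seen[node]
                queue.append(peer)

    # Links where the two "tx wavefronts" meet.
    conflicts = [
        (u, v) for u, v in edges
        if u in first_seen and v in first_seen and first_seen[u] != first_seen[v]
    ]
    return first_seen, conflicts

nodes = "ABCDEFGHIJ"
edges = list(zip(nodes, nodes[1:]))  # A <-> B <-> ... <-> J
first_seen, conflicts = propagate(edges, {"A": "T1", "J": "T2"})
print(conflicts)
```

With the old behaviour, every link in `conflicts` is a pair of honest nodes that drop each other. Here that is the single E/F link, far from both origins.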

@jeffro256

We could implement a long-term mitigation against spamming double spends by keeping a map of key images -> txid per host and checking whether one host sent two different txs that share a key image. Then we could block that host.
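A rough sketch of what such a per-host map could look like. This is not code from the PR or from monerod: the class name `DoubleSpendTracker`, the `observe` method, and the string host/txid/key-image values are all hypothetical, and a real version would need some expiry so the map doesn't grow without bound.

```python
from collections import defaultdict

class DoubleSpendTracker:
    """Hypothetical per-host record of which txid first spent each key image."""

    def __init__(self):
        # host -> {key image -> txid first received from that host}
        self._seen = defaultdict(dict)

    def observe(self, host, txid, key_images):
        """Record a tx received from `host`.

        Returns True if this same host previously sent a *different* tx
        spending one of the same key images, i.e. the host itself relayed
        both sides of a double spend.
        """
        per_host = self._seen[host]
        offended = False
        for key_image in key_images:
            previous = per_host.setdefault(key_image, txid)
            if previous != txid:
                offended = True
        return offended

tracker = DoubleSpendTracker()
first = tracker.observe("host-a", "tx1", ["ki1"])   # first tx from host-a
other = tracker.observe("host-b", "tx2", ["ki1"])   # different host, same key image
repeat = tracker.observe("host-a", "tx2", ["ki1"])  # host-a now sends the conflicting tx
```

The point of keying by host is that two honest peers who each saw a different side of the double spend are not flagged; only a host that sends both conflicting txs is.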

@luigi1111 luigi1111 merged commit c23951f into monero-project:master Mar 8, 2024
18 checks passed
@jeffro256 jeffro256 deleted the no_drop_on_double_spend branch July 9, 2024 18:36
@jeffro256

Hackerone report 2590695 reaffirms why this code change is a good decision.

(report is private as of time of writing)

@Rucknium

Rucknium commented Jul 31, 2024

Network topology discovery

TL;DR: This PR is the right thing to do, but it enables a difficult-to-execute method for detecting which Monero peers are connected to each other, i.e. the p2p network topology. If an adversary knows the p2p network topology, they have a higher probability of detecting the IP origin of transactions even if Dandelion++ is enabled.

Franzoni, Salleras, & Daza (2022) list some topology-discovery methods for the BTC network. Most of them are patched for Monero or are incompatible with Monero's transaction-confirmation protocol. In graph theory, a connection from one node to another is called an "edge".

Biryukov, Khovratovich, & Pustogarov (2014) and Miller et al. (2015) both use the "last seen" timestamp field of nodes' peer lists to estimate the bitcoin network's edges. This timestamp was removed from bitcoin node code in 2015. A similar field was removed from the Monero daemon in 2019 in PR #5481 / #5682 after Cao et al. (2019) was released.

Neudecker, Andelfinger, & Hartenstein (2016) used bitcoin's pre-2015 "trickle" transaction propagation protocol to estimate the network topology. Their technique is not effective against bitcoin's newer "diffusion" protocol.

Delgado-Segura et al. (2019) use unconfirmed child-parent transactions in the mempool to estimate the network topology. Spending unconfirmed Monero outputs is not possible. In any case, the issue was patched in BTC after the paper was released.

Grundmann, Neudecker, & Hartenstein (2019) describe the only methods that could still be effective against BTC (and Monero). The first analyzes accumulation of transactions in gossip messages. In practice, it is not very effective. The second uses the network's response to double-spend transactions.

Their method is a type of canary trap/barium meal test. A canary trap could be set up like this: the head of a government agency wants to know which employee has been leaking information to the press. The head gives a different, unique version of a story to each employee. The next day, the story appears in the newspapers. The details of the published story can be matched back to the version that was given to a specific employee, who is then fired.

The double-spend method of Grundmann, Neudecker, & Hartenstein (2019) works like this:

  1. The adversary connects to nodes on the network. For best results, the adversary would connect to every node.
  2. Prepare a unique transaction for every node except for the target node that the adversary wants to discover the peers of. These transactions all spend the same output (the ring's key image is the same), but they spend to a different address or have other differences so their transaction hashes are all distinct.
  3. Send all the transactions simultaneously to the non-target nodes. The target node will only accept one of the double-spend transactions. Since the non-target nodes will neither accept nor relay the double-spend transactions of their peers, when the target node gets one of the double-spend transactions, it must have been through a direct connection to one of the non-target nodes.
  4. The target node will relay one of the double-spend transactions to the adversary's node. The transaction hash of that transaction tells the adversary which of the non-target nodes had a connection to the target node.
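
The four steps can be illustrated with a toy model. This is a sketch under heavy assumptions, not real network code: the function name `infer_target_peer` is hypothetical, a SHA-256 of a made-up string stands in for a transaction hash, and the race between the target's peers is modelled by simply passing them in arrival order.

```python
import hashlib

def infer_target_peer(target, peers_by_arrival, all_nodes, key_image="ki-demo"):
    """Toy canary trap: return the peer of `target` revealed by the relayed tx.

    peers_by_arrival: the target's direct peers, ordered by whose relay
    reaches the target first (an assumption that stands in for real timing).
    """
    # Step 2: one distinct tx per non-target node, all spending the same key image.
    txid_of = {}
    for node in all_nodes:
        if node == target:
            continue
        payload = f"{key_image}|variant-for-{node}".encode()
        txid_of[node] = hashlib.sha256(payload).hexdigest()
    owner_of = {txid: node for node, txid in txid_of.items()}

    # Step 3: every non-target node holds its own variant and rejects all
    # others, so the target can only receive a variant from a direct peer.
    if not peers_by_arrival:
        return None
    accepted_txid = txid_of[peers_by_arrival[0]]

    # Step 4: the target relays the accepted tx to the adversary, who looks
    # up which node that variant was originally sent to.
    return owner_of[accepted_txid]

nodes = list("ABCDEF")
revealed = infer_target_peer("C", ["D", "B"], nodes)
print(revealed)
```

Each run reveals at most one edge of the target (the peer that won the race), which is why the method needs a fresh set of transactions for every edge it wants to learn.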

This method is difficult to scale because the adversary needs to construct a transaction for each node on the network to discover just one edge in the graph. According to some node log data, Monero p2p connections are short-lived. The median length of a connection is 45 minutes. The information collected by this double-spend method would be out of date quickly.

Before PR #9218, an adversary that attempted this method would disrupt the network's topology while trying to measure it. Most nodes would disconnect from each other (and try new connections) at step (3). After PR #9218, the double-spend broadcasting would not alter the network topology.

How much does p2p network topology discovery help an adversary link a transaction to an IP?

First, consider only the fluff phase of transaction propagation where "diffusion" is used. Fanti & Viswanath (2017) describe a first-timestamp estimator that does not use graph edge information and a reporting-centrality estimator that does use it. They say "Neither of the lower bounds from the first-timestamp or reporting centrality estimators strictly outperforms the other. The first-timestamp estimator performs better on graphs with low degree d [i.e. a small number of peer connections], whereas reporting centrality performs better in the high-d regime." When an adversary has only one connection to each honest node, reporting centrality has a higher probability of detecting the true source of a transaction when the number of connections of each peer is 10 or higher (Figure 6 of Fanti & Viswanath (2017)).

Consider the Dandelion++ protocol during the stem phase. When the fraction of spy nodes is 15%, the first-spy estimator that does not use any topology information has 5% precision. The max-weight estimator that uses topology information has 10% precision. This is in Figure 3 of Fanti et al. (2018).

EDIT 23 Aug 2024: @Boog900 corrected me on which network topology information is used by an adversary for the max-weight estimator against Dandelion++. The adversary needs topology information about the stem-phase 4-regular private subgraph, not just the overall p2p graph. It appears that Sharma, Gosain, & Diaz (2022) use knowledge of the overall p2p graph as an intermediate step to estimate the Dandelion++ private subgraph when a large number of transactions are broadcast.

References

Biryukov A, Khovratovich D, Pustogarov I (2014) "Deanonymisation of clients in bitcoin P2P network."

Cao T, Yu J, Decouchant J, Luo X, & Verissimo P (2019) "Exploring the Monero Peer-to-Peer Network"

Delgado-Segura S, Bakshi S, Pérez-Solà C, Litton J, Pachulski A, Miller A, Bhattacharjee B (2019) "TxProbe: Discovering Bitcoin’s Network Topology Using Orphan Transactions"

Fanti G & Viswanath P (2017) "Anonymity Properties of the Bitcoin P2P Network"

Fanti G, Venkatakrishnan SB, Bakshi S, Denby B, Bhargava S, Miller A, Viswanath P (2018) "Dandelion++: Lightweight cryptocurrency networking with formal anonymity guarantees."

Franzoni, F., Salleras, X. & Daza, V. (2022) "AToM: Active topology monitoring for the bitcoin peer-to-peer network."

Grundmann M, Neudecker T, Hartenstein H (2019) "Exploiting transaction accumulation and double spends for topology inference in bitcoin."

Miller A.K., Litton J., Pachulski A., Gupta N., Levin D., Spring N., & Bhattacharjee B. (2015). "Discovering Bitcoin's Public Topology and Influential Nodes."

Neudecker T, Andelfinger P, Hartenstein H (2016) "Timing analysis for inferring the topology of the bitcoin peer-to-peer network."

Sharma PK, Gosain D, Diaz C (2022) "On the anonymity of peer-to-peer network anonymity schemes used by cryptocurrencies."
